agents-shipgate vs runtime guardrails: where each one fits

A common question once a team starts thinking about agent safety is: we already have a guardrail layer (or LLM gateway, or policy engine) — why do we need a release gate too?

Short answer: they cover different slices. Removing either one is a regression.

This post walks through where each fits, what each catches, and why running both is the standard configuration once an agent reaches production.

What a runtime guardrail does

A runtime guardrail (NVIDIA NeMo Guardrails, Guardrails AI, an LLM gateway like Portkey or LiteLLM with policies, or a hand-rolled middleware) sits between the model and the action. It inspects each tool call before it fires:

Is the tool on the allowlist for this agent or this user?
Does the input pass a regex/PII filter?
Has the user consumed their per-day refund quota?
Is the call happening during a maintenance freeze?
Does the model output match a structured-output schema?

When the guardrail blocks, the action doesn’t fire. When it allows, the action does. This is reactive control: it lowers the cost of a wrong model decision, because the guardrail acts as the brake pedal on each individual call.

Runtime guardrails are necessary. They catch things static analysis fundamentally can’t see — the actual user input, the current quota, the real-time state.
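To make the per-call checks concrete, here is a minimal sketch of a hand-rolled guardrail in Python. Everything in it is an assumption for illustration: the `RuntimeState` fields, the `issue_refund` tool name, the quota of 3, and the SSN-shaped regex standing in for a PII filter. It is not the API of NeMo Guardrails, Guardrails AI, Portkey, or LiteLLM.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern: flags US-SSN-shaped strings as a stand-in for a PII filter.
PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class RuntimeState:
    """Live state that only exists at request time (all fields are illustrative)."""
    allowlist: dict[str, set[str]]  # agent name -> tools it may call
    refunds_today: dict[str, int] = field(default_factory=dict)  # user -> count so far
    daily_refund_quota: int = 3
    maintenance_freeze: bool = False


def check_tool_call(agent: str, user: str, tool: str, args: dict,
                    state: RuntimeState) -> tuple[bool, str]:
    """Return (allowed, reason) for one tool call, evaluated just before it fires."""
    if tool not in state.allowlist.get(agent, set()):
        return False, "tool not on allowlist"
    if any(PII_PATTERN.search(str(value)) for value in args.values()):
        return False, "input failed PII filter"
    if state.maintenance_freeze:
        return False, "maintenance freeze in effect"
    if tool == "issue_refund":
        used = state.refunds_today.get(user, 0)
        if used >= state.daily_refund_quota:
            return False, "daily refund quota exhausted"
        # Consume quota only when the call is actually allowed.
        state.refunds_today[user] = used + 1
    return True, "ok"
```

Every input to `check_tool_call` is something that exists only at request time: the arguments the model produced, the user's running count, the freeze flag. None of it is visible in a PR diff.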

What a release gate does

agents-shipgate sits between the PR and production. It inspects the artifact diff in CI:

Is every destructive tool listed in the approval policy?
Are auth scopes no broader than the agent’s declared purpose requires?
Does the system prompt contradict the tool surface?
Is the input schema strict enough to bound the model’s actions?
Does a financial action have idempotency evidence?
Is a wildcard MCP source attached without an explicit allowlist?

When a release gate blocks, the merge doesn’t happen. When it allows, the change reaches production. This is preventive control: it lowers the cost of a wrong change, because a broken artifact never ships.

Release gates are also necessary. They catch things runtime guardrails fundamentally can’t see — the surface as a whole, the manifest, the prompt/tool alignment, the absence of a policy.
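A few of the checks above can be sketched as a static pass over a manifest. This is an illustration, not how agents-shipgate is implemented: the manifest layout and every key in it (`approval_policy`, `destructive`, `financial`, `idempotency_key`, `mcp_sources`) are hypothetical names chosen for the example.

```python
def gate_findings(manifest: dict) -> list[str]:
    """Static checks over a hypothetical agent manifest; returns readable findings."""
    findings = []
    approved = set(manifest.get("approval_policy", {}).get("tools", []))

    for tool in manifest.get("tools", []):
        name = tool["name"]
        if tool.get("destructive") and name not in approved:
            findings.append(f"{name}: destructive tool missing from approval policy")
        # JSON Schema permits extra properties unless additionalProperties is false;
        # this sketch treats anything other than an explicit false as too loose.
        schema = tool.get("input_schema", {})
        if schema.get("additionalProperties", True) is not False:
            findings.append(f"{name}: input schema allows additional properties")
        if tool.get("financial") and not tool.get("idempotency_key"):
            findings.append(f"{name}: financial action has no idempotency evidence")

    for source in manifest.get("mcp_sources", []):
        if source.get("tools") == "*" and not source.get("allowlist"):
            findings.append(f"{source['name']}: wildcard MCP source without an allowlist")

    return findings
```

In CI, a non-empty result would fail the status check. The function takes no user input and no live state, so the same manifest always yields the same findings.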

What each catches that the other can’t

Concrete failure modes, paired:

Failure	Caught by guardrail	Caught by release gate
User submits malicious input	✓	–
Refund quota exceeded for this user	✓	–
Tool call during planned maintenance	✓	–
New `delete_user` tool added with no approval policy	–	✓
OAuth scope grant widened in this PR	–	✓
Wildcard MCP source attached	–	✓
Prompt says “advise only” but write tools enabled	–	✓
Schema allows arbitrary `additionalProperties`	–	✓
Idempotency missing on a retry-eligible financial action	–	✓

The two columns don’t overlap. A guardrail will not catch a missing approval policy because it has no notion of approval policies in the abstract — only “is this call allowed right now.” A release gate will not catch a malicious user input because it never sees user input.

Why running both is the standard configuration

Without the release gate, the guardrail’s allowlist is whatever the team remembered to update. New tools get added and nobody updates the allowlist. The guardrail then either allows by default, so an unreviewed tool can fire, or denies by default, so a shipped feature silently breaks. Neither answer is right.

Without the guardrail, the release gate’s approval policy is a piece of paper. The actual call still fires when the model decides; nothing checks the runtime conditions the gate didn’t (and can’t) know about.

Running both means:

The release gate enforces that an approval policy exists and covers the right tools.
The guardrail enforces that the policy fires at runtime — the approval token is checked, the quota is consumed, the maintenance window is honored.

Each layer’s invariants depend on the other layer doing its job.

Where each one lives in the stack

   Code in PR ──────►  Release gate (agents-shipgate)
                            │ static, deterministic, fails on bad surface
                            ▼
   Merge ────────────► Production
                            │
   User request ───────►  LLM ──► Tool call ──►  Runtime guardrail
                                                     │ allow / deny / inspect
                                                     ▼
                                                External system

The release gate is upstream — its outputs feed CI status checks and PR comments. The guardrail is in the request path — its outputs feed production responses and observability.

Practical recommendation

If your team has a guardrail and is wondering whether the release gate is redundant: it isn’t. The guardrail’s allowlist gets stale unless something forces it to be reviewed against the surface that ships. The release gate is that something.
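One way to picture "reviewed against the surface that ships" is a set comparison run in CI. The sketch below assumes the shipped tool names and the guardrail allowlist can each be loaded as a set of strings. How they are loaded, and the tool names in the usage example, are hypothetical.

```python
def allowlist_drift(shipped_tools: set[str],
                    guardrail_allowlist: set[str]) -> dict[str, set[str]]:
    """Compare the tool surface in this release with the guardrail's allowlist."""
    return {
        # Ships in this release, but the guardrail has no explicit decision for it.
        "unreviewed": shipped_tools - guardrail_allowlist,
        # Still allowlisted, but no longer part of the shipped surface.
        "orphaned": guardrail_allowlist - shipped_tools,
    }


drift = allowlist_drift(
    shipped_tools={"lookup_order", "issue_refund", "delete_user"},
    guardrail_allowlist={"lookup_order", "issue_refund", "cancel_order"},
)
```

Here `drift["unreviewed"]` contains `delete_user` and `drift["orphaned"]` contains `cancel_order`. Failing the build on either forces the allowlist review to happen in the same PR that changes the surface.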

If your team has a release gate and is wondering whether the guardrail is overkill: it isn’t. The release gate cannot see the runtime state that should gate certain calls (quotas, time-of-day, current user identity). Static checks don’t replace runtime control.

For the upstream argument that the tool surface is itself a release artifact, see “Your AI agent has a tool surface. It needs a release gate.” For why evals are also not a substitute, see “Why evals are not release gates.”