Agents Shipgate: Agent Release Readiness

Static release checks for tool-using AI agents.

Open-source CLI and GitHub Action that scans MCP exports, OpenAPI specs, and SDK metadata, and writes a Tool-Use Readiness Report before your agent gets production-like tools.

  • Static scan
  • No agent execution
  • No LLM calls
  • No telemetry

agents-shipgate-reports / report.md
Tool-Use Readiness Report · v0.5.1
Critical: 2 · High: 14 · Medium: 2 · Human review: recommended
Top findings
#01 Critical openapi/billing.yaml:142
stripe.create_refund lacks a declared approval policy
#03 High shipgate.yaml:18
wildcard_mcp_tools.* exposes an unreviewable tool surface

Run a fixture, scan your repo, then add advisory CI.

A 60-second path to a real Tool-Use Readiness Report once Python and pipx are available. Requires Python 3.12+.

Known fixture
$ pipx install agents-shipgate
$ agents-shipgate fixture run support_refund_agent

Runs a bundled refund-support fixture and writes local report artifacts.

Scan your repo
$ agents-shipgate init --workspace . --write
$ agents-shipgate scan -c shipgate.yaml

Creates a manifest, reads local tool sources, and generates a report for review.

Advisory CI
- uses: ThreeMoonsLab/agents-shipgate@v0.5.1
  with:
    config: shipgate.yaml
    ci_mode: advisory

Adds PR evidence without failing builds while your team triages findings.

Default artifacts: agents-shipgate-reports/report.md and agents-shipgate-reports/report.json. SARIF is available through the GitHub Action or scan output format configuration.

Turn your agent's tool surface into release evidence.

Agents Shipgate reads declared local tool sources, normalizes them into a reviewable inventory, runs deterministic static checks, and writes a Tool-Use Readiness Report.
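
The read, normalize, check, report flow described above can be pictured with a small sketch. This is not Agents Shipgate's implementation: `ToolRecord`, `normalize_mcp`, `normalize_openapi`, and `build_inventory` are hypothetical names, and the input shapes are simplified assumptions about an MCP tools export and an OpenAPI document.

```python
from dataclasses import dataclass

# HTTP verbs that can appear as operation keys under an OpenAPI path item.
HTTP_METHODS = {"get", "head", "options", "put", "post", "delete", "patch"}
READ_METHODS = {"get", "head", "options"}


@dataclass(frozen=True)
class ToolRecord:
    """One row of a reviewable inventory (hypothetical shape)."""
    name: str
    source: str   # file the tool was declared in
    writes: bool  # True when the tool may change state


def normalize_mcp(export: dict, source: str) -> list[ToolRecord]:
    # Assumes an export shaped like {"tools": [{"name": ..., "annotations": {...}}]}.
    records = []
    for tool in export.get("tools", []):
        read_only = tool.get("annotations", {}).get("readOnlyHint", False)
        records.append(ToolRecord(tool["name"], source, writes=not read_only))
    return records


def normalize_openapi(spec: dict, source: str) -> list[ToolRecord]:
    records = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            name = op.get("operationId", f"{method.upper()} {path}")
            records.append(ToolRecord(name, source, writes=method not in READ_METHODS))
    return records


def build_inventory(*batches: list[ToolRecord]) -> list[ToolRecord]:
    # Sorting keeps the output order stable for identical inputs.
    return sorted((r for b in batches for r in b), key=lambda r: (r.source, r.name))
```

Because nothing here imports user code or calls a tool, the same inputs always produce the same inventory, which is the property a static review depends on.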

Company: Three Moons Lab
Product: Agents Shipgate
CLI / package / repo: agents-shipgate
Output: Tool-Use Readiness Report
Inputs
MCP exports
OpenAPI specs
SDK/framework metadata
OpenAI Agents SDK · Anthropic Messages API · Google ADK · LangChain/LangGraph · CrewAI · OpenAI Agents API
agents-shipgate
static · local · deterministic
Outputs
Markdown report
JSON report
GitHub Action summary
Optional SARIF for GitHub code scanning workflows.
Agent builders: Review MCP, OpenAPI, SDK, and framework tools before promotion.
Platform teams: Track approval, scope, idempotency, and baseline drift in PRs.
Security reviewers: Get static release evidence without running agents or importing user code.

What it checks before release.

The landing page shows the release-review categories; the full check catalog stays in the repo.

01 · approval

Approval gaps

Write, destructive, financial, or external communication tools without declared approval policies.
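
As a rough illustration of this kind of check, the sketch below flags risky tools whose categories are not covered by a declared approval policy. The tool shape, the category names, and the rule that every risky category must be covered are assumptions for the example, not the scanner's actual logic.

```python
# Categories treated as needing approval in this sketch (assumed names).
RISKY = {"write", "destructive", "financial", "external_communication"}


def approval_gaps(tools: list[dict], approval_required_for: list[str]) -> list[str]:
    """Return names of tools with a risky category that no policy covers."""
    covered = set(approval_required_for)
    gaps = []
    for tool in tools:
        uncovered = (set(tool["categories"]) & RISKY) - covered
        if uncovered:
            gaps.append(tool["name"])
    return sorted(gaps)
```

With a policy of `["destructive"]`, a refund tool tagged `financial` and `write` would still be reported, while a `destructive` cancel tool would not.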

02 · surface

Wildcard tool sources

Wildcard MCP or inventory sources that expose an unreviewable tool surface.
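
One way to make a wildcard source reviewable is to expand it into an explicit allowlist against a known tool export. The sketch below does that with the standard library; `is_wildcard` and `expand_to_allowlist` are hypothetical helpers, and glob-style matching is an assumption about how the patterns are written.

```python
from fnmatch import fnmatchcase


def is_wildcard(pattern: str) -> bool:
    # Glob metacharacters mean the set of matched tools is not fixed in the file.
    return any(ch in pattern for ch in "*?[")


def expand_to_allowlist(patterns: list[str], known_tools: list[str]) -> list[str]:
    """Replace each wildcard pattern with the concrete tool names it matches."""
    allowlist = set()
    for pattern in patterns:
        if is_wildcard(pattern):
            allowlist.update(t for t in known_tools if fnmatchcase(t, pattern))
        else:
            allowlist.add(pattern)
    return sorted(allowlist)
```

The expanded list can then be committed in place of the pattern, so a reviewer sees every tool name in the diff.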

03 · auth

Broad scopes

Manifest or tool permissions that rely on wildcard or overly broad authorization scopes.
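
A minimal sketch of a broad-scope check, assuming scopes are strings whose segments are separated by `:`, `.`, or `/`. The helper name and the separator convention are illustrative only.

```python
import re


def broad_scopes(scopes: list[str]) -> list[str]:
    """Return scopes where any segment is a bare wildcard."""
    flagged = []
    for scope in scopes:
        segments = re.split(r"[:./]", scope)
        if "*" in segments:
            flagged.append(scope)
    return flagged
```

For example, `["*", "tickets:read", "billing:*"]` would yield `["*", "billing:*"]`.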

04 · schema

Free-form action fields

Fields such as body, command, action, or updates that let the model control too much.
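
The idea can be sketched against a JSON Schema style tool input. The field names come from the description above; treating `enum`, `const`, or `pattern` as sufficient constraint is an assumption made for this example.

```python
FREE_FORM_NAMES = {"body", "command", "action", "updates"}
CONSTRAINT_KEYS = ("enum", "const", "pattern")


def free_form_fields(schema: dict) -> list[str]:
    """Return suspicious field names that carry no constraining keyword."""
    flagged = []
    for name, prop in schema.get("properties", {}).items():
        if name in FREE_FORM_NAMES and not any(k in prop for k in CONSTRAINT_KEYS):
            flagged.append(name)
    return sorted(flagged)
```

An `action` field limited by `enum` passes, while an unconstrained string `body` is reported.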

05 · bounds

Missing bounds

Unbounded arrays, objects, strings, or numeric fields on side-effecting operations.
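
A bounds check of this kind can be approximated by walking a JSON Schema and looking for limiting keywords. The keyword choices below are standard JSON Schema, but which ones count as "bounded" is an assumption for this sketch, and `missing_bounds` is a hypothetical helper.

```python
# Keywords that limit the size or range of each JSON Schema type.
BOUND_KEYS = {
    "array": ("maxItems",),
    "string": ("maxLength", "enum", "pattern", "format"),
    "integer": ("maximum", "exclusiveMaximum", "enum"),
    "number": ("maximum", "exclusiveMaximum", "enum"),
}


def is_bounded(schema: dict) -> bool:
    kind = schema.get("type")
    if not isinstance(kind, str):
        return True  # untyped or multi-typed schemas are out of scope here
    if kind == "object":
        return "maxProperties" in schema or schema.get("additionalProperties") is False
    keys = BOUND_KEYS.get(kind)
    return keys is None or any(k in schema for k in keys)


def missing_bounds(schema: dict, path: str = "$") -> list[str]:
    """Return paths of fields that have no limiting keyword."""
    found = [] if is_bounded(schema) else [path]
    for name, prop in schema.get("properties", {}).items():
        found.extend(missing_bounds(prop, f"{path}.{name}"))
    if isinstance(schema.get("items"), dict):
        found.extend(missing_bounds(schema["items"], f"{path}[]"))
    return found
```

In a real check this would only be applied to side-effecting operations, as the description says.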

06 · retry

Idempotency gaps

Write actions where retry behavior could duplicate refunds, updates, messages, or deletes.
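
As a simplified picture, a static check can ask whether a write operation declares any idempotency parameter. The parameter name below follows the common `Idempotency-Key` header convention; whether the scanner looks for this exact name is not stated here, so treat it as an assumption.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def lacks_idempotency(method: str, parameter_names: list[str]) -> bool:
    """True when a state-changing operation declares no idempotency key."""
    if method.upper() in SAFE_METHODS:
        return False
    # Normalize header-style and snake_case spellings to one form.
    names = {n.lower().replace("-", "_") for n in parameter_names}
    return "idempotency_key" not in names
```

A `POST` with only `amount`, `currency`, and `payment_id` would be reported; adding an `Idempotency-Key` parameter clears it.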

07 · static

Dynamic surfaces

Framework toolsets that cannot be statically reviewed without explicit inventory evidence.

08 · review

Owner and policy evidence

Tools missing reviewer-friendly ownership, scope, approval, or prohibited-action coverage.

09 · baseline

Baseline drift

New, matched, and resolved findings when a reviewed baseline is present.
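
The three buckets map naturally onto set operations. In this sketch a finding is identified by a `(check_id, tool)` pair, which is an assumed fingerprint rather than the scanner's real one.

```python
def diff_against_baseline(baseline, current) -> dict:
    """Classify finding fingerprints as new, matched, or resolved."""
    baseline, current = set(baseline), set(current)
    return {
        "new": sorted(current - baseline),       # present now, absent from baseline
        "matched": sorted(current & baseline),   # present in both
        "resolved": sorted(baseline - current),  # in baseline, gone now
    }
```

Only the `new` bucket needs fresh review in a PR, which is what keeps a reviewed baseline useful over time.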

Reports release owners can actually review.

Findings include severity, evidence, source reference, confidence, and a recommended next action. They are built for engineering and platform review, not for dashboards.
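
A finding with those attributes could be modeled roughly as below. The field names are hypothetical and are not taken from the report.json schema, which this page does not describe.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2}


@dataclass(frozen=True)
class Finding:
    check_id: str
    severity: str            # "critical", "high", or "medium"
    source_ref: str          # for example a file path and line
    title: str
    evidence: tuple[str, ...]
    recommendation: str
    confidence: str


def top_findings(findings: list[Finding], limit: int = 5) -> list[Finding]:
    """Order by severity, then source reference, and keep the first few."""
    ordered = sorted(findings, key=lambda f: (SEVERITY_ORDER[f.severity], f.source_ref))
    return ordered[:limit]
```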

agents-shipgate-reports / report.md
Tool-Use Readiness Report
Release blockers
Critical: 2 · High: 14 · Medium: 2 · Human review: recommended
Top findings
#01 Critical openapi/billing.yaml:142
stripe.create_refund lacks a declared approval policy
evidence: financial_action · external_write · POST /refunds
recommend: Add approval policy or remove from this release.
confidence: high · check_id: SHIP-POLICY-APPROVAL-MISSING
#02 Critical openapi/billing.yaml:142
stripe.create_refund lacks idempotency evidence
evidence: write action · amount/currency/payment_id schema · retry behavior unknown
recommend: Add idempotency key or document retry policy.
confidence: high · check_id: SHIP-IDEMPOTENCY-MISSING
#03 High shipgate.yaml:18
wildcard_mcp_tools.* exposes an unreviewable tool surface
evidence: wildcard tool source in shipgate.yaml
recommend: Replace wildcard with explicit allowlist.
confidence: high · check_id: SHIP-SOURCE-WILDCARD
#04 High mcp_tools.json:#/tools/send_email
support.send_email accepts free-form 'body' field
evidence: external_communication · no template binding
recommend: Constrain to template IDs or require human confirmation.
confidence: medium · check_id: SHIP-SCHEMA-FREEFORM-ACTION
#05 Medium openapi/tickets.yaml:88
tickets.update is missing maximum bound on 'fields'
evidence: unbounded object · broad write
recommend: Document or enforce a field allowlist.
confidence: medium · check_id: SHIP-SCHEMA-MISSING-BOUND
18 findings · support_refund_agent fixture · report_schema_version 0.5 · generated 2026-04-30T03:40Z

What it catches

The findings above fall into these categories; the full check catalog stays in the repo.

Wildcard tools · Approval gaps · Broad scopes · Free-form action fields · Missing bounds · Idempotency gaps · Dynamic surfaces · Baseline drift

Total findings: 18
Fixture: support_refund_agent
Sources: 3
Suppressed: 0

Proof from public tool surfaces, without turning the homepage into docs.

Four public examples show the scanner on realistic SDK/framework code and larger API surfaces. Each card keeps one representative finding visible and leaves full output in GitHub or docs.

OpenAI Agents SDK · 2 tools · High findings

Airline customer service agent

Static AST extraction finds a write-capable update_seat tool without enough release-review evidence.

Representative finding: update_seat changes customer state and needs explicit scope and policy coverage.
Config excerpt:
tool_sources:
  - id: openai_agents_sdk
    type: openai_agents_sdk
    path: main.py
environment:
  target: production_like

Anthropic Messages API · 3 tools · Critical

Cookbook customer service agent

A real published tool-use example includes cancel_order, a destructive action that needs approval evidence.

Representative finding: cancel_order is destructive and ships without a declared approval policy.
Config excerpt:
tool_sources:
  - id: anthropic_tools
    type: anthropic_messages
    path: tools.json
policies:
  approval_required_for: [destructive]

OpenAPI · 591 tools · Stress test

DigitalOcean public API as agent tools

A cloud infrastructure API reframed as a broad agent surface exposes irreversible droplet, database, and Kubernetes operations.

Representative finding: Destructive infrastructure actions are present without explicit approval policies.
Config excerpt:
tool_sources:
  - id: digitalocean_openapi
    type: openapi
    path: openapi.yaml
permissions:
  scopes: ["*"]

OpenAPI · 167 tools · Stress test

Twilio Messaging API purpose mismatch

A read-only manifest pointed at a messaging API still exposes DELETE-capable tools that contradict the declared purpose.

Representative finding: Read-only release intent conflicts with message and phone-number deletion operations.
Config excerpt:
agent:
  declared_purpose:
    - read messaging inventory
tool_sources:
  - id: twilio_openapi
    type: openapi
    path: messaging.yaml
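
The purpose-mismatch idea in the last example can be sketched as a scan for state-changing operations in an OpenAPI document whose manifest declares a read-only purpose. The helper name is hypothetical and the paths in the usage note are invented for illustration, not taken from the Twilio spec.

```python
MUTATING = {"post", "put", "patch", "delete"}


def mutating_operations(spec: dict) -> list[str]:
    """List operations that contradict a read-only declared purpose."""
    ops = []
    for path, item in sorted(spec.get("paths", {}).items()):
        # Intersect with the path item's keys to find state-changing verbs.
        for method in sorted(MUTATING.intersection(item)):
            ops.append(f"{method.upper()} {path}")
    return ops
```

For a spec with `GET` and `POST` on `/Messages` and `GET` and `DELETE` on `/Messages/{sid}`, the result lists the `POST` and the `DELETE`, each of which would conflict with "read messaging inventory".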

Run locally, add advisory PR checks, then tighten when ready.

CI is advisory by default. Strict mode can fail only on unsuppressed critical findings once your team has reviewed the baseline.
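
The difference between the two behaviors can be expressed as an exit-code rule. This is a sketch of the stated policy, not the Action's code: `ci_exit_code` is a hypothetical function, and `"strict"` is an assumed mode name, since only `ci_mode: advisory` appears on this page.

```python
def ci_exit_code(findings: list[dict], ci_mode: str = "advisory") -> int:
    """Return 0 to pass the build, 1 to fail it."""
    if ci_mode == "advisory":
        return 0  # advisory mode reports findings but never blocks
    blocking = [
        f for f in findings
        if f["severity"] == "critical" and not f.get("suppressed", False)
    ]
    return 1 if blocking else 0
```

High and medium findings, and critical findings that carry a suppression, do not fail the build under this rule.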

Local CLI
$ pipx install agents-shipgate
$ agents-shipgate init --workspace . --write
$ agents-shipgate scan -c shipgate.yaml

Requires Python 3.12+. Use python -m pip install agents-shipgate if pipx is not available.

.github/workflows/shipgate.yml
name: Agents Shipgate

on:
  pull_request:

permissions:
  contents: read

jobs:
  shipgate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<pinned-sha>
      - uses: ThreeMoonsLab/agents-shipgate@v0.5.1
        with:
          config: shipgate.yaml
          ci_mode: advisory
          output_dir: agents-shipgate-reports


Local CLI

Fast feedback while editing the manifest or tool definitions.

PR advisory

Generate Markdown and JSON artifacts without blocking merges.

Strict CI

Fail only on unsuppressed critical findings once a baseline is reviewed.

Release review

Give platform and security owners a named report to discuss.

Built for repositories that need careful review.

The scanner is static by default: it does not execute agents, import user code, run tools, call LLMs, connect to MCP servers, make network calls, or collect telemetry.

Designed to be safe to run before production tools are connected. Open-source core. Transparent checks. Suppressions require reasons.
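
The "suppressions require reasons" rule is easy to picture as a validation step. The suppression shape below (`check_id` plus `reason`) is an assumption for illustration; the manifest's real suppression format is not shown on this page.

```python
def invalid_suppressions(suppressions: list[dict]) -> list[str]:
    """Return check IDs of suppressions with a missing or blank reason."""
    return [
        s["check_id"]
        for s in suppressions
        if not (s.get("reason") or "").strip()
    ]
```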

Inspect the checks. Run the fixture. Open an issue with a false positive.

Default scanner guarantees
 Static by default
  • No agent execution
  • No user-code import by default
  • No tool calls
  • No LLM calls
  • No MCP server connections
  • No scanner network calls by default
  • No scanner telemetry by default
  • Apache-2.0 open source

Not an eval tool. Not observability. Not a gateway.

Evals test behavior. Observability records runtime. Gateways enforce access. Shipgate reviews the tool surface before release.

Not evals: They test behavior. Shipgate reviews release artifacts.
Not observability: It records runtime. Shipgate runs before promotion.
Not a gateway: Gateways enforce access. Shipgate produces review evidence.
Shipgate: A release gate for agent tool surfaces. Static checks. Findings with evidence.
Side-by-side
Category | What it answers | What Shipgate answers
Evals | Did the model behave as expected? | What tool surface are we releasing?
Observability | What happened at runtime? | What should be reviewed before promotion?
MCP gateway | Can a tool call be allowed at runtime? | Does this tool surface need release review?
Security scanner | Are there known code or dependency risks? | Are agent tools, schemas, scopes, and policies reviewable?
Governance platform | How do we manage org-level policy? | What static release findings exist in this repo today?

Common release-review questions.

Short answers for developers, platform teams, and security reviewers before trying the scanner.

Does it call my agent or send my data anywhere?

No. By default the scanner is static: no agent execution, no user-code import, no tool calls, no LLM calls, no MCP server connections, no scanner network calls, and no scanner telemetry.

Is Agents Shipgate production-ready?

v0.5.1 is the current public release. Use advisory mode first to collect review evidence, then move to stricter CI behavior once your team has reviewed the baseline and suppression process.

How is this different from observability or runtime guardrails?

Observability records what happened at runtime, and guardrails enforce access at runtime. Agents Shipgate runs earlier: it turns declared tool surfaces into static release-review evidence before promotion.

Does it certify my agent as safe?

No. Agents Shipgate is not a safety certification, runtime gateway, or behavioral eval. It produces deterministic findings from tool definitions, schemas, scopes, and declared policies so release owners have evidence to review.

Start with static release checks. Add release evidence over time.

Today, Agents Shipgate makes agent tool surfaces reviewable before release. The next layers are baselines, suppressions, release history, policy drift, re-review triggers, and runtime evidence integrations.

01 · now

Tool-surface release checks

CLI + GitHub Action.

02

Release evidence

Reports, baselines, history, exceptions.

03

Runtime integrations

Trace evidence without replacing static review.

Get started

Run a fixture in 60 seconds.

$ pipx install agents-shipgate
$ agents-shipgate fixture run support_refund_agent

Have a real agent? We'll review your tool surface and walk through the findings with your team. Email help@threemoonslab.com.