Open-source CLI and GitHub Action that scans MCP exports, OpenAPI specs, and SDK metadata, and writes a Tool-Use Readiness Report before your agent gets production-like tools.
A 60-second path to a real Tool-Use Readiness Report once Python and pipx are available. Requires Python 3.12+.
$ pipx install agents-shipgate
$ agents-shipgate fixture run support_refund_agent
Runs a bundled refund-support fixture and writes local report artifacts.
$ agents-shipgate init --workspace . --write
$ agents-shipgate scan -c shipgate.yaml
Creates a manifest, reads local tool sources, and generates a report for review.
- uses: ThreeMoonsLab/agents-shipgate@v0.5.1
  with:
    config: shipgate.yaml
    ci_mode: advisory
Adds PR evidence without failing builds while your team triages findings.
agents-shipgate-reports/report.md
agents-shipgate-reports/report.json
SARIF is available through the GitHub Action or scan output format configuration.
Agents Shipgate reads declared local tool sources, normalizes them into a reviewable inventory, runs deterministic static checks, and writes a Tool-Use Readiness Report.
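As a rough illustration of what a deterministic static check can look like, the sketch below flags tools whose side effects are not covered by a declared approval policy. The `Tool` dataclass, the effect tags, and `missing_approval` are hypothetical names invented for this example; they are not the Agents Shipgate internals or its manifest schema.

```python
from dataclasses import dataclass, field

# Effect tags this sketch treats as needing approval (an assumed list that
# mirrors the write/destructive/financial/external-communication categories).
RISKY_EFFECTS = {"write", "destructive", "financial", "external_communication"}


@dataclass(frozen=True)
class Tool:
    """Hypothetical normalized inventory entry."""
    name: str
    effects: frozenset = field(default_factory=frozenset)


def missing_approval(tools, approval_required_for):
    """Return names of tools with a risky effect that no approval policy covers."""
    covered = set(approval_required_for)
    flagged = []
    # Sort by name so the same input always yields the same output order.
    for tool in sorted(tools, key=lambda t: t.name):
        uncovered = (tool.effects & RISKY_EFFECTS) - covered
        if uncovered:
            flagged.append(tool.name)
    return flagged


if __name__ == "__main__":
    inventory = [
        Tool("stripe.create_refund", frozenset({"financial", "write"})),
        Tool("support.lookup_customer", frozenset()),
    ]
    print(missing_approval(inventory, approval_required_for=["destructive"]))
```

Because the function only inspects declared metadata and sorts its input, it never runs a tool and returns the same result on every run, which is the property the static checks rely on.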
The landing page shows the release-review categories; the full check catalog stays in the repo.
Write, destructive, financial, or external communication tools without declared approval policies.
Wildcard MCP or inventory sources that expose an unreviewable tool surface.
Manifest or tool permissions that rely on wildcard or overly broad authorization scopes.
Free-form fields such as body, command, action, or updates that give the model open-ended control over a tool call.
Unbounded arrays, objects, strings, or numeric fields on side-effecting operations.
Write actions where retry behavior could duplicate refunds, updates, messages, or deletes.
Framework toolsets that cannot be statically reviewed without explicit inventory evidence.
Tools missing reviewer-friendly ownership, scope, approval, or prohibited-action coverage.
New, matched, and resolved findings when a reviewed baseline is present.
Findings include severity, evidence, source reference, confidence, and a recommended next action — built for engineering and platform review, not for dashboards.
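The new, matched, and resolved grouping can be pictured as a set comparison between a reviewed baseline and the current scan. The sketch below assumes a finding is identified by its `check_id` and `tool` pair; that matching key and the function names are assumptions for illustration, not the matching rule Agents Shipgate actually uses.

```python
def finding_key(finding):
    """Identity used to match findings across scans (an assumption for this sketch)."""
    return (finding["check_id"], finding["tool"])


def diff_against_baseline(baseline, current):
    """Split findings into new, matched, and resolved relative to a baseline."""
    base_keys = {finding_key(f) for f in baseline}
    curr_keys = {finding_key(f) for f in current}
    return {
        "new": sorted(curr_keys - base_keys),       # in this scan only
        "matched": sorted(curr_keys & base_keys),   # in both scans
        "resolved": sorted(base_keys - curr_keys),  # in the baseline only
    }


if __name__ == "__main__":
    baseline = [
        {"check_id": "SHIP-SCHEMA-BOUNDS", "tool": "support.lookup_customer"},
        {"check_id": "SHIP-POLICY-APPROVAL-MISSING", "tool": "stripe.create_refund"},
    ]
    current = [
        {"check_id": "SHIP-POLICY-APPROVAL-MISSING", "tool": "stripe.create_refund"},
        {"check_id": "SHIP-SIDE-EFFECT-IDEMPOTENCY", "tool": "stripe.create_refund"},
    ]
    for group, keys in diff_against_baseline(baseline, current).items():
        print(group, keys)
```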
$ agents-shipgate fixture run support_refund_agent
loading shipgate.yaml ........................ ok
loading fixture support_refund_agent .......... ok
reading shipgate.yaml ......................... ok
reading local tool inventories ................ ok
normalizing inventory ......................... 8 tools, 3 sources
running 18 static checks ...................... done
Status: Release blockers detected
Critical: 2 High: 14 Medium: 2
Human review: recommended
Top findings
crit stripe.create_refund · approval policy missing
crit stripe.create_refund · idempotency evidence missing
high wildcard_mcp_tools.* · unreviewable tool surface
high support.send_email · free-form body field
... 14 more findings
→ wrote agents-shipgate-reports/report.md
→ wrote agents-shipgate-reports/report.json
→ exit 0 (advisory mode)
// agents-shipgate-reports/report.json
{
  "schema_version": "0.1",
  "report_schema_version": "0.5",
  "summary": {
    "status": "blockers",
    "critical_count": 2,
    "high_count": 14,
    "medium_count": 2,
    "human_review_recommended": true
  },
  "generated_reports": ["report.md", "report.json"],
  "tool_inventory": {
    "tool_count": 8,
    "source_count": 3
  },
  "source_warnings": [],
  "findings": [
    {
      "id": "finding-001",
      "severity": "critical",
      "tool": "stripe.create_refund",
      "check_id": "SHIP-POLICY-APPROVAL-MISSING",
      "evidence": ["financial_action", "external_write"],
      "source": "samples/support_refund_agent/shipgate.yaml",
      "confidence": "high",
      "recommendation": "Add an approval policy or remove this tool from the release."
    },
    {
      "id": "finding-002",
      "severity": "high",
      "tool": "stripe.create_refund",
      "check_id": "SHIP-SIDE-EFFECT-IDEMPOTENCY",
      "evidence": ["external_write", "retryable_side_effect"],
      "source": "samples/support_refund_agent/shipgate.yaml",
      "confidence": "medium",
      "recommendation": "Declare idempotency evidence or document retry handling."
    },
    {
      "id": "finding-003",
      "severity": "medium",
      "tool": "support.lookup_customer",
      "check_id": "SHIP-SCHEMA-BOUNDS",
      "evidence": ["unbounded_string"],
      "source": "samples/support_refund_agent/shipgate.yaml",
      "confidence": "medium",
      "recommendation": "Add schema bounds for customer lookup inputs."
    }
  ]
}
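Because the JSON report is plain data, reviewers can post-process it with a few lines of scripting. The sketch below groups check IDs by tool using only field names that appear in the sample report above (`findings`, `tool`, `check_id`, `summary.status`); the helper name `checks_by_tool` is made up for this example, and the report path assumes the default output directory shown earlier.

```python
import json
from collections import defaultdict
from pathlib import Path


def checks_by_tool(report):
    """Map each tool name to the sorted check IDs reported against it."""
    grouped = defaultdict(list)
    for finding in report.get("findings", []):
        grouped[finding["tool"]].append(finding["check_id"])
    return {tool: sorted(ids) for tool, ids in sorted(grouped.items())}


if __name__ == "__main__":
    path = Path("agents-shipgate-reports/report.json")
    if path.exists():
        report = json.loads(path.read_text(encoding="utf-8"))
        print("status:", report.get("summary", {}).get("status", "unknown"))
        for tool, ids in checks_by_tool(report).items():
            print(f"{tool}: {', '.join(ids)}")
    else:
        print(f"{path} not found; run a scan first")
```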
Four public examples show the scanner on realistic SDK/framework code and larger API surfaces. Each card keeps one representative finding visible and leaves full output in GitHub or docs.
Static AST extraction finds a write-capable update_seat tool without enough release-review evidence.
update_seat changes customer state and needs explicit scope and policy coverage.
tool_sources:
  - id: openai_agents_sdk
    type: openai_agents_sdk
    path: main.py
environment:
  target: production_like
A real published tool-use example includes cancel_order, a destructive action that needs approval evidence.
cancel_order is destructive and ships without a declared approval policy.
tool_sources:
  - id: anthropic_tools
    type: anthropic_messages
    path: tools.json
policies:
  approval_required_for: [destructive]
A cloud infrastructure API reframed as a broad agent surface exposes irreversible droplet, database, and Kubernetes operations.
tool_sources:
  - id: digitalocean_openapi
    type: openapi
    path: openapi.yaml
permissions:
  scopes: ["*"]
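A wildcard-scope check like the one this example triggers can be very small. The sketch below treats any scope string containing `*` as overly broad; that rule and the function name `broad_scopes` are assumptions for illustration, and the real check may use a different definition of "broad".

```python
def broad_scopes(scopes):
    """Return the declared scopes that contain a wildcard, in declaration order."""
    return [scope for scope in scopes if "*" in scope]


if __name__ == "__main__":
    # Mirrors `permissions.scopes: ["*"]` from the manifest excerpt.
    print(broad_scopes(["*"]))
    # A hypothetical narrower manifest with one remaining wildcard.
    print(broad_scopes(["droplets:read", "databases:*"]))
```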
A read-only manifest pointed at a messaging API still exposes DELETE-capable tools that contradict the declared purpose.
agent:
  declared_purpose:
    - read messaging inventory
tool_sources:
  - id: twilio_openapi
    type: openapi
    path: messaging.yaml
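One way to picture the purpose-versus-surface contradiction is to compare the declared purpose with the HTTP methods in the OpenAPI document. The sketch below is a simplified heuristic, not the scanner's logic: it assumes a purpose is read-only when every entry starts with "read", and it takes the spec as an already-parsed dictionary.

```python
# HTTP methods that can change state in a typical REST API.
MUTATING_METHODS = {"post", "put", "patch", "delete"}


def mutating_operations(openapi_spec):
    """List (METHOD, path) pairs for operations that can change state."""
    found = []
    for path, item in sorted(openapi_spec.get("paths", {}).items()):
        for method in sorted(item):
            # Path items also hold non-method keys such as "parameters".
            if method.lower() in MUTATING_METHODS:
                found.append((method.upper(), path))
    return found


def contradicts_read_only(declared_purpose, openapi_spec):
    """Return mutating operations when the declared purpose is read-only."""
    # Assumed heuristic: read-only means every purpose entry starts with "read".
    read_only = bool(declared_purpose) and all(
        entry.lower().startswith("read") for entry in declared_purpose
    )
    return mutating_operations(openapi_spec) if read_only else []


if __name__ == "__main__":
    spec = {
        "paths": {
            "/Messages": {"get": {}},
            "/Messages/{sid}": {"get": {}, "delete": {}},
        }
    }
    print(contradicts_read_only(["read messaging inventory"], spec))
```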
CI is advisory by default, so findings are reported without failing the build. Strict mode fails the build only on unsuppressed critical findings and is meant for use once your team has reviewed the baseline.
$ pipx install agents-shipgate
$ agents-shipgate init --workspace . --write
$ agents-shipgate scan -c shipgate.yaml
Requires Python 3.12+. Use python -m pip install agents-shipgate if pipx is not available.
name: Agents Shipgate
on:
  pull_request:
permissions:
  contents: read
jobs:
  shipgate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<pinned-sha>
      - uses: ThreeMoonsLab/agents-shipgate@v0.5.1
        with:
          config: shipgate.yaml
          ci_mode: advisory
          output_dir: agents-shipgate-reports
Fast feedback while editing the manifest or tool definitions.
Generate Markdown and JSON artifacts without blocking merges.
Fail only on unsuppressed critical findings once a baseline is reviewed.
Give platform and security owners a named report to discuss.
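The difference between advisory and strict behavior can be modeled as a small gating function. This is a sketch under stated assumptions rather than the Action's implementation: the `"strict"` mode value, the non-zero exit code of 1, and the `suppressed_ids` parameter are hypothetical, while `"advisory"` and exit code 0 match the configuration and sample output shown on this page.

```python
def ci_exit_code(findings, ci_mode="advisory", suppressed_ids=frozenset()):
    """Return a process exit code for a scan result under the given CI mode."""
    if ci_mode == "advisory":
        return 0  # advisory mode reports findings but never fails the build
    if ci_mode != "strict":  # "strict" is an assumed value for this sketch
        raise ValueError(f"unknown ci_mode: {ci_mode!r}")
    blocking = [
        f for f in findings
        if f["severity"] == "critical" and f["id"] not in suppressed_ids
    ]
    return 1 if blocking else 0


if __name__ == "__main__":
    findings = [
        {"id": "finding-001", "severity": "critical"},
        {"id": "finding-002", "severity": "high"},
    ]
    print(ci_exit_code(findings, "advisory"))
    print(ci_exit_code(findings, "strict"))
    print(ci_exit_code(findings, "strict", suppressed_ids={"finding-001"}))
```

In this model, high and medium findings never block a merge on their own, and suppressing a critical finding after review is the only way to clear it without fixing it.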
The scanner is static by default. It does not execute agents, import user code, run tools, call LLMs, connect to MCP servers, make scanner network calls, or collect scanner telemetry by default.
Designed to be safe to run before production tools are connected. Open-source core. Transparent checks. Suppressions require reasons.
Inspect the checks. Run the fixture. Open an issue with a false positive.
Evals test behavior. Observability records runtime. Gateways enforce access. Shipgate reviews the tool surface before release.
| Category | What it answers | What Shipgate answers |
|---|---|---|
| Evals | Did the model behave as expected? | What tool surface are we releasing? |
| Observability | What happened at runtime? | What should be reviewed before promotion? |
| MCP gateway | Can a tool call be allowed at runtime? | Does this tool surface need release review? |
| Security scanner | Are there known code or dependency risks? | Are agent tools, schemas, scopes, and policies reviewable? |
| Governance platform | How do we manage org-level policy? | What static release findings exist in this repo today? |
For the runtime boundary, read Agents Shipgate vs runtime guardrails.
Short answers for developers, platform teams, and security reviewers before trying the scanner.
No. The scanner is static by default: no agent execution, no user-code import by default, no tool calls, no LLM calls, no MCP server connections, no scanner network calls by default, and no scanner telemetry by default.
v0.5.1 is the current public release. Use advisory mode first to collect review evidence, then move to stricter CI behavior once your team has reviewed the baseline and suppression process.
Observability records what happened at runtime, and guardrails enforce access at runtime. Agents Shipgate runs earlier: it turns declared tool surfaces into static release-review evidence before promotion.
No. Agents Shipgate is not a safety certification, runtime gateway, or behavioral eval. It produces deterministic findings from tool definitions, schemas, scopes, and declared policies so release owners have evidence to review.
Today, Agents Shipgate makes agent tool surfaces reviewable before release. The next layers are baselines, suppressions, release history, policy drift, re-review triggers, and runtime evidence integrations.
CLI + GitHub Action.
Reports, baselines, history, exceptions.
Trace evidence without replacing static review.
$ pipx install agents-shipgate
$ agents-shipgate fixture run support_refund_agent
Have a real agent? We'll review your tool surface and walk through the findings with your team — help@threemoonslab.com