"Agent proxy" is becoming the catch-all term for any tool that intercepts an agent's outbound traffic and does something useful with it: inject credentials, filter requests, log traffic, enforce policy, mock responses, or all of the above. The category is wider than people think, and the eight tools below all show up in real production setups for genuinely different reasons.
This isn't a comparison post. It's a survey. Each entry tells you what the tool is, what it's actually for, where it shines, where it doesn't, and what to know before you commit to it. The goal is to help you map your problem to the right shape of tool before you start comparing features.
The eight, grouped by what they solve:
- Credential injection: Authsome, Agent Vault, Clawvisor, OneCLI
- General-purpose interception and inspection: mitmproxy
- API gateway with AI-aware features: Pomerium, Cloudflare AI Gateway
- Mocking and testing: WireMock
Authsome
What it is: Local-first credential broker built specifically for AI agents on personal and dev machines. Python CLI plus a local daemon. MIT licensed.
The problem it solves: You have AI agents (Claude Code, Cursor, custom Python agents) that need credentials for dozens of services (GitHub, Linear, OpenAI, Stripe, Slack, etc.). You don't want those credentials in env vars, .env files, or shell history. Authsome holds them encrypted, refreshes OAuth tokens, and injects them at the proxy boundary so the agent process never sees the raw secret.
What's good:
- 44 bundled providers means you can run `authsome login github` and the OAuth app, scopes, and refresh logic are all configured.
- Device code flow works out of the box for SSH sessions.
- Multi-account is modeled directly with named connections.
- Runs on the same machine as your agent. No separate host to deploy. No PostgreSQL.
What's not:
- Single-user oriented. There's no concept of "team admin sets policy that users inherit". For a team of 50 engineers, you'd run 50 instances.
- The CLI surface is small but learning when to use connections vs profiles takes a minute.
Pick it if: You're a solo developer or small team where agents run on individual laptops. You want a local-first tool that doesn't add infrastructure.
Agent Vault
What it is: Open-source credential broker by Infisical, built as a Go binary that runs as a separate service. Web UI on port 14321, MITM proxy on 14322. github.com/Infisical/agent-vault.
The problem it solves: Production AI agents running in containers or VMs, with the credential vault running on a separate host. Multi-tenant from the ground up. The architecture explicitly assumes "don't run this on the same machine as your agent" because the threat model includes the agent host being compromised.
What's good:
- Production-shaped: Docker-native deployment, multi-tenant, web UI included.
- TypeScript SDK for orchestrators that need to mint short-lived tokens for ephemeral sandboxed agents.
- The "different host from your agent" stance is the right security opinion for production.
- Backed by Infisical, which has a track record in secrets management.
What's not:
- Bring your own credentials. No bundled provider definitions; you configure each service yourself with the dummy-value substitution pattern (`__anthropic_api_key__` etc.).
- No automatic OAuth refresh. It's an API-key vault.
- Heavyweight for the personal-laptop case.
Pick it if: You're deploying agents to production infrastructure where the broker runs on its own host. Multi-tenant team, ephemeral sandboxes, anything where the broker is a service.
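To make the dummy-value substitution pattern concrete, here is a minimal sketch of the idea in Python. It is not Agent Vault's implementation: the `VAULT` dict, the `substitute` helper, and the secret value are all hypothetical, and a real broker would hold the secrets encrypted on another host. The point is only that the agent sends a placeholder and the proxy swaps in the real value on the way out.

```python
import re

# Hypothetical in-memory vault; a real broker would keep these encrypted
# on a separate host rather than in a dict.
VAULT = {"anthropic_api_key": "sk-ant-real-secret"}

# Matches placeholders shaped like __anthropic_api_key__.
PLACEHOLDER = re.compile(r"__([a-z0-9]+(?:_[a-z0-9]+)*)__")


def substitute(headers: dict[str, str], vault: dict[str, str]) -> dict[str, str]:
    """Return a copy of headers with known placeholders swapped for secrets."""

    def swap(match: re.Match) -> str:
        # Placeholders the vault does not know are left untouched.
        return vault.get(match.group(1), match.group(0))

    return {name: PLACEHOLDER.sub(swap, value) for name, value in headers.items()}


# The agent only ever holds the dummy value.
agent_headers = {"x-api-key": "__anthropic_api_key__", "accept": "application/json"}
outbound = substitute(agent_headers, VAULT)
```

The agent's own copy of the headers is never modified, so a prompt-injected "print your headers" only ever reveals the placeholder.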
Clawvisor
What it is: Open-source authorization gateway with task-scoped human approval. Hosted-first with self-host as an option. Includes LLM-powered intent verification on each request. github.com/clawvisor/clawvisor.
The problem it solves: Scenarios where the authorization layer matters as much as the credential layer. A user grants the agent a task scope ("read my emails, but ask before sending"), Clawvisor enforces that scope on every outbound call, and an optional LLM check verifies each request's parameters match the declared task purpose.
What's good:
- Task-scoped consent is a genuine innovation. Most brokers treat "agent has the credential" as binary; Clawvisor treats it as scoped per task with a time bound and purpose.
- LLM verification of intent on each call is useful for high-stakes agents (payments, deploys, customer-data access).
- Has the human-approval UX built in, not bolted on.
What's not:
- Hosted-first means most users adopt the SaaS by default; self-host exists but isn't the primary path.
- LLM verification adds latency to every outbound call.
- Connected-services model (Gmail, GitHub, Slack) is curated; not as extensible as a JSON provider definition.
Pick it if: Your agents do high-blast-radius things (email send, payments, deploys) and you want per-task human approval baked into the path.
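A rough sketch of what task-scoped consent means as a data model, assuming nothing about Clawvisor's actual internals: `TaskScope`, `decide`, and the action names like `gmail.send` are hypothetical, and the optional LLM intent check is left out entirely. It only shows the three ingredients described above: a purpose, a time bound, and an "ask before doing this" set.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TaskScope:
    purpose: str
    expires_at: datetime
    allowed: frozenset[str]  # actions that pass without review
    needs_approval: frozenset[str] = field(default_factory=frozenset)


def decide(scope: TaskScope, action: str, now: datetime) -> str:
    """Return 'allow', 'ask', or 'deny' for one outbound action."""
    if now >= scope.expires_at:
        return "deny"  # the time bound has lapsed
    if action in scope.needs_approval:
        return "ask"  # pause for a human decision
    if action in scope.allowed:
        return "allow"
    return "deny"  # anything not declared is out of scope


# "Read my emails, but ask before sending", valid for one hour.
start = datetime(2026, 1, 1, 9, 0, tzinfo=timezone.utc)
scope = TaskScope(
    purpose="triage my inbox",
    expires_at=start + timedelta(hours=1),
    allowed=frozenset({"gmail.read"}),
    needs_approval=frozenset({"gmail.send"}),
)
```

Note the default: an action that was never declared is denied rather than allowed, which is the difference between scoped consent and "agent has the credential".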
OneCLI
What it is: Open-source credential gateway with a Rust HTTP gateway plus a Next.js dashboard. PostgreSQL backend, AES-256-GCM at rest, Bitwarden integration for pull-on-demand secrets. Apache-2.0 licensed. github.com/onecli/onecli.
The problem it solves: Multi-tenant gateway with a UI-first management experience. Similar architecture to Agent Vault but with a richer dashboard and the ability to fetch credentials on-demand from Bitwarden rather than storing them in the broker.
What's good:
- Rust gateway is fast and memory-efficient.
- Bitwarden integration is a clean "no secrets stored in the broker itself" option.
- Two-mode operation: single-user (no login) for local, or Google OAuth for teams.
- Solid documentation and example deploys.
What's not:
- PostgreSQL dependency is heavy for personal-laptop use (you run a database for one user).
- No automatic OAuth refresh; bring your own credentials.
- Dashboard-first means CLI users get a slightly less complete experience.
Pick it if: You're already running Postgres, you want a UI-first credential gateway, and the Bitwarden integration appeals.
mitmproxy
What it is: The OG HTTP/HTTPS interception proxy. Not AI-specific; it's a general-purpose tool that's been around for years. Used as the underlying engine in several of the agent brokers above. mitmproxy.org.
The problem it solves: You need to see what an agent is actually doing on the network. Every request, every response, every header. Useful for debugging "why is my agent failing on this API call", reverse-engineering an undocumented endpoint, or just sanity-checking that a credential is being sent the way you think it is.
What's good:
- Fully scriptable in Python. Custom interception logic in 10 lines of code.
- TUI, web UI, and CLI all included.
- Mature, well-documented, free.
- The reference implementation of TLS interception. Most other tools (including Authsome's runtime proxy layer) rely on the same patterns mitmproxy pioneered.
What's not:
- Not opinionated about agents specifically. You'd write your own credential injection logic.
- The CA cert install is the same friction every TLS-intercepting tool has.
Pick it if: You want a general-purpose interception layer you can script. Or you want to inspect what your agent is doing in real time during debugging. Or both, simultaneously.
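For a sense of what "write your own credential injection logic" looks like, here is a minimal addon sketch. The `request` hook, `flow.request.pretty_host`, `flow.request.headers`, and the module-level `addons` list are standard mitmproxy addon conventions; the host-to-variable mapping and the choice to read the token from the proxy's own environment are assumptions made for illustration, standing in for a real secret store.

```python
"""inject_auth.py: a minimal mitmproxy addon sketch."""

import os

# Hypothetical mapping of upstream host -> environment variable holding its token.
# The variable lives in the proxy's environment, not the agent's.
TOKEN_ENV_BY_HOST = {"api.github.com": "GITHUB_TOKEN"}


class InjectAuth:
    def request(self, flow) -> None:
        # mitmproxy calls this hook once the full client request has been read,
        # before it is forwarded upstream.
        env_name = TOKEN_ENV_BY_HOST.get(flow.request.pretty_host)
        token = os.environ.get(env_name) if env_name else None
        if token:
            flow.request.headers["Authorization"] = f"Bearer {token}"


# mitmproxy discovers addons through this module-level list.
addons = [InjectAuth()]
```

You would load it with `mitmdump -s inject_auth.py` and point the agent at the proxy. Requests to hosts that aren't in the mapping pass through unchanged.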
Pomerium
What it is: Identity-aware access proxy. Originally built for zero-trust access to internal services, increasingly adopted for "agent identity" workflows. pomerium.com.
The problem it solves: Per-request authorization for HTTP traffic, based on the identity making the request. In an agent context, that means: "this agent identity is allowed to call this internal service, but not that one; only with these scopes; only during these times". The agent doesn't need to authenticate to each internal service; Pomerium handles it.
What's good:
- Mature identity model. Plugs into existing SSO (Google, Okta, Auth0, etc.).
- Strong observability: every request logged with the identity that made it.
- Open source core, commercial enterprise tier.
- Designed for production from day one.
What's not:
- More infrastructure than a personal-laptop broker. Real deployment, real config, real ops.
- Not built specifically for AI agents; you're using a zero-trust gateway in an agent-shaped way.
- Doesn't model OAuth refresh against external APIs (it handles inbound auth, not outbound).
Pick it if: You're running agents inside an existing zero-trust environment and you want them to inherit the same access model your humans have.
Cloudflare AI Gateway
What it is: Cloudflare-managed proxy specifically for LLM provider traffic (OpenAI, Anthropic, Google, etc.). Sits between your app and the model provider. developers.cloudflare.com/ai-gateway.
The problem it solves: You're making LLM API calls at production scale and you want a layer that handles rate limiting, caching, retries, request logging, cost tracking, and observability without you building it. Also handy for falling back to a different provider when one is degraded.
What's good:
- Operationally robust: Cloudflare runs it, so you inherit their reliability.
- Cost-aware: dashboard tracks per-model spend, useful when several agents share a pool.
- Caching: identical prompts get cached responses, real money saver for repeated calls.
- Multi-provider fallback: if Anthropic is throttling you, route the same request to OpenAI.
What's not:
- Specifically for LLM provider traffic. Doesn't help with GitHub, Linear, Stripe, etc.
- SaaS by design. You can't self-host it.
- The credential layer is "you give us your provider keys"; not a broker pattern, more a passthrough with observability.
Pick it if: You're doing high-volume LLM calls and want observability + caching + fallback without building it yourself. Pairs well with a credential broker for the non-LLM traffic.
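The caching and fallback behavior is easy to picture with a toy model. This is a sketch of the general technique, not Cloudflare's API or implementation: the `Gateway` class and the two stand-in providers are hypothetical, and a real gateway would also translate the request between provider formats, which is skipped here.

```python
import hashlib
import json
from typing import Callable

# A provider is any callable that takes a request dict and returns a response dict.
Provider = Callable[[dict], dict]


class Gateway:
    def __init__(self, providers: list[tuple[str, Provider]]):
        self.providers = providers  # tried in order
        self.cache: dict[str, dict] = {}

    @staticmethod
    def cache_key(request: dict) -> str:
        # Canonical JSON so that key order does not affect the hash.
        body = json.dumps(request, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(body.encode()).hexdigest()

    def complete(self, request: dict) -> dict:
        key = self.cache_key(request)
        if key in self.cache:
            return {**self.cache[key], "cached": True}
        errors = []
        for name, call in self.providers:
            try:
                response = {**call(request), "provider": name, "cached": False}
            except Exception as exc:  # e.g. throttling or an outage
                errors.append(f"{name}: {exc}")
                continue
            self.cache[key] = response
            return response
        raise RuntimeError("all providers failed: " + "; ".join(errors))


def throttled(request: dict) -> dict:
    raise TimeoutError("429 from upstream")


def healthy(request: dict) -> dict:
    return {"text": "hello"}


gateway = Gateway([("primary", throttled), ("secondary", healthy)])
first = gateway.complete({"prompt": "hi"})   # falls back to the second provider
second = gateway.complete({"prompt": "hi"})  # identical request, served from cache
```

The second call never reaches a provider at all, which is where the cost saving comes from.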
WireMock
What it is: Mock/stub HTTP server, used in testing. Apache-2.0 licensed. wiremock.org.
The problem it solves: You're testing an agent and you don't want to actually call the real API. Maybe the API costs money per call. Maybe the API rate-limits you. Maybe you want to test edge cases that are hard to reproduce against a live service. WireMock pretends to be the target API and returns canned responses based on rules you define.
What's good:
- Mature, used in enterprise testing for years.
- Rich matching DSL (headers, query params, body content).
- Can record/replay against a real upstream for capturing fixtures.
- Java-based but everyone speaks HTTP, so polyglot teams use it.
What's not:
- Not for production traffic. Strictly test/CI.
- Not AI-specific; you write your own response rules per endpoint.
Pick it if: You're testing agent behavior against API responses and want deterministic, cheap, fast fixtures. Pair with mitmproxy if you want to capture real traffic and turn it into WireMock stubs.
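As an illustration of "canned responses based on rules you define", here is a sketch that builds a WireMock stub mapping and posts it to the admin endpoint at `/__admin/mappings`. The helper names, the example path, and the response body are made up for the example; check the mapping fields against the WireMock docs for the version you run.

```python
import json
import urllib.request


def build_stub(path: str, status: int, body: dict) -> dict:
    """Build a WireMock stub mapping for a GET on `path` that returns JSON."""
    return {
        "request": {"method": "GET", "urlPath": path},
        "response": {
            "status": status,
            "headers": {"Content-Type": "application/json"},
            "jsonBody": body,
        },
    }


def register_stub(admin_base: str, mapping: dict) -> int:
    """POST the mapping to a running WireMock instance; returns the HTTP status."""
    req = urllib.request.Request(
        f"{admin_base}/__admin/mappings",
        data=json.dumps(mapping).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Simulate a rate-limited upstream, an edge case that is awkward to trigger live.
stub = build_stub("/repos/acme/widgets/issues", 429, {"message": "rate limited"})
```

With a WireMock instance running locally, `register_stub("http://localhost:8080", stub)` would install the rule (the address is an assumption about your setup), and the agent under test then sees a 429 every time it hits that path.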
How to think about combining them
Most production setups use more than one of these. The combinations I see in practice:
- Authsome + Cloudflare AI Gateway: Authsome for the non-LLM provider credentials (GitHub, Linear, Stripe), Cloudflare in front of LLM traffic for caching and cost.
- Agent Vault + Pomerium: Agent Vault for outbound API credentials, Pomerium for inbound access from agents to internal services.
- OneCLI + WireMock: OneCLI in production, WireMock in CI, same set of "endpoints" configured in both.
- mitmproxy alongside any of the above: just for debugging. Spin it up when something breaks, tear it down when fixed.
The brokers (Authsome, Agent Vault, Clawvisor, OneCLI) are interchangeable for the core "intercept outbound, inject credentials" job. Pick one based on the deployment shape that matches your team. For a side-by-side feature comparison of the four brokers, see Agent credential brokers in 2026.
The non-brokers (mitmproxy, Pomerium, Cloudflare AI Gateway, WireMock) solve adjacent problems and compose nicely with whatever broker you pick.
What this category is missing
A few gaps I've noticed in the tooling so far:
- A standardized authorization model. Each broker has its own concept of identity, scope, and permission. There's no equivalent of "OPA but for agent proxies".
- First-class MCP awareness. Most of these tools predate MCP and treat MCP servers as just another HTTP endpoint. A broker that natively understood "this is an MCP `tools/call` request, here's the tool name, apply policy" would be useful.
- Better cross-tool observability. Each tool has its own dashboard. There's no Grafana-style "single pane of glass for everything that touched an agent".
The category is young. These gaps will close. Worth knowing what's missing today so you can plan for what you'll need tomorrow.
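To show how small the MCP-aware piece could be, here is a sketch of the check a broker might run on each request body. It relies only on the fact that MCP tool invocations are JSON-RPC messages with `method` set to `tools/call` and the tool name in `params.name`; the allow list, the tool names, and the `check_mcp_request` helper are hypothetical, and no tool listed above is claimed to work this way.

```python
import json

# Hypothetical per-agent allow list of MCP tool names.
ALLOWED_TOOLS = {"search_issues", "read_file"}


def check_mcp_request(raw_body: bytes, allowed: set[str]) -> tuple[bool, str]:
    """Return (allowed, reason) for one JSON-RPC message sent to an MCP server."""
    try:
        message = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return False, "body is not JSON"
    if not isinstance(message, dict) or message.get("method") != "tools/call":
        return True, "not a tools/call request"  # e.g. initialize, tools/list
    params = message.get("params")
    tool = params.get("name") if isinstance(params, dict) else None
    if isinstance(tool, str) and tool in allowed:
        return True, f"tool {tool!r} is allowed"
    return False, f"tool {tool!r} is not allowed"
```

A real policy would also want to look at the tool's `arguments`, but even name-level filtering is more than "just another HTTP endpoint" gives you.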
FAQ
Do I need a broker if my agent only calls one provider?
If the one provider has stable, long-lived credentials and you're not worried about prompt-injection exfiltrating them, you can probably get by with environment variables. Add a broker the moment you have a second provider or a real production deployment.
Can I write my own broker?
You can. You shouldn't, unless you have a very specific need. Between them, the off-the-shelf brokers above already handle TLS interception, OAuth refresh, multi-account, audit logging, and the operational gotchas, though as noted not every broker covers every item. Rolling your own is a project, not a weekend.
Are these tools mutually exclusive with vendor SDKs?
No. The brokers work underneath whatever SDK your agent uses. The SDK makes a normal HTTPS call; the broker intercepts it. The SDK doesn't know the broker exists.
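One common way this works in practice, sketched under assumptions: the agent process is launched with proxy and CA environment variables that most HTTP clients read on their own. `HTTPS_PROXY`, `SSL_CERT_FILE`, and `REQUESTS_CA_BUNDLE` are real conventions, but which ones a given SDK honors varies, and the `proxied_env` helper is hypothetical rather than anything a specific broker ships.

```python
import os
from typing import Optional


def proxied_env(proxy_url: str, ca_bundle: str,
                base: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Copy an environment and point HTTPS clients in it at a local proxy."""
    env = dict(os.environ if base is None else base)
    # Clients differ on which spelling they read, so set both.
    env["HTTPS_PROXY"] = env["https_proxy"] = proxy_url
    # Trust the proxy's CA so intercepted TLS connections still verify.
    env["SSL_CERT_FILE"] = ca_bundle       # OpenSSL-based clients
    env["REQUESTS_CA_BUNDLE"] = ca_bundle  # Python requests
    return env
```

You would pass the result as `env=` to `subprocess.run` when starting the agent, with the proxy address and CA path taken from whatever broker you run. The agent's code and its SDK stay unchanged.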
Which one runs on Windows?
Authsome and OneCLI explicitly support Windows. mitmproxy works on Windows. Agent Vault and Clawvisor's self-host paths assume Linux/macOS but should work in WSL. Cloudflare AI Gateway is SaaS so platform doesn't matter.
What about Datadog / New Relic / observability tools?
Different category. Observability tools watch your traffic; they don't intercept and modify it. They pair well with any broker (you point your APM at the broker's logs, you get per-call attribution).
Is there a way to test which tool is right for me?
Start with the broker that matches your deployment shape (laptop = Authsome, production service = Agent Vault or OneCLI, high-stakes approvals = Clawvisor). Run for two weeks. The friction points will tell you what's wrong faster than any comparison post.
Summary
The "agent proxy" category is wider than any single tool. Credential brokers (Authsome, Agent Vault, Clawvisor, OneCLI) handle the core "keep secrets out of the agent" problem. General-purpose interception (mitmproxy) is for debugging. Identity-aware gateways (Pomerium) handle inbound access. AI-specific gateways (Cloudflare AI Gateway) optimize LLM-provider traffic. Mocking tools (WireMock) handle testing.
The mistake is assuming one tool solves all of it. The right answer is usually a broker for credentials, plus one or two adjacent tools for the specific operational concerns (testing, debugging, internal access, LLM-specific observability) that your setup actually has.
If you're just starting out, pick the broker first. Everything else can wait until you have a real reason to add it.