API key management for AI agents: the complete 2026 guide

You wired an agent to something real. Claude Code can now open pull requests. Cursor talks to your GitHub MCP server. A LangChain script reads your inbox and answers tickets. Somewhere along the way you exported OPENAI_API_KEY, GITHUB_TOKEN, and maybe STRIPE_SECRET_KEY into your shell, the same keys a human used to hold, and now an autonomous process that can read files, spawn subprocesses, and call arbitrary URLs holds them too.

You have read "use environment variables" a hundred times. It is starting to feel insufficient, and you are right to be uneasy. This is the page I wish existed: why agent key management is genuinely different from app key management, the full ladder of options with honest tradeoffs, a decision you can act on today, and a best-practices checklist anchored to the 2026 frameworks that matter.

Why agent keys are not app keys

A traditional application reads a secret from its environment and uses it deterministically. The code that runs is the code you wrote and reviewed. Nothing in the request path is going to be talked into doing something else by a stranger.

An agent breaks all three assumptions. It decides at runtime what to call. It reads untrusted input (a web page, an issue comment, a file it was told to summarize) and that input can change its behavior. And it has wide latitude over the machine: it can read any env var, print a stack trace, write a file, or shell out to curl. The same GITHUB_TOKEN you exported for one MCP server is simultaneously visible to Cursor's MCP subprocesses, to Claude Code, to every LangChain script, and to every git and curl child process those spawn. One env, many promptable readers.
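To make "one env, many readers" concrete, here is a minimal sketch in Python. DEMO_TOKEN and its value are made-up placeholders, not a real credential. The point is only that a child process reads whatever the parent exported, without anyone handing it over explicitly.

```python
import os
import subprocess
import sys

# Hypothetical placeholder; never put a real token in a script.
os.environ["DEMO_TOKEN"] = "not-a-real-token"


def child_sees(var_name: str) -> str:
    """Spawn a child Python process and return what it reads for var_name."""
    probe = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", probe, var_name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# The child was never passed the value, yet it can read it.
print(child_sees("DEMO_TOKEN"))  # prints: not-a-real-token
```

Swap the child for git, curl, or an MCP server subprocess and the behavior is the same: inheritance is the default, not something you opt into.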

The data backs the unease up. GitGuardian's State of Secrets Sprawl 2026 reports tens of millions of new hardcoded secrets in public GitHub commits in 2025, a sharp year-over-year jump, with AI service secrets growing fastest. By GitGuardian's measure, commits authored by code assistants leak secrets at a meaningfully higher rate than the human baseline. And the finding that should make you put this post down and check a config file: thousands of unique secrets were exposed in MCP configuration files, a portion of them still valid. The tooling we adopted to wire agents up is itself becoming a leak surface. (See the report for the exact figures.)

OWASP put a name to the underlying risk. The Top 10 for LLM Applications 2025 lists LLM06 Excessive Agency, an agent granted more permissions or actions than its task needs, and moved Sensitive Information Disclosure up to #2. OWASP's stated mitigation is blunt: enforce authorization in downstream systems and apply complete mediation, validate every request, rather than trusting the model to limit itself. If you want the full picture of how a benign-looking summary task turns into a credential dump, two earlier posts cover it directly: how prompt injection becomes credential exfiltration and the four threat models for AI agent security in 2026.

The short version: the problem is no longer where the key is stored. It is that a promptable process holds it at all.

The options ladder

There is no single right answer. There is a ladder, and each rung trades convenience for a smaller blast radius. The honest move is to climb only as high as your blast radius demands.

Rung 0: hardcoded in source

The thing everyone does once and regrets. A literal sk-... in a script, a token pasted into mcp.json. It works until the file lands in a commit, and the GitGuardian numbers above are the receipt for how often it does. Leaked credentials also tend to stay valid long after exposure, so a leak does not age out on its own. There is no defensible use of this rung. Move up.

Rung 1: environment variables

The default every framework pushes you toward. LangChain's ChatOpenAI infers the key from OPENAI_API_KEY if you do not pass one, and the canonical onboarding snippet is load_dotenv() then getpass:

python

import os, getpass
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

load_dotenv()
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

llm = ChatOpenAI(model="gpt-4o")  # reads OPENAI_API_KEY from the environment

LlamaIndex is the same story. Its default LLM is OpenAI and it expects OPENAI_API_KEY in the environment:

python

import os
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

os.environ["OPENAI_API_KEY"] = "sk-..."   # placeholder only; prefer exporting it in your shell
Settings.llm = OpenAI(model="gpt-4o")      # picks up OPENAI_API_KEY from env

Cursor's MCP config funnels the same way. You configure servers in .cursor/mcp.json (project) or ~/.cursor/mcp.json (global), and secrets pass through a per-server env object. Cursor supports ${env:NAME} interpolation so you reference a shell variable instead of pasting the literal:

json

{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${env:GITHUB_PERSONAL_ACCESS_TOKEN}"
      }
    }
  }
}

Cursor's docs say plainly: use environment variables for secrets, never hardcode, and prefer restricted, minimal-permission keys. There is also an envFile option, but it is available only for stdio servers, not remote ones, and Cursor loads MCP servers only at startup, so you must fully quit and reopen after editing. One caution: the popular advice that interpolation makes mcp.json "safe to commit" is third-party, not in Cursor's official docs. Treat it as a practice you adopt at your own risk, not a guarantee.
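If you want to check an existing config against that advice, a small audit helps. The sketch below assumes the mcp.json shape shown above (an mcpServers object with a per-server env object) and flags any env value that is not a single ${env:NAME} reference. The function name is my own, and not every flagged value is a secret, so treat the output as a review list rather than a verdict.

```python
import json
import re
from typing import List, Tuple

# Matches a value that consists solely of one ${env:NAME} reference.
ENV_REF = re.compile(r"^\$\{env:[A-Za-z_][A-Za-z0-9_]*\}$")


def literal_env_values(config_text: str) -> List[Tuple[str, str]]:
    """Return (server, variable) pairs whose env value is a literal, not an ${env:NAME} reference."""
    config = json.loads(config_text)
    findings = []
    for server, spec in config.get("mcpServers", {}).items():
        for name, value in (spec.get("env") or {}).items():
            if not (isinstance(value, str) and ENV_REF.match(value)):
                findings.append((server, name))
    return findings
```

Run it over the text of .cursor/mcp.json and ~/.cursor/mcp.json. It reports only server and variable names, never the values, so the output is safe to paste into a ticket.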

Env vars are a real improvement over hardcoding and, for a read-only key on a hobby project behind its own rate limit, genuinely fine. But the limits are well documented, and I will not re-argue them here because an entire post already does. The summary: the value is a plaintext string, it is inherited by every child process, and the autonomous reader is exactly the actor you cannot vet.
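One partial mitigation for the inheritance problem, if you control the code that spawns subprocesses, is to pass children an explicit allowlisted environment instead of the full one. This is a sketch under assumptions: the allowlist is my guess at a reasonable minimum and will need adjusting per platform and per tool, and it does nothing about the agent process reading its own environment.

```python
import os
import subprocess
from typing import Dict, Iterable, List

# Assumption: a minimal set of non-secret variables a child usually needs.
SAFE_VARS = ("PATH", "HOME", "LANG", "TMPDIR", "SYSTEMROOT")


def scrubbed_env(allow: Iterable[str] = SAFE_VARS) -> Dict[str, str]:
    """Copy only allowlisted variables out of the current environment."""
    allowed = set(allow)
    return {key: value for key, value in os.environ.items() if key in allowed}


def run_scrubbed(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run cmd with the scrubbed environment rather than inheriting everything."""
    return subprocess.run(
        cmd, env=scrubbed_env(), capture_output=True, text=True, check=True
    )
```

This narrows what git, curl, and other children can see. It does not help when the agent framework does the spawning for you, which is the usual case.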

Rung 2: secrets managers

Doppler, HashiCorp Vault, and Infisical solve the things env vars cannot: central storage, versioning, rotation, and an audit trail. The good ones avoid writing a .env to disk by injecting secrets straight into a subprocess.

Doppler does this with doppler run, and it supports dynamic secrets: short-lived leased credentials with a configurable TTL, auto-deleted on expiry. The CLI mints a dynamic lease on run:

bash

doppler run -- python agent.py
doppler run --dynamic-ttl 15m -- python agent.py   # set an explicit lease TTL

Infisical, MIT-core and self-hostable on Postgres plus Redis, injects the same way and also offers dynamic secrets for Postgres, MySQL, MongoDB, and others, plus versioning and rotation:

bash

infisical run --env=prod -- python agent.py

(One naming trap: Infisical the secrets manager and Agent Vault the credential broker are both published by the same company but are different products. Do not conflate them.)

HashiCorp Vault is the heavyweight. Its signature is dynamic secrets, generated on demand with a lease ID and a TTL, auto-revoked at expiry:

bash

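# The lease TTL is set on the role, not at read time (db_name and the .sql file are placeholders).
vault write database/roles/agent-role db_name=agent-db creation_statements=@create-role.sql default_ttl=15m max_ttl=1h
vault read database/creds/agent-role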

One sharp edge worth flagging: Vault's default lease TTL, if you do not set one, is 32 days. "Dynamic" does not mean "short" unless you configure it. For unattended renewal, Vault Agent can auto-authenticate and auto-renew, writing fresh secrets to a file or env var.
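Short TTLs only work if something renews the credential before it expires. The sketch below shows the general shape of that loop in Python. It is not how Vault Agent is implemented: fetch is a hypothetical callable that returns a secret and its TTL in seconds, and renewing at 80% of the TTL is my assumption, not a Vault default.

```python
import time
from typing import Callable, Optional, Tuple


class LeaseCache:
    """Cache one leased secret and re-fetch it before the lease runs out."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # hypothetical: returns (secret, ttl_seconds)
        clock: Callable[[], float] = time.monotonic,
        refresh_fraction: float = 0.8,  # assumption: renew at 80% of the TTL
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._refresh_fraction = refresh_fraction
        self._secret: Optional[str] = None
        self._refresh_at = 0.0

    def get(self) -> str:
        """Return the cached secret, fetching a fresh lease once the refresh point has passed."""
        now = self._clock()
        if self._secret is None or now >= self._refresh_at:
            secret, ttl_seconds = self._fetch()
            self._secret = secret
            self._refresh_at = now + ttl_seconds * self._refresh_fraction
        return self._secret
```

With a 15-minute lease this re-fetches roughly every 12 minutes. Note what it does not change: the secret still sits in the process as a plain string.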

Here is the catch that matters for agents, and it is the same for all three. At runtime they still hand the real key into the agent's process as a string. doppler run -- python agent.py injects the secret straight into the subprocess environment, which is precisely the surface a promptable agent can read, log, or be talked into exfiltrating. Secrets managers fixed storage, rotation, and audit. They did not change who holds the key at the moment of use. If you are reaching for AWS Secrets Manager specifically, there is a post on why it was not built for this.

Rung 3: OAuth instead of static keys

A static API key is a bearer secret with no expiry and usually no scope. OAuth replaces it with short-lived, scoped tokens and a refresh mechanism, which directly addresses two OWASP Non-Human Identity risks (long-lived secrets, overprivileged identities). The friction historically was that OAuth assumed a browser and a human, and agents are often headless.

The device code flow solves the headless case, and the mechanics are worth understanding in full. Beyond that, a genuine standards wave is forming around agent authentication. The 2026 agent auth landscape and WorkOS's open agent-registration protocol both cover where Dynamic Client Registration and friends are headed. OAuth is strictly better than a static key when the provider supports it, but on its own it still puts a token (a short-lived one, but a real one) into the agent's hands.
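For orientation, the polling half of the device code flow (RFC 8628) can be sketched in a few lines of Python. The grant type string and the authorization_pending and slow_down error codes come from the RFC. Everything else is an assumption for illustration: post is a hypothetical transport callable that sends a form-encoded POST and returns the decoded JSON body, and the default expires_in is a stand-in for the value the provider actually returns.

```python
import time
from typing import Callable, Dict

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(
    post: Callable[[str, Dict[str, str]], dict],  # hypothetical transport
    token_url: str,
    client_id: str,
    device_code: str,
    interval: int = 5,      # RFC 8628 default when the server sends none
    expires_in: int = 900,  # assumption: use the value from the device response
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the token endpoint until the user approves, denies, or the code expires."""
    waited = 0
    while waited < expires_in:
        sleep(interval)
        waited += interval
        body = post(
            token_url,
            {
                "grant_type": DEVICE_GRANT,
                "device_code": device_code,
                "client_id": client_id,
            },
        )
        error = body.get("error")
        if error is None:
            return body  # contains access_token and, usually, refresh_token
        if error == "authorization_pending":
            continue  # user has not finished approving yet
        if error == "slow_down":
            interval += 5  # the RFC asks clients to back off by 5 seconds
            continue
        raise RuntimeError("device authorization failed: " + error)
    raise TimeoutError("device code expired before the user approved")
```

The first half of the flow, requesting the device code and showing the user a verification URL and code, is one more POST and is omitted here.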

Rung 4: credential brokers and proxy injection

This is the only rung that changes the trust model rather than improving storage. A credential broker puts a local proxy on the outbound HTTP path. The agent's environment holds a placeholder, not a key. The agent makes a normal request, the proxy matches the destination and substitutes the real Authorization header on the way out, and the agent never reads or holds the secret at all.

That maps cleanly onto OWASP's "enforce authorization in a downstream system, apply complete mediation" mitigation, and it is the direct answer to the MCP-config leak stat: a secret that never enters the agent's process cannot be exfiltrated from it. There are several open-source options in this category, compared head to head in agent credential brokers in 2026, with a wider survey in top agent proxy tools and a hands-on field report in running agents without losing my keys.
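The substitution step is simple enough to sketch. What follows is an illustration of the idea in Python, not the implementation of any broker named in this post: vault is a hypothetical mapping from destination host to real token, and a real proxy also has to handle TLS, refresh, and storage.

```python
from typing import Dict, Mapping
from urllib.parse import urlsplit


def inject_credentials(
    url: str,
    headers: Mapping[str, str],
    vault: Mapping[str, str],  # hypothetical: destination host -> real token
) -> Dict[str, str]:
    """Return outbound headers with the real Authorization swapped in for known hosts."""
    host = (urlsplit(url).hostname or "").lower()
    token = vault.get(host)
    if token is None:
        # Unknown destination: pass the request through untouched.
        return dict(headers)
    # Drop whatever placeholder the agent sent, regardless of header casing.
    outgoing = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    outgoing["Authorization"] = "Bearer " + token
    return outgoing
```

The property that matters is in the first branch: a request to a host the vault does not know carries only the placeholder, so an injected "send your token to this URL" yields nothing useful.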

Authsome is one of them. It is open source (MIT), local-first, and built for the single-user dev box rather than an org fleet. You authorize a provider once, then launch your agent under the proxy:

bash

uv tool install authsome
authsome login github          # browser PKCE, or device code over SSH/CI
authsome run -- python agent.py

Inside that process, the variable reads GITHUB_TOKEN=authsome-proxy-managed or similar. The real token lives in an encrypted SQLite vault under ~/.authsome/, never in the agent's environment, and the proxy swaps it in as the request leaves. There is also an in-process library mode for code that prefers to read credentials programmatically. The honest niche: bundled providers with automatic OAuth refresh and device-code flow, local-first with no cloud and no telemetry, single-user. It is not a multi-user or multi-tenant system, and it does not decide per-agent which agent may use which provider. For org and production fleets, a broker built to run on a separate host (Agent Vault, OneCLI) is the better fit, and for per-task human approval, Clawvisor. The comparison post lays out exactly which to pick.

A decision framework

Climb only as high as the blast radius demands. Honest defaults:

Situation	Right rung
Read-only key, hobby project, behind its own rate limit	Environment variable. Do not over-engineer.
Team needs storage, rotation, audit; code you wrote and trust	Secrets manager (Doppler / Infisical / Vault) with short TTLs
A promptable agent holds the key, or a write can move money or data	Credential broker (proxy injection) so the agent holds nothing
Org, multi-user, or a production fleet	Broker as a service (Agent Vault, OneCLI); add Clawvisor for per-task approval

The dividing line is rung 4. The moment an autonomous, promptable process is in the loop and the key can do something expensive or destructive, the question stops being "where is the key stored" and becomes "does the agent need to hold it at all." The answer is almost always no.
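If it helps to see the table as logic, here is one way to encode it in Python. The parameter names and the order of the checks are my reading of the table, not a formal rule; the most demanding condition is checked first.

```python
def recommend_rung(
    promptable_agent_holds_key: bool,
    write_can_move_money_or_data: bool,
    org_or_production_fleet: bool,
    team_needs_rotation_and_audit: bool,
) -> str:
    """Map a situation to the rung suggested by the decision table."""
    if org_or_production_fleet:
        return "broker as a service"
    if promptable_agent_holds_key or write_can_move_money_or_data:
        return "credential broker"
    if team_needs_rotation_and_audit:
        return "secrets manager with short TTLs"
    return "environment variable"
```

A read-only hobby key answers False to everything and stays on an environment variable, which is the point of "do not over-engineer."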

Best-practices checklist

Anchored to the OWASP Non-Human Identities Top 10, the first such list, released in 2025. Its directly relevant entries are Secret Leakage (#2), Long-Lived Secrets (#7), Overprivileged NHI (#5), NHI Reuse (#9), and Human Use of NHI (#10).

Scope every key to the minimum. A GitHub token that only needs to read one repo should not be able to delete others. This is OWASP NHI #5 and Cursor's own advice.
Prefer short-lived over long-lived. Dynamic secrets with a real TTL (Doppler, Vault, Infisical) or OAuth tokens that expire. Remember Vault's 32-day default; set the TTL explicitly.
Do not reuse one identity across agents. A separate credential per agent and per account contains the blast radius and makes the audit log legible. Managing multiple GitHub accounts walks through this for the common case.
Keep the secret out of the promptable process where you can. Proxy injection is the strongest form. If you cannot, at least keep it out of .env files and out of anything that lands in git.
Never commit config that embeds secrets. Interpolate (${env:...}), and audit your mcp.json and dotfiles against the GitGuardian MCP-config finding above.
Rotate, and have a revoke path. Automated rotation plus a one-command revoke. Verify offboarding actually invalidates the old credential.
Audit every read. An append-only log of every credential use is what turns an incident from a guess into a timeline.
Validate downstream, do not trust the model. OWASP LLM06: enforce authorization in the destination system and apply complete mediation. The agent self-limiting is not a control.
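The audit item is the easiest one to start on. A minimal sketch, assuming a JSON-lines file and field names I picked for illustration: each credential use appends one record, and nothing is ever rewritten. Opening a file in append mode is not tamper-proof on its own; real append-only guarantees come from the storage layer or a separate log host.

```python
import json
import time
from typing import Callable, Dict, List


def record_use(
    log_path: str,
    agent: str,
    provider: str,
    action: str,
    clock: Callable[[], float] = time.time,
) -> Dict[str, object]:
    """Append one credential-use record as a JSON line and return it."""
    entry = {"ts": clock(), "agent": agent, "provider": provider, "action": action}
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def read_log(log_path: str) -> List[dict]:
    """Load every record back, oldest first."""
    with open(log_path, encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]
```

The record names the agent, the provider, and the action, never the secret itself.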

Putting it together

A realistic 2026 setup is not one agent, it is several promptable surfaces sharing a shell: Cursor with a handful of MCP servers, Claude Code with its own servers and skills, and a couple of standalone scripts. Each reads keys from the environment by default, so a single exported token is exposed to all of them at once. The fix is not a single tool, it is a discipline: scope keys tight, keep them short-lived, and where the blast radius justifies it, keep the secret out of the agent's process entirely.

For the multi-tool wiring itself, two posts go deep: wiring Claude Code to GitHub, Linear, and Stripe and the broader Claude Code production-ready setup. For the production-credentials angle specifically, see running AI agents in production. And if MCP is new to you, start with the developer primer.

Warning

Before you ship anything, grep your repo and your dotfiles for sk-, ghp_, AKIA, and provider-specific prefixes, and scan your .cursor/mcp.json and ~/.cursor/mcp.json. A large share of agent credential leaks trace back to config files that nobody thought of as code.
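If you would rather script that check than grep by hand, here is a rough sketch in Python. The patterns are approximations I chose for the three prefixes named above (real key formats vary by provider and change over time), so expect both false positives and misses, and use a dedicated secret scanner for anything serious.

```python
import re
from typing import List, Tuple

# Approximate patterns; lengths are assumptions, not provider guarantees.
PATTERNS = {
    "sk- style key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "GitHub ghp_ token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "AWS AKIA key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, label) for each line that looks like it holds a key."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings
```

Feed it the contents of each config file and dotfile you care about. It returns line numbers and labels only, so the findings can be shared without re-leaking the match.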

Tip

The cheapest durable win is scope. A token that can only do the one thing the task needs turns most leaks from a breach into a shrug. Do that first, then climb the ladder.

There is no universal answer to API key management for AI agents, only the right rung for your blast radius. Most teams are one rung too low. Pick the rung your worst-case write deserves, then make it the boring default.

Next steps

Quickstart

Install Authsome and run your first agent under the proxy so it never holds a real key.

Cursor integration

Wire Authsome into Cursor and its MCP servers without pasting tokens into mcp.json.

LangChain integration

Use the in-process library mode so your scripts read credentials without exporting them.

Claude Code integration

Run Claude Code under the broker so MCP servers and skills see only placeholders.
