Compliance for AI agents: SOC 2, audit trails, and the credential question

Your SOC 2 auditor sends the standard request list: CC6.1, CC6.2, CC7.2. Provide evidence of logical access controls over protected assets. Provide evidence that new internal and external users are authorized before credentials are issued. Provide evidence that system components are monitored for anomalies that could be security events.

You have a Claude Code agent that opens GitHub PRs, a Cursor instance that hits Stripe in test mode, and a nightly LangChain job that touches a production Postgres. You have a customer security questionnaire on your desk asking, in plain words, "who can your AI do what as?"

The honest answer right now is closer to "the bag of secrets the runtime had access to" than anything an auditor wants to hear. This post is a reference, not a pitch. It walks the controls that actually apply to agents, the logs your existing tools do and do not produce, and the one specific gap between the secrets-manager record and the agent task that nobody ships for you out of the box.

What the frameworks actually say

SOC 2 is still 2017 vintage, with 2022 Points of Focus

The AICPA's Trust Services Criteria have not been revised since 2017. In Fall 2022 the AICPA published Revised Points of Focus that update interpretive guidance, but the criteria themselves are unchanged. If you read a 2026 blog claiming there is a new AICPA "AI-agent TSC update," treat it as marketing until you can point at the AICPA document.

The criteria that bear on agent credentials are in the Common Criteria block.

CC6.1: logical access controls over protected assets, with explicit coverage of non-human service accounts as well as user accounts.
CC6.2: register and authorize new internal and external users prior to issuing credentials.
CC6.3: appropriate access based on role, with documented approval.
CC7.2: monitoring of system components for anomalies that may indicate security events.

CC6.1 already names service accounts. Your agent is a service account that happens to make non-deterministic choices about which scopes to use. The control text fits. The evidence pattern most teams have, "here is who logged into our prod console," does not.

ISO 27001:2022 plus Amendment 1 (2024)

The widely-reported 2024 change to ISO/IEC 27001 is small. Amendment 1, published February 2024, added climate-change considerations to Clauses 4.1 and 4.2. It did not add Annex A controls. If a vendor pitches you "ISO 27001 2024 added agent identity controls," they are wrong.

The load-bearing Annex A controls for non-human identity are the ones the 2022 revision already introduced.

A.5.16 Identity management: full lifecycle of identities, explicitly including non-human ones.
A.8.5 Secure authentication: how those identities authenticate.
A.8.15 Logging: what gets recorded, and for how long.

An auditor mapping your agent estate against ISO 27001 will go through A.5.16 first. "Show me the joiner-mover-leaver process for the OpenAI key your agent uses" is a fair question, and most teams cannot answer it.

EU AI Act Article 12, fully applicable 2026-08-02

If your agent does anything classified as high-risk under the AI Act, Article 12 requires "automatic recording of events (logs) over the lifetime of the system" sufficient to identify risk-relevant situations, support post-market monitoring, and let deployers monitor the system's operation. The mechanism must be technical and built into the system. Periodic manual review does not satisfy it.

Article 12 is silent on what fields the log must contain. It is loud on the property that the log must be durable and automatic. Most agent runtimes today emit per-call traces to stdout or to a vendor observability stack with short retention windows. Neither survives an Article 12 evidence request without changes.

The non-human identity frameworks auditors are starting to cite

Two reference documents have moved from "thought leadership" to "auditors actually mention them" over the last twelve months.

The OWASP Non-Human Identities Top 10 (2025) is the closest thing to a shared vocabulary. The items that map directly to AI agents are:

NHI1 Improper Offboarding: the agent goes away, the token does not.
NHI5 Overprivileged NHI: the token has scopes the agent never needed.
NHI7 Long-Lived Secrets: the static API key that has not rotated since the demo.
NHI9 NHI Reuse: one OPENAI_API_KEY shared across four agents and a cron job.
NHI10 Human Use of NHI: a human pastes the service token into their shell to debug, and now the audit trail is unrecoverable.
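
Two of the items above, NHI7 and NHI9, can be checked mechanically once you keep an inventory of which agent uses which credential. The following is a minimal sketch under stated assumptions: the `NhiRecord` shape, the agent and credential names, and the 90-day threshold are all hypothetical, not taken from OWASP or from any tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NhiRecord:
    agent: str          # hypothetical agent name
    credential_id: str  # an identifier for the key, never the key itself
    issued_on: date


def find_reused(records):
    """NHI9: return {credential_id: agents} for credentials shared by several agents."""
    agents_by_credential = defaultdict(set)
    for record in records:
        agents_by_credential[record.credential_id].add(record.agent)
    return {
        cred: sorted(agents)
        for cred, agents in agents_by_credential.items()
        if len(agents) > 1
    }


def find_long_lived(records, today, max_age_days=90):
    """NHI7: return credential ids older than max_age_days (an assumed policy value)."""
    return sorted(
        {r.credential_id for r in records if (today - r.issued_on).days > max_age_days}
    )


# Illustrative inventory; every name and date here is made up.
inventory = [
    NhiRecord("pr-agent", "openai-key-1", date(2025, 1, 10)),
    NhiRecord("nightly-job", "openai-key-1", date(2025, 1, 10)),
    NhiRecord("stripe-agent", "stripe-test-key", date(2025, 6, 1)),
]

reused = find_reused(inventory)
stale = find_long_lived(inventory, today=date(2025, 6, 30))
print(reused)
print(stale)
```

The inventory itself is the hard part. The checks only become useful once every agent credential has an owner and an issue date recorded somewhere.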

The Cloud Security Alliance's 2025 work backs up the operational picture. CSA's research on non-human identities in agentic AI describes a widespread shortfall in formal offboarding and revocation processes for API keys, and reports a non-human-to-human identity ratio that runs many times higher than one-to-one in modern enterprises. CSA also launched an NHI Management Fundamentals Certification specifically because of agentic AI.

On the standards side, the OpenID Foundation's "Identity Management for Agentic AI" work and the OIDC for Agents (OIDC-A) 1.0 draft propose claims for representing an agent, its attestation, and its delegation chain. Treat both as direction-of-travel. They are drafts, not ratified standards, and no auditor will hand you a finding because you do not implement them yet.

What the questions actually look like

In practice, an audit or a serious customer questionnaire turns into four questions about each agent. The frameworks above are the source. The questions are the output.

Identity. What identity did the agent use to call this third party, and is that identity registered and authorized (CC6.2, A.5.16)?
Least privilege. What scopes does that identity have, and were they the minimum needed (CC6.3, OWASP NHI5)?
Attribution. For a specific event on a specific date, who or what authorized the call, and which agent task ran it (CC7.2, EU AI Act Article 12)?
Revocation and rotation. When this agent is decommissioned or compromised, how do you kill its access without taking down anything else (CC6.7, OWASP NHI1, NHI7)?

Hold these four in your head while we walk through the tools.

What your existing tools actually log

The good news is that every major secrets manager and most SaaS providers do produce real audit data. The bad news is that the data is workload-scoped, not agent-scoped.

AWS Secrets Manager via CloudTrail

CloudTrail records every Secrets Manager API call with userIdentity, sourceIPAddress, eventTime, and the secret ARN. The catch is enabling capture in the first place: trails configured to exclude "Read" management events will silently drop GetSecretValue.

Turn on read events on the trail:

bash

aws cloudtrail put-event-selectors \
  --trail-name org-trail \
  --event-selectors '[{
    "ReadWriteType": "All",
    "IncludeManagementEvents": true,
    "DataResources": []
  }]'

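List the most recent GetSecretValue events. lookup-events only covers the last 90 days of management events, and this call returns a single page of at most 50 results, so paginate or query the trail's stored logs if you need a complete list: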

bash

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=GetSecretValue \
  --max-results 50

EventBridge pattern for live alerting:

json

{
  "source": ["aws.secretsmanager"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": { "eventName": ["GetSecretValue"] }
}

What you get is "the IAM role agent-runtime-prod read stripe/api_key at 14:02:11 from 10.0.4.12." What you do not get is which agent task on which user's behalf made that read.
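
To make that concrete, here is a hedged sketch of pulling the attribution fields out of one CloudTrail record. The sample event is hand-written for illustration (the account ID, session name, and timestamp are invented), and `summarize` is a hypothetical helper. In `lookup-events` output, the record arrives as a JSON string under `CloudTrailEvent`, so you would parse that first.

```python
import json

# Hand-written sample in the shape of a CloudTrail GetSecretValue record.
SAMPLE_EVENT = json.loads("""
{
  "eventTime": "2025-03-04T14:02:11Z",
  "eventSource": "secretsmanager.amazonaws.com",
  "eventName": "GetSecretValue",
  "sourceIPAddress": "10.0.4.12",
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::111122223333:assumed-role/agent-runtime-prod/session-1"
  },
  "requestParameters": {"secretId": "stripe/api_key"}
}
""")


def summarize(event):
    """Reduce a CloudTrail record to the who/what/when an auditor asks for."""
    identity = event.get("userIdentity") or {}
    params = event.get("requestParameters") or {}
    return {
        "when": event.get("eventTime"),
        "who": identity.get("arn"),
        "from": event.get("sourceIPAddress"),
        "secret": params.get("secretId"),
        # CloudTrail has no notion of an agent task, so this stays empty
        # unless you join against a log you produce yourself.
        "task_id": None,
    }


summary = summarize(SAMPLE_EVENT)
print(json.dumps(summary, indent=2))
```

Everything in the summary comes from the workload's IAM session. The empty `task_id` is the gap the rest of this post is about.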

HashiCorp Vault

Vault audit devices record entity_id, display_name, accessor (HMAC-hashed), policies, client_token (HMAC-hashed), and the request and response for each operation, including its path. The most useful field for cross-token attribution is entity_id, the UUID that links a token to a Vault entity and survives token rotation.

Enable a file audit device:

bash

vault audit enable file file_path=/var/log/vault/audit.log

A single entry follows this schema:

json

{
  "type": "request",
  "auth": {
    "client_token": "hmac-sha256:...",
    "accessor": "hmac-sha256:...",
    "display_name": "approle-ci-agent",
    "policies": ["ci-read"],
    "entity_id": "8f4a-...-b21c",
    "token_type": "service"
  },
  "request": {
    "operation": "read",
    "path": "secret/data/stripe/prod",
    "remote_address": "10.0.4.12"
  }
}

Reconstruct "who read this secret":

bash

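# Each operation is logged twice (request and response entries); keep requests only.
jq 'select(.type == "request" and .request.path == "secret/data/stripe/prod") |
    {time, entity_id: .auth.entity_id, name: .auth.display_name, addr: .request.remote_address}' \
  /var/log/vault/audit.log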

Same story as CloudTrail. The entity is the workload. The agent task is invisible.
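
A short sketch of why entity_id is the field to group on: two reads made with different tokens still collapse onto the same entity. The log lines below are synthetic and trimmed to the fields used, and `entity-ci-agent` is a placeholder rather than a real Vault UUID.

```python
import json
from collections import Counter


def reads_by_entity(lines, path):
    """Count read requests against `path`, grouped by Vault entity_id."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Response entries repeat the request block, so skip them to avoid double counting.
        if entry.get("type") != "request":
            continue
        request = entry.get("request") or {}
        if request.get("operation") == "read" and request.get("path") == path:
            counts[(entry.get("auth") or {}).get("entity_id")] += 1
    return counts


def sample_line(kind, token_hmac):
    """Build a synthetic audit line; real entries carry many more fields."""
    return json.dumps({
        "type": kind,
        "auth": {"client_token": token_hmac, "entity_id": "entity-ci-agent"},
        "request": {"operation": "read", "path": "secret/data/stripe/prod"},
    })


LINES = [
    sample_line("request", "hmac-sha256:aaa"),
    sample_line("response", "hmac-sha256:aaa"),
    sample_line("request", "hmac-sha256:bbb"),  # same entity after a token rotation
]

counts = reads_by_entity(LINES, "secret/data/stripe/prod")
print(dict(counts))
```

The count is per entity, which is still per workload. Nothing in the entry distinguishes one agent task from another.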

GitHub

GitHub does not store the raw token in audit logs. To attribute events to a known PAT you compute the SHA-256 base64 hash of the token and search hashed_token:"<hash>". Audit log REST API access requires Enterprise Cloud; Free and Team plans cannot query it programmatically.

Hash the token:

bash

echo -n "$GITHUB_TOKEN" | openssl dgst -sha256 -binary | base64

Query the audit log:

bash

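# gh URL-encodes -f values on GET requests, so pass the raw base64 hash
# (with any "+" and the trailing "="), not a pre-encoded one.
gh api -X GET "/enterprises/ACME/audit-log" \
  -f phrase='hashed_token:"EH4L8o6PfCqipALbL+QT62lyqUtnI7ql0SPbkaQnjv8="'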

You can link "PR #482 was opened by token hash EH4L8o" to "that token was last issued to the ci-agent service principal." You cannot, from GitHub alone, link that to a specific Claude Code task or a specific user's prompt.
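
If the hashing happens inside a script rather than a shell, the same computation is a few lines of standard-library Python. This is a sketch: the token value is a dummy string, and the function names are hypothetical. The URL-encoded form is only needed when you build the query string yourself.

```python
import base64
import hashlib
from urllib.parse import quote


def hashed_token(token: str) -> str:
    """Base64 of the raw SHA-256 digest, matching the openssl pipeline."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def audit_phrase(token: str) -> str:
    """Search phrase for the audit log, not yet URL-encoded."""
    return f'hashed_token:"{hashed_token(token)}"'


def url_encoded_phrase(token: str) -> str:
    """Percent-encode the phrase for clients that do not encode query parameters."""
    return quote(audit_phrase(token), safe="")


dummy = "ghp_example_not_a_real_token"  # placeholder, not a real credential
token_hash = hashed_token(dummy)
print(audit_phrase(dummy))
print(url_encoded_phrase(dummy))
```

A SHA-256 digest is 32 bytes, so the base64 form is always 44 characters and ends in `=`. A hash of any other length means something was truncated or encoded twice.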

OAuth revocation, RFC 7009

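When a token is compromised or an agent retired, RFC 7009 is the spec to actually use. Servers MUST support refresh-token revocation and SHOULD support access-token revocation. When a refresh token is revoked, a server that supports access-token revocation SHOULD also invalidate the access tokens issued from the same authorization grant, so check each provider's behavior rather than assuming the cascade.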

bash

curl -X POST https://issuer.example.com/oauth2/revoke \
  -u "$CLIENT_ID:$CLIENT_SECRET" \
  -d "token=$REFRESH_TOKEN" \
  -d "token_type_hint=refresh_token"

The endpoint returns HTTP 200 for invalid tokens by design, so do not infer success from the status code. Confirm by attempting a protected call.
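
For a runbook script, the request and the follow-up check can be sketched as below. This builds the request without sending it, so it runs offline. The function names and `looks_revoked` are hypothetical, and the expected follow-up responses (400 with `invalid_grant` for a refresh attempt per RFC 6749, 401 for a protected call per RFC 6750) are what the specs describe, not what every provider returns.

```python
import base64
import urllib.parse
import urllib.request


def build_revocation_request(endpoint, client_id, client_secret, token,
                             hint="refresh_token"):
    """Build (but do not send) an RFC 7009 revocation request."""
    body = urllib.parse.urlencode(
        {"token": token, "token_type_hint": hint}
    ).encode("ascii")
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8"))
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": "Basic " + basic.decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def looks_revoked(kind, status, error=None):
    """Interpret a follow-up attempt made with the revoked token.

    kind is "refresh" (a refresh grant was attempted) or "protected_call".
    """
    if kind == "refresh":
        return status == 400 and error == "invalid_grant"
    if kind == "protected_call":
        return status == 401
    raise ValueError(f"unknown follow-up kind: {kind}")


req = build_revocation_request(
    "https://issuer.example.com/oauth2/revoke", "client", "secret", "rt-123"
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Send it with `urllib.request.urlopen(req)` once the endpoint and credentials are real, then record the follow-up result in the runbook as the evidence of revocation.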

The control most teams already pass

GitHub Actions OIDC is the model auditors point to when they want an example of how NHI should look. Each workflow job exchanges a GitHub-issued OIDC token for a cloud-issued access token valid only for that job's duration. No long-lived AWS access key on the runner, no rotation cron, no NHI7.

yaml

on: workflow_dispatch  # assumed trigger so the snippet is a complete workflow; use your own
permissions:
  id-token: write
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::111122223333:role/github-deploy
          aws-region: us-east-1
      - run: aws s3 ls
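
The other half of this control lives on the AWS side: the role's trust policy decides which workflow runs may assume it. The sketch below generates such a policy as JSON. The repository name `acme/deploy-repo` and the branch restriction are assumptions for illustration, and the account ID is the placeholder used above. Tighten or loosen the `sub` condition to match your own setup.

```python
import json

PROVIDER = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id, repo, ref="refs/heads/main"):
    """Trust policy letting one repo and ref assume the role via GitHub OIDC."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{PROVIDER}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Audience set by the configure-aws-credentials action.
                    f"{PROVIDER}:aud": "sts.amazonaws.com",
                    # Restrict to a single repository and ref (assumed values).
                    f"{PROVIDER}:sub": f"repo:{repo}:ref:{ref}",
                }
            },
        }],
    }


policy = github_oidc_trust_policy("111122223333", "acme/deploy-repo")
print(json.dumps(policy, indent=2))
```

The `sub` condition is what turns "a GitHub workflow" into "this repository's main branch", which is the identity statement an auditor can actually read.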

The interesting question is not "how do we replicate this for cloud-to-cloud automation," because Actions already did. The interesting question is what the equivalent looks like for a local Claude Code invocation that needs to talk to Stripe at 14:02:11 on a Tuesday.

Why this matters right now: real CVEs against agent runtimes

The credential-exfiltration class is not theoretical. It is shipped, patched, and disclosed.

CVE-2025-6514 (mcp-remote): arbitrary OS command execution when an MCP client connects to an untrusted server.
CVE-2025-49596: unauthenticated MCP Inspector instances accept arbitrary commands.
CVE-2025-53110 and CVE-2025-53109: directory containment and symlink bypass in Anthropic's Filesystem MCP Server.
CVE-2025-59536 and a related advisory: Claude Code project files (hooks, MCP config, env vars) allow shell command execution and API key exfiltration when a user opens an untrusted repo.

OWASP's MCP Top 10 puts "Token Mismanagement and Secret Exposure" at MCP01:2025 for a reason. Every one of these CVEs is, downstream, the same question your auditor is asking: when this happens, can you tell which credentials moved, and can you kill them without taking the rest of the agent fleet down with them?

If the answer involves rotating a shared OPENAI_API_KEY across four agents and a cron job, you already failed NHI9.

The specific gap nothing in the box closes

Walk back through the four audit questions with the toolchain in mind.

Identity. CloudTrail, Vault, and GitHub all give you a workload identity. None of them give you an agent identity. If agent-runtime-prod ran 9,402 GetSecretValue calls last month, none of those events tell you which Claude Code task or which user's prompt drove them.
Least privilege. You can scope the workload identity. You cannot, in any major secrets manager today, scope per-agent-task. Two agents on the same machine read the same vault.
Attribution. The downstream system records "service-account-stripe-rotator-prod did X." The agent runtime, at best, records the LLM trace. The middle hop, "this exact agent invocation got this exact credential for this exact tool call," is the gap. CSA's NHI governance work and OWASP NHI10 ("Human Use of NHI") both name it. Neither solves it.
Revocation. Killing the workload kills every agent on that workload. Rotating the shared key takes down four agents and a cron job. SOC 2 CC6.7 and OWASP NHI1 both point at revocation that is finer-grained than the shared workload key, and that requires something in the stack to know which agent invocation a given credential read belongs to.

This is the part that needs a piece of code none of the major secrets managers ship. The credential broker pattern (see Secrets managers vs credential brokers for AI agents and Credential brokers in 2026 for the wider landscape) puts something between the agent process and the outbound HTTPS call. The agent's environment holds a placeholder, not a real key. The broker matches the destination at request time, swaps in the real Authorization header, and writes an identity-bound line tying the call to the principal that triggered it.

This is the only point in the stack where the principal that asked for the credential and the credential itself are simultaneously knowable. Once the request is on the wire, the secret is gone from your domain and the downstream log will only ever know about the workload identity.
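
The pattern can be sketched in a few lines. This is an illustration of the idea only, not the implementation of any particular broker: the placeholder string, the `CREDENTIALS` table, the field names in the audit line, and the `task_id` and `principal` arguments are all hypothetical.

```python
import json
from datetime import datetime, timezone

PLACEHOLDER = "broker-managed-placeholder"  # what the agent's env holds (assumed value)

# Hypothetical table: destination host -> (connection name, real secret).
CREDENTIALS = {"api.stripe.com": ("stripe-test", "sk_test_example")}


def broker_request(host, headers, task_id, principal, now=None):
    """Swap the placeholder for the real credential and emit an audit line.

    Returns (outbound_headers, audit_jsonl_line). The secret never appears
    in the audit line.
    """
    now = now or datetime.now(timezone.utc)
    if host not in CREDENTIALS:
        raise PermissionError(f"no credential registered for {host}")
    if headers.get("Authorization") != f"Bearer {PLACEHOLDER}":
        raise ValueError("request does not carry the broker placeholder")
    connection, secret = CREDENTIALS[host]
    outbound = dict(headers)
    outbound["Authorization"] = f"Bearer {secret}"
    audit = {
        "ts": now.isoformat(),
        "event": "credential_read",
        "host": host,
        "connection": connection,
        "principal": principal,
        "task_id": task_id,
    }
    return outbound, json.dumps(audit, sort_keys=True)


outbound, line = broker_request(
    "api.stripe.com",
    {"Authorization": f"Bearer {PLACEHOLDER}"},
    task_id="task-42",
    principal="alice@example.com",
    now=datetime(2025, 3, 4, 14, 2, 11, tzinfo=timezone.utc),
)
print(line)
```

The design choice that matters is that the swap and the audit write happen in the same function call. Separate them and the record of who asked and the record of what was issued can drift apart again.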

A small, honest disclosure on tooling. Authsome is an open-source local-first broker that does exactly this part. The agent's env holds OPENAI_API_KEY=authsome-proxy-managed, the local proxy swaps in the real key as the request leaves, and every credential read, refresh, login, and revoke writes an append-only JSONL line under ~/.authsome/. Multi-account separation is per-connection (authsome login github --connection work vs --connection personal), so a "work" credential can be cut without touching "personal". Providers that are not bundled, including Stripe and Anthropic, are added as a small JSON file under ~/.authsome/providers/ rather than a code change. What Authsome does not do, and you should not claim in your SOC 2 evidence: replace CloudTrail or Vault, satisfy EU AI Act Article 12 on its own (you still need durable, long-retained, tamper-evident storage), enforce per-agent policy (a global allow/deny proxy mode exists; per-agent rules do not ship today), or stop you from issuing an overprivileged key in the first place. It closes one specific gap, the broker-side log of which credential was handed out under which connection, at the moment of use.

There are other brokers and gateways in this space. Agent credential brokers in 2026 walks the field. The point of this post is the gap, not the vendor.

A practical checklist you can hand an auditor

Use the four questions as the spine. The rows below are the evidence patterns that actually map.

Audit question	What to show today	Common gap
Who can the agent act as?	List of registered NHIs per agent (service account, OAuth client, PAT) with scopes and owner. Aligns with CC6.2 and A.5.16.	Agents share a single key. NHI9.
Were the scopes the minimum needed?	Scope diff between issued and used. For OAuth, the granted scope set; for PATs, the resource scope. Aligns with CC6.3 and OWASP NHI5.	Token issued with `repo` when `contents:read` was enough.
Who did what, when?	Workload-level: CloudTrail / Vault / GitHub audit. Broker-level: per-call log of credential reads tied to a connection or task identifier you supply. Aligns with CC7.2 and EU AI Act Article 12.	Workload log only. Agent task invisible.
How do you revoke?	Documented runbook with the RFC 7009 endpoint per provider, the secrets manager rotation path, and the broker-level revoke for separately-issued connections. Aligns with CC6.7 and NHI1.	"Rotate the shared key" cascades to everything that uses it.
What is your log retention?	Centralized log store with a documented retention policy that meets your highest-risk obligation for AI system paths. Aligns with EU AI Act Article 12.	Agent traces only live in vendor dashboards with short default retention.
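
The "scope diff between issued and used" row is a set difference once both sides are lists. A minimal sketch, assuming you can export granted scopes from the provider and derive used scopes from your own call logs. The scope strings below are illustrative and not tied to a specific provider's naming.

```python
def scope_diff(granted, used):
    """Compare granted scopes with the scopes actually exercised."""
    granted, used = set(granted), set(used)
    return {
        "unused": sorted(granted - used),      # candidates to remove (NHI5)
        "ungranted": sorted(used - granted),   # calls that should have failed
    }


# Illustrative values only.
granted = ["contents:read", "contents:write", "pull_requests:write"]
used = ["contents:read", "pull_requests:write"]

diff = scope_diff(granted, used)
print(diff)
```

A non-empty `ungranted` list usually means the usage data is wrong or a second credential is in play, which is worth knowing before the auditor finds it.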

Warning

The retention row is the one most teams underestimate. If your agent touches anything that lands in scope of the EU AI Act, the short default retention in most LLM observability vendors is not enough. Ship the JSONL or the structured trace to durable storage on day one. Retrofitting after a disclosure is painful.
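
Durable storage answers the retention half. For the tamper-evidence half, one common technique is to chain each log entry to the previous one by hash. The sketch below is an assumption-laden illustration of that technique, not a requirement from the Act and not how any specific product stores its log.

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed starting value for the first entry's prev hash


def _digest(prev_hash, record):
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain, record):
    """Append a record whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "prev": prev_hash,
        "record": record,
        "hash": _digest(prev_hash, record),
    })


def verify(chain):
    """Return True if no entry was altered, reordered, or removed mid-chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append_entry(chain, {"task_id": "task-42", "event": "credential_read"})
append_entry(chain, {"task_id": "task-43", "event": "credential_read"})
print(verify(chain))
```

A chain like this only detects edits if the latest hash is also recorded somewhere the log writer cannot rewrite, such as a separate account or a write-once bucket.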

Two patterns that move you most of the way there today

Most teams do not need a new identity standard to get to a defensible position. They need two changes.

Stop putting long-lived keys in agent environments. There are three options: rotate the keys aggressively, use short-lived OIDC where the platform supports it (Actions, EKS Pod Identity, Workload Identity Federation), or move to per-call injection through a broker so the agent never holds the secret. The first option satisfies NHI7 but not NHI10. The second satisfies both but only works where OIDC is available. The third gives you a broker-side record of credential use that you can join to the rest of your evidence. There is a related write-up at Stop putting API keys in environment variables.

Build the join key now, not at audit time. Every agent invocation should have a stable task_id your own orchestration assigns. Every outbound credential use should be recordable against that task_id alongside the provider, the scope, and the outcome. CloudTrail, Vault, and GitHub keep the vendor side. The broker keeps the credential-issuance side. The task_id you assign in your own runner is the glue. Get that line on disk now, ship it to your log store with your existing pipeline, and you have something to show when CC7.2 comes up.
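
A sketch of what the join can look like at evidence time. The vendor-side event has no task_id, so the match is on the secret name plus a timestamp window. The record shapes (`ts`, `when`, `secret`, `task_id`) and the five-second window are assumptions for illustration.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def join_reads(broker_lines, workload_events, window_seconds=5):
    """Attach candidate task_ids to each workload-level event.

    An empty list means the read is unattributed; more than one means the
    window is too wide or two tasks read the same secret at once.
    """
    joined = []
    for event in workload_events:
        event_time = parse_ts(event["when"])
        task_ids = [
            line["task_id"]
            for line in broker_lines
            if line["secret"] == event["secret"]
            and abs((parse_ts(line["ts"]) - event_time).total_seconds()) <= window_seconds
        ]
        joined.append({**event, "task_ids": task_ids})
    return joined


# Synthetic records on both sides.
broker_lines = [
    {"ts": "2025-03-04T14:02:10Z", "secret": "stripe/api_key", "task_id": "task-42"},
    {"ts": "2025-03-04T14:10:00Z", "secret": "stripe/api_key", "task_id": "task-43"},
]
workload_events = [
    {"when": "2025-03-04T14:02:11Z", "secret": "stripe/api_key", "who": "agent-runtime-prod"},
    {"when": "2025-03-04T15:00:00Z", "secret": "stripe/api_key", "who": "agent-runtime-prod"},
]

joined = join_reads(broker_lines, workload_events)
for row in joined:
    print(row["when"], row["task_ids"])
```

Unattributed reads are the interesting output. Each one is either a process bypassing the broker or a human using the workload credential directly.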

Note

SOC 2 does not prescribe a log format. It prescribes the property: you can answer "who did what" for events in scope. A JSONL line per credential read, joined to your existing CloudTrail and SaaS audit logs by task_id and timestamp, is a reasonable evidence pattern. Document it in your control narrative and an auditor will engage with it.

Where this is heading

OIDC-A 1.0 and the OpenID Foundation's agent identity work are the most concrete attempts to make agent identity a first-class concept in protocols you already deploy. Related work in the MCP and decentralized-identity communities is exploring how DIDs and verifiable credentials might extend into agent contexts. None of this is a standard you are graded against in 2026. All of it is worth tracking because the moment one of them is ratified, "agent identity" stops being a thing you build in your broker logs and becomes a thing auditors expect to see attested.

Until then, the honest position is that compliance frameworks today are silent on the specific shape of agent identity, loud on the requirement that credential use be attributable, and increasingly aware (via OWASP NHI and CSA) that the existing NHI tooling does not finish the job for autonomous workloads. The credential broker layer is the cheap, today-shippable way to close the attribution gap.

If you came in with an audit on the calendar, the order of operations is: enumerate the NHIs each agent uses, scope them down, get a per-call audit line with a task identifier on disk, ship the log durably for as long as your obligation requires, and document the runbook for revocation. Do those five things and you have a defensible answer to the four questions an auditor is actually going to ask.

Next steps

Quickstart

Install Authsome and run an agent so credentials inject at the local proxy and never sit in the agent's environment.

Secrets managers vs credential brokers

Why CloudTrail and Vault answer half the audit question, and what the other half looks like.

Running AI agents in production: credentials

The operational playbook for keeping agent credentials scoped, rotated, and attributable.

AWS Secrets Manager isn't built for AI agents

A deeper look at where the workload-identity model breaks down once agents enter the picture.
