LiteLLM PyPI Package Compromised by Supply Chain Attack
Malicious versions of LiteLLM on PyPI contained a three-stage credential stealer that harvested SSH keys, cloud tokens, and crypto wallets.
LiteLLM versions 1.82.7 and 1.82.8 on PyPI were compromised on March 24, 2026. The malicious releases contained a three-stage credential stealer that harvests SSH keys, cloud provider tokens, cryptocurrency wallets, and Kubernetes secrets, then exfiltrates them to an attacker-controlled server. LiteLLM is a widely used Python library that provides a unified interface to dozens of LLM providers, with over 40,000 GitHub stars and transitive dependency chains reaching into frameworks like DSPy and CrewAI.
The compromise was discovered by FutureSearch when an MCP plugin running inside Cursor pulled the package as a transitive dependency. The machine ran out of RAM due to an exponential fork bomb, which turned out to be a bug in the malware itself. PyPI has since quarantined the package, blocking all downloads.
The .pth Attack Vector
Version 1.82.8 introduced a file called litellm_init.pth into the package. Python processes every .pth file found in a site-packages directory at interpreter startup, executing any line that begins with an import statement. You do not need to import LiteLLM. You do not need to use it. If the package was installed in your environment, the payload ran on every Python process.
Version 1.82.7 used a different delivery mechanism: base64-encoded malicious code injected into litellm/proxy/proxy_server.py, which executes on import. Both versions carried the same credential-stealing payload.
No corresponding tag or release exists on the LiteLLM GitHub repository. The packages were uploaded directly to PyPI using a stolen publishing token.
Three-Stage Credential Stealer
The payload operates in three stages, according to FutureSearch’s analysis.
Collection. A Python script harvests sensitive files from the host: SSH private keys and configs, .env files, AWS/GCP/Azure credentials, Kubernetes configs and service account tokens, database passwords, .gitconfig, shell history, cryptocurrency wallet files, and anything matching common secret patterns. It also runs commands to dump environment variables and query cloud metadata endpoints (IMDS, container credentials). If you are running this on an EC2 instance or GKE pod, instance credentials are exposed.
Exfiltration. The collected data is encrypted with AES-256-CBC, the AES key is wrapped with a hardcoded 4096-bit RSA public key, and the bundle is packed into a tar archive and POSTed to models.litellm[.]cloud, a domain unrelated to LiteLLM’s actual infrastructure.
Lateral movement and persistence. If a Kubernetes service account token is present, the malware reads all cluster secrets across all namespaces and attempts to create privileged alpine:latest pods on every node in kube-system. Each pod mounts the host filesystem and installs a persistent backdoor at /root/.config/sysmon/sysmon.py with an accompanying systemd service. On local machines, it attempts the same persistence via ~/.config/sysmon/sysmon.py.
TeamPCP Attribution and the CI/CD Chain
The attack follows a pattern established over the previous week. TeamPCP compromised Aqua Security’s Trivy vulnerability scanner on March 19 by force-pushing malicious release tags in the trivy-action GitHub Action. On March 23, they hijacked release tags in Checkmarx’s KICS GitHub Action using the same technique.
LiteLLM’s CI/CD pipeline used the compromised Trivy scanner. The LiteLLM maintainer confirmed on Hacker News that the attack originated from this Trivy integration, which leaked the PYPI_PUBLISH token stored as a GitHub environment variable. The accounts had two-factor authentication enabled, but the token alone was sufficient to publish malicious versions directly to PyPI.
A commit pushed to the LiteLLM maintainer’s forked repository read “teampcp owns BerriAI,” consistent with TeamPCP’s pattern of leaving calling cards. The GitHub issue reporting the compromise was closed as “not planned” while the maintainer’s account appeared to still be compromised.
Downstream Impact
LiteLLM is not just a standalone library. It is a transitive dependency in agent frameworks like DSPy and CrewAI, meaning developers who never directly installed LiteLLM may have pulled in the compromised version through another package. The .pth execution vector makes this particularly dangerous: the malware ran on every Python process, regardless of whether the host application used LiteLLM at all.
The LiteLLM maintainer confirmed that the proxy Docker image was not impacted because it pins dependency versions in requirements.txt. Users running the self-hosted proxy via Docker were not exposed.
Response and Remediation
The impacted versions (1.82.7, 1.82.8) have been deleted from PyPI. All maintainer accounts have had credentials changed, and all publishing tokens for GitHub, Docker, CircleCI, and PyPI have been revoked. The package remains in quarantine, which blocks all downloads of any version.
If you installed or upgraded LiteLLM on or after March 24, check your version with pip show litellm. If you are on 1.82.7 or 1.82.8, assume all credentials accessible from that machine are compromised. Uninstall the package, purge your pip and uv caches, search for ~/.config/sysmon/sysmon.py on any affected host, audit kube-system for pods matching node-setup-*, and rotate every exposed credential: SSH keys, cloud tokens, database passwords, and API keys. For broader context on how supply chain attacks target developer tooling, the Glassworm campaign from earlier this month used a different technique (invisible Unicode payloads) but exploited the same fundamental trust model.