Sysdig's threat research team timed it. From the moment the GitHub Security Advisory for CVE-2026-42208 was indexed, an unknown operator was running targeted SQL injection probes against production LiteLLM instances within 26 hours and 7 minutes. By the 36-hour mark, the campaign was hitting tables by name. Not a generic SQLmap spray. A custom enumeration of litellm_credentials.credential_values and litellm_config: the two tables that hold every upstream provider key the gateway proxies to. CVSS 9.3, pre-authentication, and the blast radius is best described as "cloud account compromise."
If you have not heard of LiteLLM, somebody on your engineering team probably has. It is the open-source LLM gateway with 22,000-plus GitHub stars, used as a unified front door to OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, and a few dozen other model providers. A single LiteLLM proxy in a corner of your infrastructure can hold an OpenAI organization key with five-figure monthly spend caps, an Anthropic console key with workspace admin rights, and an AWS Bedrock IAM credential, all in one row. CVE-2026-42208 turns that consolidation into a single point of total compromise.
The shape of the bug
The vulnerability lives in the proxy's API key verification path. When LiteLLM receives a request with an Authorization: Bearer sk-... header, it looks up that key in its PostgreSQL backend to check whether the caller is authorized. The patched code does this with parameter binding. The vulnerable code, in versions >=1.81.16, <1.83.7, concatenates the user-supplied bearer value directly into the SQL query string. A single quote in the header escapes the literal. Append UNION SELECT and the query returns whatever you want.
Two things make this exceptionally bad. First, the injection runs before authentication is decided, on every LLM API route the proxy exposes. Anything reachable on TCP/4000 is exploitable. There is no rate limiter, no IP allowlist, no captcha standing in the way. Second, the verification happens inside the error-handling path, which means the attacker does not need a valid endpoint or a real model name. POST /chat/completions with a malformed body is enough to reach the vulnerable query.
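To make the shape concrete, here is roughly what a probe would look like. This is a hedged sketch: the injected header mirrors the signature shown in the detection section below, and the request body is a placeholder, since any malformed body reaches the vulnerable query.

# Hedged sketch of the probe shape. Header payload mirrors the detection
# section below; the body is a placeholder - any malformed body works.
curl -s http://TARGET:4000/chat/completions \
  -H "Authorization: Bearer ' UNION SELECT credential_values FROM litellm_credentials --" \
  -H "Content-Type: application/json" \
  -d '{"model": "x"}'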
The maintainers' patch advisory in version 1.83.7-stable spells out the failure mode in plain language: "A database query used during proxy API key checks mixed the caller-supplied key value into the query text instead of passing it as a separate parameter." The fix is the textbook SQL injection remediation — parameter binding. The damage assessment is harder.
Why this is not a normal SQL injection
Application security teams have been triaging SQL injection findings for twenty years. The standard playbook says: figure out what data the vulnerable query touches, count rows, decide on severity. CVE-2026-42208 breaks that playbook because the data the query touches is not your application's data. It is your cloud account.
A working LiteLLM database, in production, typically holds the following (a quick way to check your own instance appears after the list):
- OpenAI organization-scoped API keys with billing caps in the thousands or tens of thousands of dollars per month. Once stolen, these get resold as dark-web LLM-as-a-service access, used to run jailbroken models, or burned batch-generating phishing content at industrial scale until the bill clears the cap.
- Anthropic Console keys with workspace admin permissions. These are not just request-signing tokens — they manage members, billing, and other keys.
- AWS Bedrock IAM credentials. If the IAM role is over-scoped, which it almost always is in early AI deployments, the attacker pivots from the Bedrock invocation permission to whatever else the role can touch — S3 buckets, Lambda functions, KMS keys.
- Azure OpenAI keys, Vertex AI service-account credentials, Cohere keys, Mistral keys, and the long tail of every provider an engineering team has experimented with.
- The LiteLLM proxy's environment-variable configuration, which often contains the database password itself, the master key for the gateway, JWT secrets, and any logging/telemetry credentials.
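As promised above, here is a quick way to see what your own instance holds, so the rotation work in Layer 3 is scoped precisely. The table names come from the advisory; LITELLM_DATABASE_URL is a placeholder for the proxy's own connection string.

# Hedged sketch: enumerate the gateway's tables and count stored credentials.
# LITELLM_DATABASE_URL is a placeholder for the proxy's own DSN.
psql "$LITELLM_DATABASE_URL" -c '\dt litellm_*'
psql "$LITELLM_DATABASE_URL" -c 'SELECT count(*) FROM litellm_credentials;'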
A single successful extraction is closer to "the attacker now has the keys to your cloud-AI footprint" than "the attacker read some user data." Sysdig's framing in their advisory: "The blast radius of a successful database extraction is closer to a cloud-account compromise than a typical web-app SQL injection." That framing is correct. Treat this as if a developer's IAM access keys were leaked to the public internet, because functionally, they were.
The shadow AI gateway problem
Here is the part most security teams will not want to hear. CVE-2026-42208 is the disclosure event. The exposure event is that a team in your org probably stood up a LiteLLM instance six months ago, in a Kubernetes namespace nobody on the security side has cluster credentials for, fronting AI usage for a product feature that did not yet exist when the security baseline was written. It is the same shadow IT pattern that produced vulnerable Jenkins boxes in 2018 and vulnerable Argo CD instances in 2022, except the credentials it manages are far more valuable than build artifacts.
The signs that a LiteLLM proxy exists in your environment, in rough order of detectability:
- A pod, container, or process listening on TCP/4000 anywhere in your infrastructure.
- An ingress, service, or load balancer with litellm in the name.
- An OpenAI bill that is much larger than the keys provisioned through your IAM portal would explain.
- Egress traffic from your network to api.openai.com, api.anthropic.com, or other model provider domains from sources other than your sanctioned applications.
- A PostgreSQL database with tables prefixed litellm_.
Run the inventory first. Apply the patch second. The order matters because most organizations will discover at least one LiteLLM deployment they did not know about, and patching the one you know about does not help if there are two more.
What to do this week
The work is layered. Patch what you find. Rotate every credential the proxy could have touched. Build the controls that should have been there before LiteLLM showed up.
Layer 1: Find every LiteLLM deployment
Start with the network signature. LiteLLM defaults to TCP/4000. Even if your team rebound it, the inventory pass should look for both the default port and the process name across every cluster and VM:
# Kubernetes - across all namespaces
kubectl get pods --all-namespaces -o json | \
jq -r '.items[] | select(.spec.containers[].image | test("litellm")) |
"\(.metadata.namespace)/\(.metadata.name) \(.spec.containers[].image)"'
# Docker hosts
docker ps --format '{{.Names}} {{.Image}}' | grep -i litellm
# Any host - process listening on 4000
ss -tlnp 'sport = :4000' 2>/dev/null
# Network discovery from a security-tooling host
nmap -p 4000 --open -sV 10.0.0.0/8 --script http-title
Anything with a LiteLLM banner, X-LiteLLM-Version header, or a 401 response containing litellm in the body is in scope. Document every instance — namespace, cluster, owner team, the database backing it, the upstream providers it talks to.
Layer 2: Patch and assess exposure window
The fix is in 1.83.7-stable, released April 19. Anything in the affected range (1.81.16 up to, but not including, 1.83.7) is vulnerable. Sysdig observed the first targeted exploitation attempt on April 26 at 16:17 UTC. If your instance was internet-reachable between April 19 and the day you patched, assume exposure and rotate every key the proxy holds.
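Before reaching for Helm, confirm which inventory entries are in the affected range. A hedged sketch, assuming the X-LiteLLM-Version header noted in Layer 1 is present and litellm_hosts.txt is a placeholder for your inventory from that pass:

# Hedged: relies on the X-LiteLLM-Version header noted in Layer 1;
# litellm_hosts.txt is a placeholder for your inventory file.
while read -r host; do
  ver=$(curl -sI "http://$host:4000/" | tr -d '\r' \
        | awk -F': ' 'tolower($1) == "x-litellm-version" {print $2}')
  echo "$host ${ver:-no-version-header}"
done < litellm_hosts.txt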
# Helm
# (quote the version constraint so the shell does not parse >= as a redirect)
helm upgrade litellm litellm/litellm-helm \
  --version ">=0.4.0" \
  --set image.tag=v1.83.7-stable
# Docker
docker pull ghcr.io/berriai/litellm:main-v1.83.7-stable
docker stop litellm && docker rm litellm
docker run -d --name litellm \
-v $(pwd)/litellm_config.yaml:/app/config.yaml \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.83.7-stable \
--config /app/config.yaml
# Mitigation if you cannot patch immediately
# In litellm_config.yaml, under general_settings:
general_settings:
  disable_error_logs: true
The disable_error_logs: true mitigation closes the path through which untrusted input reaches the vulnerable query. It is not a substitute for upgrading. It is a stop-gap for the four hours between "we discovered we have LiteLLM" and "we have the change window to push the patch."
Layer 3: Rotate every credential the proxy could have touched
Assume database extraction occurred. Rotate every upstream provider key, every database password the proxy used, the LiteLLM master key, and any JWT secrets in the environment. Rotation is not optional and it is not a "we will get to it" line item, because the attacker only needs the keys to be valid for the few minutes it takes to schedule a long-running model job that bills against your account.
- OpenAI: revoke and reissue at platform.openai.com/api-keys; review usage in the org dashboard for unexpected spend.
- Anthropic: revoke at console.anthropic.com/settings/keys; review workspace member changes.
- AWS Bedrock: rotate IAM access keys for the identity in the LiteLLM config; pull CloudTrail for InvokeModel and CreateModelInvocationJob events from unexpected sources (a sample query follows this list).
- Azure OpenAI: regenerate keys in the resource; check Azure Activity Log for unexpected resource changes.
- The LiteLLM master key: regenerate, then re-provision every team and user virtual key.
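For the CloudTrail check in the Bedrock item, a minimal pull might look like the following. The event-name filter comes from the list above; EXPOSURE_START is a placeholder for the start of your own exposure window, and any region or profile flags are environment-specific.

# Hedged sketch: list Bedrock invocations since the exposure window opened.
# Repeat with AttributeValue=CreateModelInvocationJob for batch jobs.
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=InvokeModel \
  --start-time "$EXPOSURE_START" \
  --max-results 50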
Layer 4: Make the gateway stop being a single point of failure
The deeper problem is architectural. A single proxy holding every cloud-AI key for a whole organization is a centralization that pays off in operational simplicity and pays out catastrophically in a breach. The pragmatic next steps:
- Put the gateway behind authentication you control. Do not rely on LiteLLM's API key check as the only auth boundary. Run it behind a proper API gateway (Kong, Envoy, even an authenticated nginx) or behind a VPN/zero-trust proxy. CVE-2026-42208 was pre-auth in the LiteLLM auth check; an upstream auth layer that requires a valid token before the request reaches LiteLLM at all would have prevented the entire incident. A minimal network-level sketch follows this list.
- Scope upstream credentials to the minimum. Use OpenAI's project-scoped keys instead of org-scoped keys. Use Anthropic's restricted keys, not workspace-admin keys. Use IAM roles with explicit Bedrock invocation permissions and nothing else, not full AmazonBedrockFullAccess.
- Cap upstream spend at the provider level. Major providers support per-key or per-project spend limits. Set them. The attacker who steals an OpenAI key with a $200 monthly cap can do a lot less harm than the one who steals a key with no cap.
- Log and alert on every key request. LiteLLM has request logging built in. Pipe it to your SIEM. Alert on the patterns that indicate exfiltration: requests to new model providers, sudden volume spikes, requests with unusual user-agent strings.
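For the first item in the list, here is a minimal network-level sketch, assuming a Kubernetes deployment: a NetworkPolicy that only admits traffic from the namespace running your authenticating gateway. Every namespace name and label here is an assumption to adapt to your cluster.

# Hedged sketch: restrict pod-level ingress so only the authenticating
# gateway namespace can reach LiteLLM. All names/labels are assumptions.
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: litellm-ingress-allowlist
  namespace: ai-gateway
spec:
  podSelector:
    matchLabels:
      app: litellm
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: api-gateway
      ports:
        - protocol: TCP
          port: 4000
EOF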
Detection: what to hunt for now
If your LiteLLM instance was internet-exposed, look for the SQL injection signature in your reverse proxy logs. The pattern is a malformed Authorization header containing SQL syntax:
# Sample exploit traffic against /chat/completions
# Authorization header carries the injection:
# Bearer ' UNION SELECT credential_values FROM litellm_credentials --
# Splunk SPL - look for SQL syntax in Authorization headers
index=web sourcetype=nginx OR sourcetype=envoy
| where match(Authorization, "(?i)(union|select|--|';|/\*)")
| stats count by client_ip, uri_path, Authorization
# Common exploit signatures in the URL or headers
# 'OR'1'='1
# UNION SELECT
# litellm_credentials
# litellm_config
# nginx/envoy access log grep one-liner
grep -E "litellm_(credentials|config)|UNION.*SELECT" /var/log/nginx/access.log
Any matches mean the proxy was probed. Whether the probe succeeded depends on the version at the time of the request and whether disable_error_logs was set. Assume the worst until you have reviewed database query logs.
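One more place to look while reviewing those database logs, assuming statement logging reaches disk: PostgreSQL logs failed statements at its defaults (log_min_error_statement is ERROR), so even unsuccessful probes often leave traces. Successful injected queries only appear if log_statement was set to capture them, and the path below varies by distribution.

# Hedged: log path varies by distribution; successful injections need
# log_statement enabled, but failed probes are logged at Postgres defaults.
grep -iE "union[[:space:]]+select|litellm_(credentials|config)" /var/log/postgresql/postgresql-*.log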
The bigger pattern
CVE-2026-42208 is not the last AI infrastructure vulnerability you will see. The pattern is generalizable. Open-source projects in the AI tooling space — LLM gateways, vector databases, agent frameworks, fine-tuning servers — are growing user counts faster than they are growing security review. The same gap that produced vulnerable Jenkins, Argo CD, and Kubernetes dashboards is producing vulnerable LiteLLM, vulnerable Triton Inference Servers (CVE-2025-23334 reached RCE earlier this year), vulnerable LMDeploy (CVE-2026-33626 disclosed two weeks ago), and a long pipeline of similar findings.
The defensive posture that scales is not "patch each one as it lands." That treadmill never ends and your engineering team will lose. The posture that scales is: treat AI tooling as the new Jenkins. Inventory it. Put it on a private network. Put authentication in front of it. Scope the credentials it manages. Monitor what it does. The CVE pipeline will keep producing findings. Your job is to make sure each finding is, at most, an inconvenience instead of a cloud account takeover.
Inventory the AI infrastructure your team already deployed.
Red Hound's recon and configuration review uses open-source tooling — nuclei templates for AI gateway fingerprinting, custom Sigma rules for SQL injection in Authorization headers, and a credential scoping checklist tied to your cloud provider. The tools and playbooks are public. We just run them faster than your team can find the time to. Book a 30-minute walkthrough on how to harden your AI gateway footprint without slowing down the engineering teams using it.