How attackers are jailbreaking LLMs with CTF framing and how to catch them
Essential information
- Published
- 15/06/2026 21:33
- Modified
- 16/06/2026 11:48
- Source / Author
- AlienVault
- Confidence
- 100/100
- Report type(s)
- threat-report
- Labels / Tags
- ai agent exploitation, ai platform targeting, credential harvesting, ctf framing, cve exploitation, cve-2026-33017, cve-2026-39987, cve-2026-40281, cve-2026-42208, cve-2026-42266, cve-2026-42271, cve-2026-42302, cve-2026-42589, cve-2026-44336, cve-2026-44694, cve-2026-45301, cve-2026-45331, cve-2026-45397, cve-2026-45672, cve-2026-47391, llm jailbreaking, prompt injection, rce campaigns
- Tags
- 2026-06-15, CVE-2026-33017, CVE-2026-39987, CVE-2026-40281, CVE-2026-42208, CVE-2026-42266, CVE-2026-42271, CVE-2026-42302, CVE-2026-42589, CVE-2026-44336, CVE-2026-44694, CVE-2026-45301, CVE-2026-45331, CVE-2026-45397, CVE-2026-45672, CVE-2026-47391, ai agent exploitation, ai platform targeting, credential harvesting, ctf framing, cve exploitation, llm jailbreaking, prompt injection, rce campaigns
- Related entities
- 16 vulnerabilities (cve), 6 indicators, 6 observables
Description
Threat actors are bypassing AI model safety guardrails by framing exploit requests as legitimate security research, such as capture-the-flag (CTF) challenges or CVE-hunting exercises. The framing manipulates upstream LLMs into generating working exploit code, which the attackers then deploy against real targets. Multiple independent operators have been observed targeting five applications (PraisonAI, LiteLLM, FastGPT, Open-WebUI, and Gotenberg) with CVE-templated User-Agent strings and with similar framing in other fields, including passwords and AWS session names. The jailbreak framing leaks into every LLM-generated field because the model carries the prompt context into its output. The pattern marks a shift from manually written scanners to LLM-assisted exploit generation, and it leaves detectable fingerprints in request headers, account aliases, and IAM session names that legitimate traffic rarely exhibits.
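As a rough illustration of how these fingerprints might be checked, the Python sketch below scans a captured event for CVE identifiers or CTF-style wording in the fields the report names. It is a minimal sketch under stated assumptions: the report does not publish exact strings, so the keyword list, the field names (`user_agent`, `password`, `aws_session_name`), the sample values, and the helper `framing_hits` are all hypothetical, and it assumes the sensor actually records these fields.

```python
import re
from typing import Dict, List

# Hypothetical patterns: the report names the kinds of framing observed
# (CVE-templated strings, CTF / CVE-hunting wording) but not exact values.
CVE_ID = re.compile(r"(?<![a-z0-9])CVE-\d{4}-\d{4,7}(?!\d)", re.IGNORECASE)
FRAMING = re.compile(
    r"(?<![a-z])(?:ctf|capture[-_ ]?the[-_ ]?flag|cve[-_ ]?hunt(?:ing|er)?)(?![a-z])",
    re.IGNORECASE,
)


def framing_hits(fields: Dict[str, str]) -> List[str]:
    """Return the sorted names of fields whose value carries a CVE ID or CTF wording."""
    hits = []
    for name, value in fields.items():
        if CVE_ID.search(value) or FRAMING.search(value):
            hits.append(name)
    return sorted(hits)


# Illustrative events only; these values are invented for the example.
suspicious = {
    "user_agent": "CVE-2026-33017-scanner/1.0",
    "password": "ctf_test_123",
    "aws_session_name": "cve-hunt-session",
}
benign = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "password": "hunter2",
    "aws_session_name": "deploy-pipeline",
}

print(framing_hits(suspicious))
print(framing_hits(benign))
```

Because the report describes the framing as appearing in several fields at once, a match in two or more fields of the same event is likely a stronger signal than a single match, and a threshold on `len(framing_hits(event))` may help reduce false positives from legitimate scanners that mention a CVE in one header.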