How attackers are jailbreaking LLMs with CTF framing and how to catch them

Published 15/06/2026 21:33 · Modified 16/06/2026 11:48

Essential information

Published: 15/06/2026 21:33
Modified: 16/06/2026 11:48
Source / Author: AlienVault
Confidence: 100/100
Report type(s): threat-report
Labels / Tags: 2026-06-15, ai agent exploitation, ai platform targeting, credential harvesting, ctf framing, cve exploitation, CVE-2026-33017, CVE-2026-39987, CVE-2026-40281, CVE-2026-42208, CVE-2026-42266, CVE-2026-42271, CVE-2026-42302, CVE-2026-42589, CVE-2026-44336, CVE-2026-44694, CVE-2026-45301, CVE-2026-45331, CVE-2026-45397, CVE-2026-45672, CVE-2026-47391, llm jailbreaking, prompt injection, rce campaigns
Related entities: 16 vulnerabilities (cve), 6 indicators, 6 observables

Description

Threat actors are bypassing LLM safety guardrails by framing exploit requests as legitimate security research, such as capture-the-flag (CTF) challenges or CVE-hunting exercises. The framing manipulates upstream LLMs into generating working exploit code, which the attackers then deploy against real targets. Multiple independent operators have been observed targeting five applications (PraisonAI, LiteLLM, FastGPT, Open-WebUI, and Gotenberg) with CVE-templated User-Agent strings and similar framing in other fields, including passwords and AWS session names. The framing leaks into every LLM-generated field because the model carries the prompt context into its output. This pattern marks a shift from manually written scanners to LLM-assisted exploit generation, and it leaves detectable fingerprints in request headers, account aliases, and IAM session names that legitimate traffic rarely exhibits.
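
As an illustration of how the fingerprints described above might be triaged, here is a minimal Python sketch. It assumes log events have already been parsed into dictionaries. The field names (`user_agent`, `password`, `account_alias`, `aws_session_name`), the function `framing_hits`, and both regular expressions are hypothetical: the report does not publish the exact strings observed, so the patterns only approximate "CVE-templated" and "CTF-framed" values.

```python
import re
from typing import Dict, List

# Assumption: a CVE identifier embedded in a field, e.g. "CVE-2026-42208-check/1.0".
# Lookarounds are used instead of \b so that "_" and "-" count as separators.
CVE_ID = re.compile(r"(?<![A-Za-z0-9])CVE[-_]\d{4}[-_]\d{4,7}(?!\d)", re.IGNORECASE)

# Assumption: framing keywords drawn from the report's wording (CTF, CVE hunting).
FRAMING = re.compile(
    r"(?<![A-Za-z0-9])"
    r"(?:ctf|capture[-_ ]?the[-_ ]?flag|cve[-_ ]?hunt(?:ing|er)?)"
    r"(?![A-Za-z0-9])",
    re.IGNORECASE,
)

# Hypothetical field names; map them to whatever your log schema uses.
WATCHED_FIELDS = ("user_agent", "password", "account_alias", "aws_session_name")


def framing_hits(event: Dict[str, str]) -> List[str]:
    """Return 'field:reason' labels for each watched field that looks LLM-framed."""
    hits: List[str] = []
    for field in WATCHED_FIELDS:
        value = event.get(field) or ""
        if CVE_ID.search(value):
            hits.append(f"{field}:cve-id")
        if FRAMING.search(value):
            hits.append(f"{field}:framing")
    return hits


if __name__ == "__main__":
    sample = {
        "user_agent": "CVE-2026-42208-check/1.0",
        "password": "hunter2",
        "aws_session_name": "ctf_session",
    }
    print(framing_hits(sample))
```

A hit is best treated as a triage signal rather than a verdict, since some legitimate scanners and internal security exercises may also carry CVE identifiers or CTF wording in these fields.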

External references