OWASP LLM Top 10 Coverage — LLMArmor

| OWASP Risk | Rule | Coverage | What’s Detected |
| --- | --- | --- | --- |
| Prompt Injection | LLM01 | 🟢 Strong | 6 injection vectors with role-aware AST taint analysis |
| Sensitive Info Disclosure | LLM02 | 🟡 Partial | 4 LLM API key patterns across all file types |
| Supply Chain Vulnerabilities | LLM03 | 🔴 Out of scope | Requires dependency analysis |
| Data and Model Poisoning | LLM04 | 🔴 Out of scope | Requires runtime monitoring |
| Improper Output Handling | LLM05 | 🟡 Partial | eval/exec/shell/SQL/HTML sinks with taint tracking |
| Insecure Plugin Design | LLM06 | 🟡 Partial | @tool functions with dangerous sinks |
| System Prompt Leakage | LLM07 | 🟡 Partial | Hardcoded prompts in source code and config files |
| Excessive Agency | LLM08 | 🟢 Strong | 8 pattern categories including dynamic dispatch |
| Misinformation | LLM09 | 🔴 Out of scope | Requires runtime factual verification |
| Unbounded Consumption | LLM10 | 🟡 Partial | Missing max_tokens on LLM API calls |

  • 🟢 Strong — dual-layer detection (regex + AST taint analysis), multiple pattern categories, high confidence
  • 🟡 Partial — single-layer detection or limited pattern coverage; PRs welcome to expand
  • 🔴 Out of scope — not detectable by static analysis alone

LLM01 — Prompt Injection

  • Regex: direct interpolation via f-strings, .format(), %-formatting, and string concatenation
  • AST: source-based taint tracking — a variable is tainted only when assigned from a user-controlled data source (HTTP requests, input(), sys.argv, WebSocket messages, or function parameters)
  • Role-aware dict analysis: distinguishes the safe role: user pattern from dangerous injection into role: system or role: assistant messages
  • str.join() injection detection for tainted list elements
  • Taint propagates through direct alias assignments but not through function calls
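
The source-based taint rule above can be illustrated with a minimal sketch. This is not LLMArmor's implementation: the `SOURCE_CALLS` set, the function names, and the restriction to top-level straight-line code are all assumptions made to keep the example short. It taints a name only when it is assigned from a source call or aliased from an already tainted name, and it deliberately does not follow taint through other function calls.

```python
import ast

# Hypothetical: call names treated as user-controlled sources in this sketch.
SOURCE_CALLS = {"input"}


def is_source(node: ast.expr) -> bool:
    """True for a direct call to one of the assumed source functions."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in SOURCE_CALLS
    )


def interpolates_tainted(fstring: ast.JoinedStr, tainted: set) -> bool:
    """True if the f-string directly interpolates a tainted variable."""
    return any(
        isinstance(part, ast.FormattedValue)
        and isinstance(part.value, ast.Name)
        and part.value.id in tainted
        for part in fstring.values
    )


def tainted_fstring_lines(source: str) -> list:
    """Return line numbers of f-strings that interpolate tainted names."""
    tainted = set()
    findings = []
    for stmt in ast.parse(source).body:  # top-level statements, in order
        for node in ast.walk(stmt):
            if isinstance(node, ast.JoinedStr) and interpolates_tainted(node, tainted):
                findings.append(node.lineno)
        if isinstance(stmt, ast.Assign):
            value = stmt.value
            # Taint comes from a source call or a direct alias, nothing else.
            is_tainted = is_source(value) or (
                isinstance(value, ast.Name) and value.id in tainted
            )
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if is_tainted:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
    return findings


SAMPLE = '''\
user = input()
alias = user
cleaned = sanitize(user)
a = f"Answer this: {alias}"
b = f"Answer this: {cleaned}"
'''
```

In this sketch only the f-string on line 4 of `SAMPLE` is reported: `alias` inherits taint from `user`, while `cleaned` does not because taint is not propagated through the `sanitize(...)` call.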

LLM02 — Sensitive Info Disclosure

  • All file types: OpenAI (sk-), Anthropic (sk-ant-), Google (AIza), HuggingFace (hf_) patterns
  • Minimum key length enforcement (20+ chars) to avoid matching SKUs and short placeholders
  • Comment lines, test/mock variable names, and example values are skipped
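
As a rough illustration of the key-pattern rules above, the sketch below matches the four listed prefixes, enforces a 20-character minimum, and skips comment lines. The exact character classes, the point at which the length is measured (after the prefix), and the names `KEY_PATTERNS` and `scan_line` are assumptions, not LLMArmor's actual patterns. The skipping of test/mock variable names and example values is not modelled.

```python
import re
from typing import Optional

# Illustrative patterns: prefixes come from the list above; the character
# classes and the 20+ length applied after the prefix are assumptions.
# Anthropic is checked first because "sk-ant-" also starts with "sk-".
KEY_PATTERNS = [
    ("anthropic", re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}")),
    ("openai", re.compile(r"sk-[A-Za-z0-9_-]{20,}")),
    ("google", re.compile(r"AIza[A-Za-z0-9_-]{20,}")),
    ("huggingface", re.compile(r"hf_[A-Za-z0-9]{20,}")),
]


def scan_line(line: str) -> Optional[str]:
    """Return the provider name if the line appears to contain a key."""
    stripped = line.strip()
    if stripped.startswith("#"):  # comment lines are skipped
        return None
    for provider, pattern in KEY_PATTERNS:
        if pattern.search(stripped):
            return provider
    return None
```

With these assumed patterns, a short value such as `sku = "sk-1234"` is ignored because it falls below the minimum length.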

LLM05 — Improper Output Handling

  • Regex: flags variables that look like LLM output when they are passed to dangerous sinks. The name heuristic requires both an LLM-context indicator (such as llm, gpt, ai, chat) and a response indicator (such as response, output, text, content)
  • AST: taint-tracked detection that flags any tainted variable (from any user-controlled source) passed to a dangerous sink; the name heuristic is not required
  • @tool-decorated function parameters are treated as LLM output — the LLM chooses their values at runtime, so sinks inside @tool bodies are flagged automatically
  • Dangerous sinks: eval(), exec(), compile() → CRITICAL; subprocess.run(), os.system() → CRITICAL; Markup(), render_template_string(), mark_safe() → HIGH; SQL f-string interpolation → HIGH; json.loads() without schema validation → INFO (normal) / MEDIUM (strict)
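
The @tool rule above can be sketched as follows, under stated assumptions: the decorator is recognised purely by its name ending in `tool`, only a subset of the listed sinks is modelled, and a call is reported when any of its positional arguments mentions a parameter of the enclosing function. The names `SINK_SEVERITY`, `is_tool`, and `scan_tools` are hypothetical.

```python
import ast

# Severities copied from the sink list above; only a subset is modelled here.
SINK_SEVERITY = {
    "eval": "CRITICAL",
    "exec": "CRITICAL",
    "compile": "CRITICAL",
    "subprocess.run": "CRITICAL",
    "os.system": "CRITICAL",
    "mark_safe": "HIGH",
}


def is_tool(fn: ast.FunctionDef) -> bool:
    """Assumption: any decorator whose dotted name ends in 'tool' counts."""
    for dec in fn.decorator_list:
        target = dec.func if isinstance(dec, ast.Call) else dec
        if ast.unparse(target).split(".")[-1] == "tool":
            return True
    return False


def scan_tools(source: str) -> list:
    """Return (line, sink, severity) for sinks fed by @tool parameters."""
    findings = []
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, ast.FunctionDef) or not is_tool(fn):
            continue
        # Parameters of a @tool function are chosen by the LLM at runtime.
        params = {a.arg for a in fn.args.args}
        for node in ast.walk(fn):
            if not isinstance(node, ast.Call):
                continue
            name = ast.unparse(node.func)
            if name not in SINK_SEVERITY:
                continue
            used = {
                n.id
                for arg in node.args
                for n in ast.walk(arg)
                if isinstance(n, ast.Name)
            }
            if used & params:
                findings.append((node.lineno, name, SINK_SEVERITY[name]))
    return findings


SAMPLE = '''\
import os

@tool
def run(cmd: str) -> int:
    return os.system(cmd)

def helper(expr):
    return eval(expr)
'''
```

In this sketch `helper` is not reported because it is not decorated with @tool; a fuller scanner would still flag it if `expr` were tainted from another source.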

LLM07 — System Prompt Leakage

  • Python: single-line and multi-line hardcoded system prompt strings
  • Config files: prompt values in system_prompt:, system_message:, prompt: keys
  • Only flags strings longer than 100 characters to avoid noise from short generic prompts
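
A minimal sketch of the Python-side check might look like the following. How LLMArmor decides that a string is a system prompt is not stated above, so matching on a variable name containing `system_prompt` is purely an assumption, as are the names `MIN_PROMPT_LENGTH` and `long_system_prompts`. Only the 100-character threshold comes from the list.

```python
import ast

MIN_PROMPT_LENGTH = 100  # threshold from the list above


def long_system_prompts(source: str) -> list:
    """Return names assigned a hardcoded string longer than the threshold."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Adjacent string literals split across lines parse as one Constant,
        # so multi-line prompts written that way are covered too.
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            continue
        if len(value.value) <= MIN_PROMPT_LENGTH:
            continue
        for target in node.targets:
            # Assumption: the variable name identifies a system prompt.
            if isinstance(target, ast.Name) and "system_prompt" in target.id.lower():
                hits.append(target.id)
    return hits


# 145-character prompt (reported) and a short one (ignored).
SAMPLE = (
    'SYSTEM_PROMPT = "' + "You are a careful assistant. " * 5 + '"\n'
    'short_system_prompt = "Be brief."\n'
)
```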

LLM08 — Excessive Agency

  • globals()[fn_name]() / eval(fn_name) — dynamic dispatch from LLM tool call → CRITICAL
  • tools=["*"] — wildcard tool access violating least privilege → HIGH
  • ShellTool(), PythonREPLTool(), CodeInterpreterTool() — shell/code execution capability → HIGH
  • subprocess.run(['powershell'/'bash'/'cmd'/'sh', ...]) — shell interpreter invocation → HIGH
  • @tool-decorated functions containing shell/subprocess sinks — AST-detected → HIGH
  • getattr(module, llm_name)() — AST-taint-tracked dynamic dispatch → CRITICAL (AST) / HIGH (regex)
  • auto_approve=True, human_in_the_loop=False — disabled approval gates → MEDIUM
  • FileManagementToolkit(), WriteFileTool() — broad filesystem access → LOW
  • Broad tool descriptions, missing explicit allowlists → INFO (normal) / MEDIUM (strict)
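
Several of the patterns above are simple enough to express as line-level regex rules. The sketch below covers four of them with the severities listed; the regexes themselves and the names `RULES` and `scan` are illustrative assumptions and would miss calls split across lines.

```python
import re

# (pattern, severity, message); severities follow the list above,
# the regexes are assumptions made for illustration.
RULES = [
    (re.compile(r"tools\s*=\s*\[\s*['\"]\*['\"]\s*\]"),
     "HIGH", "wildcard tool access"),
    (re.compile(r"\b(?:ShellTool|PythonREPLTool|CodeInterpreterTool)\s*\("),
     "HIGH", "shell/code execution capability"),
    (re.compile(r"\bauto_approve\s*=\s*True\b|\bhuman_in_the_loop\s*=\s*False\b"),
     "MEDIUM", "disabled approval gate"),
    (re.compile(r"\b(?:FileManagementToolkit|WriteFileTool)\s*\("),
     "LOW", "broad filesystem access"),
]


def scan(source: str) -> list:
    """Return (line number, severity, message) for every rule that matches."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, severity, message in RULES:
            if pattern.search(line):
                findings.append((lineno, severity, message))
    return findings


SAMPLE = '''\
agent = Agent(tools=["*"], auto_approve=True)
shell = ShellTool()
files = WriteFileTool()
safe = Agent(tools=[search], auto_approve=False)
'''
```

In this sketch the first line of `SAMPLE` produces two findings (wildcard access and a disabled approval gate), while the last line produces none.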

LLM10 — Unbounded Consumption

  • Regex: LLM API calls (openai, anthropic, litellm, Google Gemini) without max_tokens
  • AST: resolves **config dict spreads — suppresses the finding when max_tokens or max_output_tokens is provably present in the spread dict
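
The **config resolution described above can be sketched roughly as follows. The assumptions are loud ones: only dict literals assigned to a plain name at module level are resolved, any method call named `create` is treated as an LLM API call, and the names `TOKEN_LIMIT_KEYS` and `calls_missing_token_limit` are hypothetical.

```python
import ast

TOKEN_LIMIT_KEYS = {"max_tokens", "max_output_tokens"}


def calls_missing_token_limit(source: str, method: str = "create") -> list:
    """Return line numbers of `.create(...)` calls with no provable limit."""
    tree = ast.parse(source)

    # Assumption: only module-level `name = {...}` literals are resolvable.
    dict_literals = {}
    for stmt in tree.body:
        if isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Dict):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    dict_literals[target.id] = stmt.value

    findings = []
    for node in ast.walk(tree):
        if not (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == method
        ):
            continue
        has_limit = False
        for kw in node.keywords:
            if kw.arg in TOKEN_LIMIT_KEYS:
                has_limit = True  # explicit keyword argument
            elif kw.arg is None and isinstance(kw.value, ast.Name):
                # `**name` spread: look inside the dict literal, if known.
                spread = dict_literals.get(kw.value.id)
                if spread is not None:
                    keys = {
                        k.value for k in spread.keys if isinstance(k, ast.Constant)
                    }
                    if keys & TOKEN_LIMIT_KEYS:
                        has_limit = True
        if not has_limit:
            findings.append(node.lineno)
    return sorted(findings)


SAMPLE = '''\
config = {"model": "example-model", "max_tokens": 256}
bare = {"model": "example-model"}
client.chat.completions.create(messages=msgs, **config)
client.chat.completions.create(messages=msgs, **bare)
client.chat.completions.create(model="example-model", messages=msgs)
'''
```

In this sketch the call on line 3 is suppressed because `max_tokens` is provably present in `config`, while the calls on lines 4 and 5 are still reported.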