Pillar V: AI Security · § 08

Output handling (LLM02)

LLM02 is the risk that downstream code trusts LLM output the same way it would trust output from a vetted internal service. It should not. Every SQL statement, every snippet of HTML, every command-line argument an LLM produces is content of unknown provenance. Any code path that executes that content without validation is an injection vulnerability waiting for the right prompt. The rule is uniform across Bizzi: LLM output is untrusted input and is handled as such.

The principle

Never trust LLM output unconditionally, especially when it contains generated code or SQL.

LLM02 covers all the classic injection categories at once: SQL injection through text-to-SQL, XSS through markdown rendering, command injection through shell tools, and prototype pollution through generated JSON. The class is broad, and the defense is correspondingly layered.

Text-to-SQL: the central case

The AI assistant lets a user ask an analytical question. The agent generates SQL. The SQL runs on the OLAP database. This is the most likely place for a high-impact injection in our platform, so the defense is explicit and end-to-end.

The agent generates SQL against MCP (Pillar IV §12), which exposes only the tables and columns the feature is authorised to use.
MCP parses the SQL and validates it: no DDL or data-modifying statements (DROP, UPDATE, DELETE), a mandatory WHERE tenant_id = ? clause injected at the MCP layer, no UNION against tables outside the allowlist, and an automatic LIMIT when one is missing.
The validated SQL executes against a read-only OLAP user with the minimum privileges required.
The result is returned to the agent for formatting.
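
The validation step can be illustrated with a minimal sketch. It is not the MCP implementation: the allowlists, the RejectedQuery exception and the default limit are hypothetical, and a token-level check stands in for real SQL parsing. It covers the statement-type check, the table and column allowlist, and the automatic LIMIT. Tenant-filter injection is deliberately left out because it needs AST rewriting by a proper parser.

```python
import re

# Hypothetical allowlists; per the text, MCP exposes only authorised tables and columns.
ALLOWED_TABLES = {"invoices", "payments"}
ALLOWED_COLUMNS = {"tenant_id", "invoice_id", "amount", "status", "created_at"}
# Keywords a read-only analytical query may use. Anything not listed here
# (DROP, UPDATE, DELETE, aliases via AS, ...) is rejected by omission.
ALLOWED_KEYWORDS = {
    "select", "distinct", "from", "join", "inner", "left", "on", "where",
    "and", "or", "not", "in", "is", "null", "like", "between", "group", "by",
    "having", "order", "asc", "desc", "limit", "union", "all",
    "count", "sum", "avg", "min", "max",
}
DEFAULT_LIMIT = 1000  # assumed cap, not a documented value


class RejectedQuery(ValueError):
    """Raised when generated SQL fails validation."""


def validate_sql(sql: str) -> str:
    """Return the SQL to execute, or raise RejectedQuery."""
    stmt = sql.strip().rstrip(";").strip()
    # Fail closed on statement separators, comments and quoted identifiers.
    for token in (";", "--", "/*", '"', "`", "[", "]"):
        if token in stmt:
            raise RejectedQuery(f"forbidden token: {token}")
    # Blank out string literals so their contents are not read as identifiers.
    # Assumes an engine where single-quoted tokens are always string literals.
    bare = re.sub(r"'(?:[^']|'')*'", "''", stmt)
    words = re.findall(r"[A-Za-z_]\w*", bare.lower())
    if not words or words[0] != "select":
        raise RejectedQuery("only SELECT statements are allowed")
    allowed = ALLOWED_KEYWORDS | ALLOWED_TABLES | ALLOWED_COLUMNS
    for word in words:
        if word not in allowed:
            raise RejectedQuery(f"keyword or identifier not allowed: {word}")
    # Add a LIMIT when the statement does not already end with one.
    if not re.search(r"\blimit\s+\d+$", bare, flags=re.IGNORECASE):
        stmt = f"{stmt} LIMIT {DEFAULT_LIMIT}"
    return stmt
```

Checking every word against an allowlist, rather than searching for known-bad keywords, means an unlisted table in a UNION or a comma join is rejected without a dedicated rule. The cost is that the sketch also rejects harmless constructs such as aliases.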

This is defense in depth. If a prompt-injected agent generates malicious SQL, the MCP parser rejects it before it touches the database. If the parser is bypassed, the read-only user blocks anything destructive. If the user’s tenant filter is missing, MCP rejects the query.
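
The second layer, a database identity that cannot write, can be illustrated in isolation. The sketch below uses SQLite's read-only URI mode as a stand-in for the read-only OLAP user. The table, the data and the function names are hypothetical, and the real platform would rely on database grants rather than a connection flag.

```python
import os
import sqlite3
import tempfile


def open_read_only(path: str) -> sqlite3.Connection:
    # mode=ro is SQLite's URI flag for a read-only connection; it stands in
    # here for an OLAP user that has been granted SELECT only.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def demo() -> tuple:
    """Return (rows read, outcome of an attempted write)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "olap.db")
        # Seed a throwaway database with a normal read-write connection.
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE invoices (tenant_id TEXT, amount INTEGER)")
        setup.execute("INSERT INTO invoices VALUES ('t1', 100), ('t2', 250)")
        setup.commit()
        setup.close()

        conn = open_read_only(path)
        try:
            # Reads work, with the tenant bound as a parameter.
            rows = conn.execute(
                "SELECT amount FROM invoices WHERE tenant_id = ?", ("t1",)
            ).fetchall()
            # A destructive statement that slipped past validation still fails.
            try:
                conn.execute("DELETE FROM invoices")
                outcome = "write succeeded"
            except sqlite3.OperationalError:
                outcome = "write blocked"
        finally:
            conn.close()
    return rows, outcome
```

Even if the validator were bypassed entirely, the DELETE fails at the connection level, which is the property the layered design relies on.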

Code generation

Bizzi currently exposes limited AI code generation. Where it does:

Generated code is scanned by static analysis before any execution path reaches it.
Execution happens in a sandboxed environment with no network egress and no privileged file system access.
Privileged operations require explicit human approval, following the human-in-the-loop (HITL) pattern from Pillar III §3.
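
As a rough illustration of the first and third controls, the sketch below scans generated Python with the standard ast module and gates privileged runs on a named approver. The blocked-module and blocked-call lists, and the function names, are assumptions rather than Bizzi policy. The sandbox is not shown, and a static scan like this is easy to evade, which is why it is only one layer.

```python
import ast
from typing import Optional

# Assumed policy lists for illustration only.
BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil", "ctypes"}
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__", "open"}


def scan_generated_code(source: str) -> list:
    """Return a list of findings; an empty list means the scan passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"does not parse: {exc.msg}"]
    findings = []
    for node in ast.walk(tree):
        # Collect imported module names from both import forms.
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            if name.split(".")[0] in BLOCKED_MODULES:
                findings.append(f"line {node.lineno}: import of {name}")
        # Flag direct calls to dangerous built-ins.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings


def may_execute(source: str, privileged: bool, approved_by: Optional[str]) -> bool:
    """Gate execution: scan must pass, and privileged runs need a human approver."""
    if scan_generated_code(source):
        return False
    if privileged and not approved_by:
        return False
    return True
```

A True result only means the code may proceed to the sandbox. It does not replace the sandbox's network and file system restrictions.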

The output validation pipeline

Every LLM response passes through a five-step validator before it is served.

Type check. Output conforms to the expected schema (JSON shape, structured fields).
PII scan. No leakage of redacted values (see §7).
Cross-tenant scan. No references to data outside the current tenant.
Citation validation. Any citation points to a chunk or document that exists in the corpus.
Tone and safety check. No content that violates platform policy.

A response that fails any step is not served to the user. It is logged for investigation, and the user receives a graceful fallback message instead.
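
The fail-closed shape of the pipeline can be sketched as follows. Everything concrete here is an assumption made for illustration: the response schema, the tenant identifier format, the fallback copy, and the placeholder checks, which stand in for the real PII, tenant and policy scanners.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional

FALLBACK = "Sorry, I can't answer that right now."  # assumed fallback copy
BLOCKED_PHRASES = {"ignore previous instructions"}  # placeholder for a policy classifier


@dataclass
class Context:
    tenant_id: str        # e.g. "tenant-17" (assumed identifier format)
    redacted_values: set  # values the PII layer redacted on the way in
    corpus_ids: set       # chunk or document ids that exist in the corpus


def type_check(raw: str) -> dict:
    """Step 1: enforce the assumed schema {"answer": str, "citations": [str]}."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("answer"), str):
        raise ValueError("missing string field 'answer'")
    citations = data.get("citations", [])
    if not isinstance(citations, list) or not all(isinstance(c, str) for c in citations):
        raise ValueError("'citations' must be a list of strings")
    return {"answer": data["answer"], "citations": citations}


def pii_scan(data: dict, ctx: Context) -> Optional[str]:
    leaked = any(value in data["answer"] for value in ctx.redacted_values)
    return "redacted value present" if leaked else None


def cross_tenant_scan(data: dict, ctx: Context) -> Optional[str]:
    others = set(re.findall(r"tenant-\d+", data["answer"])) - {ctx.tenant_id}
    return "references another tenant" if others else None


def citation_check(data: dict, ctx: Context) -> Optional[str]:
    missing = [c for c in data["citations"] if c not in ctx.corpus_ids]
    return f"unknown citations: {missing}" if missing else None


def safety_check(data: dict, ctx: Context) -> Optional[str]:
    text = data["answer"].lower()
    return "policy violation" if any(p in text for p in BLOCKED_PHRASES) else None


CHECKS = [pii_scan, cross_tenant_scan, citation_check, safety_check]  # steps 2 to 5


def validate_response(raw: str, ctx: Context, audit_log: list) -> str:
    """Return the answer if every step passes, otherwise log and return FALLBACK."""
    try:
        data = type_check(raw)
    except ValueError as exc:
        audit_log.append(f"type_check: {exc}")
        return FALLBACK
    for check in CHECKS:
        reason = check(data, ctx)
        if reason:
            audit_log.append(f"{check.__name__}: {reason}")
            return FALLBACK
    return data["answer"]
```

The pipeline stops at the first failing step, so the log records one reason per rejected response and the user never sees partially validated content.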

Render-time XSS prevention

When AI output is rendered in the UI, the rendering layer treats it as untrusted markdown.

HTML escaping is on by default, even though we do not expect HTML in responses.
Markdown is sanitised through a fixed allowlist of nodes.
dangerouslySetInnerHTML is never used for AI content. Code review rejects it.
A strict Content Security Policy blocks inline scripts from executing.
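
The escaping and allowlist rules can be illustrated in a few lines. The actual rendering layer is presumably React, given the reference to dangerouslySetInnerHTML, so this Python sketch only shows the logic. The node names, the tag mapping and the CSP value are assumptions, not the deployed configuration.

```python
import html

# Assumed allowlist of markdown node types and the tags they may render to.
ALLOWED_NODES = {
    "paragraph": "p",
    "emphasis": "em",
    "strong": "strong",
    "code": "code",
    "list_item": "li",
}

# An example policy value: with no 'unsafe-inline' in script-src,
# browsers refuse to run inline <script> blocks and inline event handlers.
CSP_HEADER = (
    "Content-Security-Policy",
    "default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'none'",
)


def render_node(node_type: str, text: str) -> str:
    """Render one parsed markdown node; text is always escaped first."""
    escaped = html.escape(text, quote=True)
    tag = ALLOWED_NODES.get(node_type)
    if tag is None:
        # Node types outside the allowlist (raw HTML, images, links, ...)
        # degrade to escaped plain text instead of markup.
        return escaped
    return f"<{tag}>{escaped}</{tag}>"


def render_ai_output(nodes: list) -> str:
    """Render a list of (node_type, text) pairs produced by a markdown parser."""
    return "".join(render_node(node_type, text) for node_type, text in nodes)
```

Because the only markup that reaches the page comes from the fixed tag mapping, a response containing a script tag or an onerror attribute is displayed as text rather than interpreted.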

Email and notification content

When AI participates in outbound email or in-app notifications, three rules apply.

Content passes the same validation pipeline as in-app responses.
The template structure is fixed, and the AI only fills placeholders rather than generating raw content.
Bulk sends require admin preview before they go out.
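
A minimal sketch of the placeholder-only rule and the bulk-send gate, using the standard string.Template class. The template text, placeholder names, length cap and function names are hypothetical, and the returned list stands in for a real mailer.

```python
import html
from string import Template
from typing import Optional

# Fixed, human-authored template: the AI never supplies markup or structure.
TEMPLATE = Template("<p>Hello $customer_name,</p><p>$summary</p>")
PLACEHOLDERS = {"customer_name", "summary"}
MAX_FIELD_LENGTH = 500  # assumed cap per placeholder


def fill_template(ai_fields: dict) -> str:
    """Fill the fixed template with escaped, length-capped AI-provided values."""
    if set(ai_fields) != PLACEHOLDERS:
        raise ValueError("AI output must supply exactly the template placeholders")
    safe = {
        key: html.escape(str(value)[:MAX_FIELD_LENGTH])
        for key, value in ai_fields.items()
    }
    # substitute() makes a single pass, so a "$name" inside a value is not expanded.
    return TEMPLATE.substitute(safe)


def send_bulk(recipients: list, body: str, approved_by: Optional[str] = None) -> list:
    """Queue one message per recipient; more than one recipient needs an approver."""
    if len(recipients) > 1 and approved_by is None:
        raise PermissionError("bulk send requires admin preview and approval")
    return [(recipient, body) for recipient in recipients]
```

The values passed to fill_template would already have gone through the validation pipeline. Escaping here is a second layer, not a replacement for it.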