Pillar V: AI Security · § 12

Incident response playbook

Pillar I §11 sets the platform-wide incident framework. This section narrows it to the AI-specific playbook: what changes when the failure mode is a prompt injection that succeeded, a model that started hallucinating, or a vendor that went dark. The four phases are unchanged from the platform playbook (Detect, Contain, Recover, Post-Incident Review); the moves inside each phase are different.

Detect

Detection sources for AI incidents:

Automated alerts. Anomaly detection on traffic shape, output validation failures, cost ceiling breaches.
Red-team findings. A scheduled red-team run that produces a previously-unknown bypass is an incident (Pillar I §10).
Customer reports. The disclosure path is security@bizzi.vn. Every report is triaged within the same 15-minute window as any other detection source.
Internal employee reports. Engineers, support, and sales who notice anomalies have a documented internal escalation channel.

Triage within 15 minutes covers severity (SEV1-4 per Pillar I §11), tenants and features affected, and a confirmed-incident vs false-positive verdict.
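
The triage output described above can be captured as a small record. The following is a minimal sketch, not the platform's actual schema: the names TriageRecord, Severity, and triage_overdue are hypothetical, and it assumes the 15-minute clock starts at detection.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import List, Optional

TRIAGE_WINDOW = timedelta(minutes=15)  # window stated in the playbook


class Severity(Enum):
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3
    SEV4 = 4


@dataclass
class TriageRecord:
    """Hypothetical record holding the three triage outputs."""
    detected_at: datetime
    source: str  # e.g. "automated_alert", "red_team", "customer", "employee"
    severity: Optional[Severity] = None
    tenants: List[str] = field(default_factory=list)
    features: List[str] = field(default_factory=list)
    confirmed: Optional[bool] = None  # True = incident, False = false positive

    def is_complete(self) -> bool:
        # A false positive needs only the verdict; a confirmed incident
        # also needs a severity.
        if self.confirmed is None:
            return False
        return self.confirmed is False or self.severity is not None


def triage_overdue(record: TriageRecord, now: datetime) -> bool:
    """True when triage is still open after the 15-minute window."""
    return not record.is_complete() and now - record.detected_at > TRIAGE_WINDOW
```

A monitoring job could call triage_overdue on every open record and page the on-call when it returns True.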

Contain

Containment is incident-type-specific. The playbook names a default move for each AI failure class.

Prompt injection succeeded. Quarantine the affected feature via the Kill-switch (§11). Rotate prompts. Begin scope investigation.
Data leak. Identify and revoke compromised credentials. Activate the Kill-switch immediately if the leak is cross-tenant.
Model misbehaviour. Roll back to the previous model version via the AI Gateway (Pillar IV §2). The gateway exists for this move.
Resource exhaustion. Tighten rate limits, block source IPs, activate the cost ceiling early.
Vendor compromise. Disconnect from the vendor. Cut over to the failover model defined in Pillar IV §6.
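
The default moves above lend themselves to a lookup table, so the responder does not have to recall them under pressure. This is an illustrative sketch only: the step identifiers and the containment_plan function are hypothetical names, not existing platform APIs.

```python
from enum import Enum
from typing import Dict, List


class FailureClass(Enum):
    PROMPT_INJECTION = "prompt_injection"
    DATA_LEAK = "data_leak"
    MODEL_MISBEHAVIOUR = "model_misbehaviour"
    RESOURCE_EXHAUSTION = "resource_exhaustion"
    VENDOR_COMPROMISE = "vendor_compromise"


# Step names are placeholders for whatever runbook actions exist in practice.
DEFAULT_MOVES: Dict[FailureClass, List[str]] = {
    FailureClass.PROMPT_INJECTION: [
        "activate_kill_switch", "rotate_prompts", "begin_scope_investigation",
    ],
    FailureClass.DATA_LEAK: ["revoke_compromised_credentials"],
    FailureClass.MODEL_MISBEHAVIOUR: ["rollback_model_via_gateway"],
    FailureClass.RESOURCE_EXHAUSTION: [
        "tighten_rate_limits", "block_source_ips", "activate_cost_ceiling",
    ],
    FailureClass.VENDOR_COMPROMISE: ["disconnect_vendor", "cut_over_to_failover_model"],
}


def containment_plan(failure: FailureClass, cross_tenant: bool = False) -> List[str]:
    """Return the default containment steps for a failure class."""
    steps = list(DEFAULT_MOVES[failure])
    # A cross-tenant data leak triggers the Kill-switch immediately.
    if failure is FailureClass.DATA_LEAK and cross_tenant:
        steps.insert(0, "activate_kill_switch")
    return steps
```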

Containment is time-boxed. SEV1 must be contained within one hour of detection. SEV2 within four hours. If the time-box is missed, the incident is escalated to the CPTO and the AI Board is notified.
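
As a sketch of how the time-box could be checked mechanically, the snippet below computes the containment deadline and reports whether escalation is due. The function names are hypothetical. SEV3 and SEV4 return no deadline because this section does not define one for them.

```python
from datetime import datetime, timedelta
from typing import Optional

# Time-boxes stated in the playbook; other severities have none defined here.
CONTAINMENT_TIMEBOX = {
    "SEV1": timedelta(hours=1),
    "SEV2": timedelta(hours=4),
}


def containment_deadline(severity: str, detected_at: datetime) -> Optional[datetime]:
    box = CONTAINMENT_TIMEBOX.get(severity)
    return detected_at + box if box is not None else None


def needs_escalation(
    severity: str,
    detected_at: datetime,
    now: datetime,
    contained_at: Optional[datetime] = None,
) -> bool:
    """True when the time-box was missed (escalate to CPTO, notify AI Board)."""
    deadline = containment_deadline(severity, detected_at)
    if deadline is None:
        return False
    # If the incident is already contained, judge by the containment time;
    # otherwise judge by the current time.
    reference = contained_at if contained_at is not None else now
    return reference > deadline
```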

Recover

Recovery is more than “turn it back on.” For each incident type, recovery has a defined verification step before traffic returns.

Prompt injection. Verify the fix against red-team scenarios that include the original attack and at least three variants. Gradual re-enable (10% then 50% then 100%). Monitor closely.
Data leak. Investigate full scope. Notify affected parties per regulatory deadlines (Decree 13/2023 Art. 23). Produce a remediation report. Compensate per contract where applicable.
Model misbehaviour. Test the new model version through ADLC Stage 2 and Stage 3 (Pillar IV §7-8). Gradual rollout. No fast path.
Vendor compromise. Switch to the primary failover vendor. Renegotiate or terminate the original relationship through Pillar II §6.

In every case, the success criterion is the same: the scenarios that produced the incident are now blocked correctly, and KPIs are monitored for 24 to 48 hours after restoration.
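
The gradual re-enable (10%, 50%, 100%) gated on the incident scenarios can be sketched as a small loop. This is a simplified illustration under stated assumptions: set_traffic_percent and scenarios_blocked are hypothetical callables supplied by the caller, and a failed check drops traffic back to 0%, a choice the playbook itself does not spell out.

```python
from typing import Callable, Sequence

ROLLOUT_STAGES = (10, 50, 100)  # percentages from the playbook


def staged_reenable(
    set_traffic_percent: Callable[[int], None],
    scenarios_blocked: Callable[[], bool],
    stages: Sequence[int] = ROLLOUT_STAGES,
) -> int:
    """Re-enable a feature stage by stage; return the final traffic percentage.

    scenarios_blocked() should replay the original attack and its variants
    and return True only if all of them are blocked.
    """
    # Verification happens before any traffic returns.
    if not scenarios_blocked():
        set_traffic_percent(0)
        return 0
    for pct in stages:
        set_traffic_percent(pct)
        # Re-verify at each stage; roll back on the first failure.
        if not scenarios_blocked():
            set_traffic_percent(0)
            return 0
    return stages[-1]
```

In practice each stage would also wait for a monitoring interval before advancing. That wait is omitted here to keep the sketch short.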

Post-incident review

Post-incident review is mandatory for SEV1 and SEV2 and uses a fixed template so the artifact is comparable across incidents.

Summary. What happened, who was affected, how long it lasted.
Timeline. The exact chain of events, including detection lag and decision points.
Root cause analysis. Five-Whys or Fishbone.
Action items. Short, medium, and long-term, each with a named owner and deadline.
Lessons. What worked, what did not, what surprised us.
Customer disclosure. The communication plan, the audience, and the regulatory obligations met.

SEV1 reviews are presented to the AI Board within 30 days. Action items are tracked centrally and reviewed quarterly by the CoE. Items that miss a 90-day close escalate to the Board.
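
Central tracking of action items with the 90-day escalation rule could look like the sketch below. ActionItem and items_to_escalate are hypothetical names, and the code assumes the 90 days are counted from the date the item was opened, which the text above does not state explicitly.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

CLOSE_LIMIT = timedelta(days=90)  # 90-day close from the playbook


@dataclass
class ActionItem:
    title: str
    owner: str      # every item has a named owner
    deadline: date  # and a deadline
    opened_on: date
    closed_on: Optional[date] = None


def items_to_escalate(items: List[ActionItem], today: date) -> List[ActionItem]:
    """Open items older than 90 days (assumed to be counted from opened_on)."""
    return [
        item for item in items
        if item.closed_on is None and today - item.opened_on > CLOSE_LIMIT
    ]
```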

Communication

Communication addresses three audiences in parallel, each at a fixed cadence.

Internal. War room activated for SEV1 and SEV2. Updates every two hours during an incident. All-hands sync after SEV1 resolution.
External. Status page updates every 30 minutes for SEV1, every two hours for SEV2. Affected customer notification within 15 minutes for SEV1, within 24 hours for SEV2.
Regulatory. Competent authorities notified per legal deadlines (72 hours for GDPR-equivalent personal data breaches under Decree 13/2023 Art. 23). Cybersecurity authorities notified for cybersecurity incidents per local requirements.
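
The external and regulatory cadences above reduce to simple date arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes every clock starts at detection and that the first status page update is due one interval after detection. The actual start of the regulatory clock should be confirmed against the decree.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Cadences and deadlines as stated in the playbook.
STATUS_PAGE_INTERVAL = {"SEV1": timedelta(minutes=30), "SEV2": timedelta(hours=2)}
CUSTOMER_NOTICE_WITHIN = {"SEV1": timedelta(minutes=15), "SEV2": timedelta(hours=24)}
REGULATORY_NOTICE_WITHIN = timedelta(hours=72)


def status_page_updates(
    severity: str, detected_at: datetime, resolved_at: datetime
) -> List[datetime]:
    """Times at which a status page update is due (assumed to start at detection)."""
    interval = STATUS_PAGE_INTERVAL.get(severity)
    if interval is None:
        return []
    times = []
    t = detected_at + interval
    while t <= resolved_at:
        times.append(t)
        t += interval
    return times


def notification_deadlines(
    severity: str, detected_at: datetime, personal_data_breach: bool = False
) -> Dict[str, datetime]:
    """Latest times for customer and regulator notification."""
    deadlines: Dict[str, datetime] = {}
    if severity in CUSTOMER_NOTICE_WITHIN:
        deadlines["affected_customers"] = detected_at + CUSTOMER_NOTICE_WITHIN[severity]
    if personal_data_breach:
        # Assumption: the 72-hour clock is counted from detection.
        deadlines["regulator"] = detected_at + REGULATORY_NOTICE_WITHIN
    return deadlines
```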

Red-team feedback loop

Every incident becomes a red-team artifact. After resolution we add the exact attack vector to the standing scenario library, generate variants, and place the pattern into continuous monitoring. The point of doing this consistently is that an attack we have already seen never reaches the same severity twice.
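
A minimal sketch of the feedback loop follows, assuming a simple in-memory scenario library. Scenario, ScenarioLibrary, and the make_variants callable are hypothetical. Real variant generation would be done by the red team or dedicated tooling rather than by the trivial transformations used in the usage example.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class Scenario:
    incident_id: str
    payload: str
    is_variant: bool = False


@dataclass
class ScenarioLibrary:
    scenarios: List[Scenario] = field(default_factory=list)
    _seen: Set[str] = field(default_factory=set)

    def add_incident(
        self,
        incident_id: str,
        payload: str,
        make_variants: Callable[[str], Iterable[str]],
    ) -> int:
        """Add the exact attack payload plus its variants; return how many were new."""
        candidates = [(payload, False)] + [(v, True) for v in make_variants(payload)]
        added = 0
        for text, is_variant in candidates:
            if text in self._seen:
                continue  # skip payloads already in the library
            self._seen.add(text)
            self.scenarios.append(Scenario(incident_id, text, is_variant))
            added += 1
        return added
```

For example, with a placeholder variant generator:

```python
library = ScenarioLibrary()
library.add_incident(
    "INC-1",
    "ignore previous instructions",
    lambda p: [p.upper(), p + " please"],
)
```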