Pillar I: AI Organization · § 11

AI incident management

When an AI incident hits, two things determine the outcome: how fast we contain it and how clearly we explain it afterwards. Bizzi runs a four-tier severity model, a four-phase playbook, and a fixed post-incident review (PIR) template, so the response is deterministic instead of improvised.

The riskiest moment in any incident is the first hour, when the team is still deciding what happened. We pre-decide that. Severity rules are written down. The kill-switch is pre-wired. The disclosure SLA is fixed. The team’s job is to execute the playbook, not to invent it under pressure.

| Tier | Definition | AP-automation example | Response SLA | Communication |
|------|------------|-----------------------|--------------|---------------|
| SEV1 | Core function lost or sensitive data leaked | AI Chat returns tenant A’s data to tenant B. All AI features down | < 15 min | Customer Success immediately. Board within 1 hour |
| SEV2 | Major degradation or risk of spread | Extraction accuracy drops below 90% on a common template. Kill-switch triggered | < 1 hour | Affected customers within 4 hours |
| SEV3 | Localized fault with workaround | Hallucination rate up on one use case. Latency spike in one region | < 1 business day | Squad lead handles. Included in weekly report |
| SEV4 | Cosmetic | Confidence color renders wrong on one browser | < 1 week | Backlog |

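Because the severity rules are written down, they can also be encoded so that the on-call engineer gets a first classification without debate. The sketch below is illustrative only: the `Signals` flags, the `classify` function, and the rule that anything neither severe nor cosmetic defaults to SEV3 are assumptions, not Bizzi's actual tooling. The SLA and communication strings are copied from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    SEV1 = "SEV1"
    SEV2 = "SEV2"
    SEV3 = "SEV3"
    SEV4 = "SEV4"


# (response SLA, communication path) per tier, copied from the severity table.
RESPONSE = {
    Severity.SEV1: ("< 15 min", "Customer Success immediately. Board within 1 hour"),
    Severity.SEV2: ("< 1 hour", "Affected customers within 4 hours"),
    Severity.SEV3: ("< 1 business day", "Squad lead handles. Included in weekly report"),
    Severity.SEV4: ("< 1 week", "Backlog"),
}


@dataclass(frozen=True)
class Signals:
    """Hypothetical yes/no facts the on-call engineer confirms at triage."""
    core_function_lost: bool = False
    sensitive_data_leaked: bool = False
    major_degradation: bool = False
    risk_of_spread: bool = False
    cosmetic_only: bool = False


def classify(s: Signals) -> Severity:
    """Return the highest tier whose definition matches the signals."""
    if s.core_function_lost or s.sensitive_data_leaked:
        return Severity.SEV1
    if s.major_degradation or s.risk_of_spread:
        return Severity.SEV2
    if s.cosmetic_only:
        return Severity.SEV4
    # Assumption: any remaining fault is treated as localized (SEV3).
    return Severity.SEV3
```

Checking the most severe conditions first means an incident that matches several definitions is always assigned the higher tier, for example `classify(Signals(sensitive_data_leaked=True))` yields `Severity.SEV1` regardless of the other flags.
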
  • Automatic alert (see §9) or customer report.
  • On-call engineer confirms, classifies severity, opens an incident ticket.
  • For SEV1 and SEV2, trigger the kill-switch if needed: disable the AI feature while keeping ERP Sync running.
  • Isolate the affected tenant or feature. Tighten rate limits. Block source IPs if it is an attack.
  • Take a forensic snapshot: state, logs, and prompt history from the observability layer.
  • Identify an initial root cause.
  • Apply a temporary fix, typically a model rollback through the AI Gateway.
  • Validate the fix in staging before production.
  • Restore service. Notify affected customers.
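The containment step above depends on a kill-switch that can turn off AI features without touching ERP Sync. A minimal in-memory sketch of that behavior follows. The `KillSwitch` class, the `contain` helper, and the feature names `ai_chat`, `extraction`, and `erp_sync` are hypothetical; a real switch would presumably sit behind a feature-flag service or the AI Gateway rather than a Python dictionary.

```python
class KillSwitch:
    """In-memory stand-in for a pre-wired kill-switch (hypothetical)."""

    def __init__(self, ai_features, protected):
        self._ai = {name: True for name in ai_features}
        self._protected = frozenset(protected)

    def trigger(self, feature=None):
        """Disable one AI feature, or every AI feature when none is named."""
        targets = list(self._ai) if feature is None else [feature]
        # Validate everything first so a bad request changes nothing.
        for name in targets:
            if name in self._protected:
                raise ValueError(f"{name} is protected and must keep running")
            if name not in self._ai:
                raise KeyError(name)
        for name in targets:
            self._ai[name] = False
        return sorted(targets)

    def is_enabled(self, name):
        # Protected services are never switched off by this mechanism.
        if name in self._protected:
            return True
        return self._ai[name]


def contain(switch, severity, feature=None):
    """Trigger the switch only for SEV1 and SEV2; return what was disabled.

    Assumes the on-call engineer has already decided the switch is needed.
    """
    if severity not in ("SEV1", "SEV2"):
        return []
    return switch.trigger(feature)
```

With `switch = KillSwitch(["ai_chat", "extraction"], ["erp_sync"])`, calling `contain(switch, "SEV1")` disables both AI features while `switch.is_enabled("erp_sync")` stays true, and a SEV3 call leaves everything running.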

A PIR is mandatory for SEV1 and SEV2 and optional for SEV3. It follows a fixed template:

  1. Summary. What happened, who was affected, how long it lasted.
  2. Timeline. Exact sequence from detection to recovery.
  3. Root cause. 5-Whys or Fishbone. No personal blame.
  4. Action items. Short-term (1 week), mid-term (1 month), long-term (1 quarter). Each item has an owner and a deadline.
  5. Lessons. What worked, what did not, what surprised us.
  6. Customer disclosure. What we tell customers and when.

SEV1 PIRs go to the Board within 30 days. SEV2 PIRs go to the CoE within 14 days.
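A fixed template lends itself to a mechanical completeness check before a PIR is sent on. The sketch below assumes a PIR is stored as a simple record; the `PIR` and `ActionItem` classes and the `validate` function are illustrative names, not an existing Bizzi schema. It checks only what the template states: six sections, and an owner and a deadline on every action item.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    description: str
    horizon: str  # "short" (1 week), "mid" (1 month) or "long" (1 quarter)
    owner: str = ""
    deadline: Optional[date] = None


@dataclass
class PIR:
    severity: str
    summary: str = ""
    timeline: str = ""
    root_cause: str = ""
    lessons: str = ""
    customer_disclosure: str = ""
    action_items: List[ActionItem] = field(default_factory=list)


def pir_required(severity: str) -> bool:
    """PIR is mandatory for SEV1 and SEV2, optional otherwise."""
    return severity in ("SEV1", "SEV2")


def validate(pir: PIR) -> List[str]:
    """Return a list of problems; an empty list means the PIR is complete."""
    problems = []
    for name in ("summary", "timeline", "root_cause", "lessons", "customer_disclosure"):
        if not getattr(pir, name).strip():
            problems.append(f"missing section: {name}")
    if not pir.action_items:
        problems.append("missing section: action_items")
    for i, item in enumerate(pir.action_items, start=1):
        if not item.owner.strip():
            problems.append(f"action item {i} has no owner")
        if item.deadline is None:
            problems.append(f"action item {i} has no deadline")
    return problems
```

Returning a list of problems rather than raising on the first one lets the author fix every gap in a single pass.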

We commit to specific disclosure SLAs:

  • SEV1. Notify within 4 hours of confirmation, with a summary and an ETA for the fix.
  • SEV2. Notify within 24 hours, with a summary and a workaround.
  • SEV3-4. Aggregated in the quarterly transparency report.
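The disclosure SLAs above reduce to simple deadline arithmetic. The following sketch is one possible way to compute them; the function names are hypothetical, and it assumes the SEV2 clock also starts at confirmation, which the list states explicitly only for SEV1.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Notification windows from the disclosure SLAs. SEV3 and SEV4 have no
# per-incident deadline because they are aggregated in the quarterly report.
DISCLOSURE_WINDOW = {
    "SEV1": timedelta(hours=4),
    "SEV2": timedelta(hours=24),
    "SEV3": None,
    "SEV4": None,
}


def disclosure_deadline(severity: str, confirmed_at: datetime) -> Optional[datetime]:
    """Latest time to notify customers, or None if there is no deadline."""
    if severity not in DISCLOSURE_WINDOW:
        raise ValueError(f"unknown severity: {severity}")
    window = DISCLOSURE_WINDOW[severity]
    return None if window is None else confirmed_at + window


def is_overdue(severity: str, confirmed_at: datetime, now: datetime) -> bool:
    """True once the notification window has passed."""
    deadline = disclosure_deadline(severity, confirmed_at)
    return deadline is not None and now > deadline


confirmed = datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)
sev1_deadline = disclosure_deadline("SEV1", confirmed)  # 2025-03-01 13:00 UTC
```

Using timezone-aware timestamps avoids ambiguity when the on-call engineer and the affected customers are in different regions.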

When personal data is involved, we follow the notification obligations of Decree 13/2023: notice to the competent authority and to the data subjects within the statutory window.