A response runbook for production incidents in Next.js and Kafka environments, covering the triage flow, ownership model, and postmortem template.
Incident classes
- Class A: checkout or ordering outage
- Class B: severe latency degradation
- Class C: partial feature failure with workaround
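The three classes can be encoded so that tooling and humans use the same labels. The following is a minimal sketch, not a definitive implementation: the `IncidentSignal` shape, its field names, and the A-before-B-before-C precedence are assumptions for illustration, and what counts as "severe" latency is left to whoever sets the flag.

```typescript
// Incident classes as listed in the runbook.
type IncidentClass = "A" | "B" | "C";

// Hypothetical signal shape; these field names are illustrative assumptions.
interface IncidentSignal {
  checkoutOrOrderingDown: boolean; // checkout or ordering is unavailable
  severeLatency: boolean;          // latency is severely degraded
  partialFeatureFailure: boolean;  // some feature is failing
  workaroundAvailable: boolean;    // a workaround exists for that feature
}

// Assumed precedence: the most severe matching class wins (A, then B, then C).
function classifyIncident(signal: IncidentSignal): IncidentClass | null {
  if (signal.checkoutOrOrderingDown) return "A";
  if (signal.severeLatency) return "B";
  if (signal.partialFeatureFailure && signal.workaroundAvailable) return "C";
  // Not covered by the three classes; leave the decision to a human.
  return null;
}
```

Returning `null` rather than forcing a class keeps unclassified cases visible, for example a partial failure with no workaround, which the list above does not cover.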
Triage flow
Diagram (Mermaid)
Mandatory incident artifacts
- timeline with UTC timestamps
- decisions and decision owners
- affected customer segments
- temporary mitigations and permanent fixes
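One way to make these artifacts enforceable is to model them as a record and check it before an incident is closed. This is a sketch under stated assumptions: the `IncidentRecord` shape, the field names, and the `missingArtifacts` helper are hypothetical, and requiring both a mitigation and a permanent fix to be listed is one reading of the last bullet.

```typescript
// Hypothetical record shape; names are illustrative, not from the runbook.
interface TimelineEntry {
  at: string;   // ISO 8601 timestamp in UTC, e.g. "2024-05-01T12:03:00Z"
  note: string;
}

interface Decision {
  summary: string;
  owner: string;
}

interface IncidentRecord {
  timeline: TimelineEntry[];
  decisions: Decision[];
  affectedSegments: string[];
  temporaryMitigations: string[];
  permanentFixes: string[];
}

// Accepts only ISO 8601 timestamps with the "Z" (UTC) designator.
const UTC_ISO = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

// Returns the names of the mandatory artifacts that are missing or malformed.
function missingArtifacts(record: IncidentRecord): string[] {
  const missing: string[] = [];
  if (record.timeline.length === 0 || record.timeline.some((e) => !UTC_ISO.test(e.at))) {
    missing.push("timeline with UTC timestamps");
  }
  if (record.decisions.length === 0 || record.decisions.some((d) => d.owner.trim() === "")) {
    missing.push("decisions and decision owners");
  }
  if (record.affectedSegments.length === 0) {
    missing.push("affected customer segments");
  }
  // Assumption: both lists must be non-empty before the incident is closed.
  if (record.temporaryMitigations.length === 0 || record.permanentFixes.length === 0) {
    missing.push("temporary mitigations and permanent fixes");
  }
  return missing;
}
```

An empty result means all four artifacts are present; a timeline entry with a local offset such as `+02:00` is reported as a missing UTC timeline.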
Postmortem template
### What happened
### Why it happened
### What prevented faster recovery
### Corrective actions (owner + due date)
Final takeaway
High-quality incident response is a repeatable operating system, not heroics from individual engineers.