Severity and Classification
Severity levels give everyone a shared language for “how bad is it?” and “who needs to be involved?”
They drive prioritization, escalation, and communication during an incident.
Why Severity Levels Matter
- Prioritization — Not every alert is an incident; severity helps you decide what gets immediate attention versus what can be handled in the backlog.
- Escalation — Clear levels (e.g. SEV-1 vs SEV-2) define when to page additional people, involve the Incident Commander (IC), or convene a full response with a Communications Lead (CL).
- Communication — Stakeholders and customers need a consistent way to understand impact. “We have a SEV-2” is faster than a long explanation.
A Simple Severity Scale
You can use a numeric scale (SEV-1, SEV-2, SEV-3) or labels (P1, P2, P3). What matters is that each level is defined by impact and urgency.
| Level | Typical meaning | Who is involved |
|---|---|---|
| SEV-1 / P1 | Critical: broad impact, service down or severely degraded; revenue or safety at risk. | Full response: IC, Operations Lead (OL), CL. All hands until mitigated. |
| SEV-2 / P2 | Major: significant impact, workaround may exist; many users or key workflows affected. | Designated responders; escalate to IC/CL if not contained quickly. |
| SEV-3 / P3 | Minor: limited impact, or impact only in non-critical paths. | Normal support or on-call; no need to convene full incident response. |
Define impact in terms that matter to your organization: user count, error rate, revenue, or compliance. Urgency is how fast you need to act.
Review and tune this severity matrix periodically so its definitions keep matching how incidents actually affect your organization.
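As a rough illustration, a severity matrix like the one above can be encoded as a small classification function. This is a minimal sketch, not a recommended policy: the `Impact` fields, the `classify` function, and every threshold are hypothetical and should be replaced with the measures and cut-offs that matter to your organization.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    SEV1 = 1  # Critical
    SEV2 = 2  # Major
    SEV3 = 3  # Minor


@dataclass(frozen=True)
class Impact:
    # Hypothetical impact measures; choose your own.
    affected_user_fraction: float    # 0.0 to 1.0
    error_rate: float                # 0.0 to 1.0
    revenue_or_safety_at_risk: bool
    on_critical_path: bool           # False: only non-critical paths affected


def classify(impact: Impact) -> Severity:
    """Map impact to a severity level. All thresholds are assumed values."""
    # Critical: revenue or safety at risk, or broad impact on a critical path.
    if impact.revenue_or_safety_at_risk or (
        impact.on_critical_path and impact.affected_user_fraction >= 0.5
    ):
        return Severity.SEV1
    # Major: significant impact on a critical path.
    if impact.on_critical_path and (
        impact.affected_user_fraction >= 0.05 or impact.error_rate >= 0.05
    ):
        return Severity.SEV2
    # Minor: limited impact, or impact only in non-critical paths.
    return Severity.SEV3


if __name__ == "__main__":
    outage = Impact(0.8, 0.3, revenue_or_safety_at_risk=False, on_critical_path=True)
    print(classify(outage).name)  # prints "SEV1"
```

Keeping the thresholds in one function makes the periodic review concrete: tuning the matrix becomes a small, reviewable change.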
When To Declare an Incident
The line between “handle as normal work” and “declare an incident” should be clear so responders know when to open an incident channel and assign roles.
- Alert fires — Monitoring or observability alerts surface a symptom (e.g. error rate spike, latency degradation). See Alerting for how to design alerts that surface real issues without noise.
- Verify — The first responder confirms the symptom is real and not a fluke or limited test traffic.
- Classify — If the impact and urgency meet your criteria for SEV-1 or SEV-2 (or your equivalent), declare the incident: assign severity, open the incident ticket or channel, and follow your Incident lifecycle (assign IC, OL, CL as needed).
- If it does not meet criteria — Handle as a normal fix or backlog item; no need to spin up the full response.
When in doubt, declare. Declaring an incident and downgrading it later is safer than not declaring and missing coordination or communications.
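The decision flow above can be sketched as a small triage function. This is an illustrative sketch only: the `Alert` type, its fields, and the action names are hypothetical, and it assumes that SEV-1 and SEV-2 are the levels that trigger a declaration, as in the criteria described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    name: str
    confirmed_real: bool      # outcome of the Verify step
    severity: Optional[int]   # 1, 2, or 3 after Classify; None if still unclear


def triage(alert: Alert) -> str:
    """Return the next action for an alert (action names are assumptions)."""
    if not alert.confirmed_real:
        # A fluke or limited test traffic: nothing to declare.
        return "no_action"
    if alert.severity is None:
        # When in doubt, declare; severity can be adjusted later.
        return "declare_incident"
    if alert.severity <= 2:
        # Meets the SEV-1 / SEV-2 criteria: open the incident and assign roles.
        return "declare_incident"
    # Below the declaration criteria: handle as a normal fix or backlog item.
    return "normal_work"
```

For example, `triage(Alert("checkout error spike", confirmed_real=True, severity=2))` returns `"declare_incident"`, while the same alert classified as severity 3 returns `"normal_work"`.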
How Severity Ties To the Lifecycle
In Incident lifecycle phase 0 (Preparation), you define a severity matrix so that in phase 1 (Detection & Declaration) the first responder can declare severity (SEV-x) and open the incident.
Severity then drives who is convened (phase 2: roles), how often you communicate (phase 4), and which incidents get a full Post-incident review (phase 6).
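One way to make this link explicit is a lookup table from severity to response policy. The sketch below is an assumption-laden example: the `ResponsePolicy` type, the `policy_for` helper, the update intervals, and the choice of which levels get a full review are all hypothetical placeholders, with only the role names taken from the severity table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ResponsePolicy:
    roles: Tuple[str, ...]                   # who is convened (phase 2)
    update_interval_minutes: Optional[int]   # comms cadence (phase 4); None: no scheduled updates
    full_post_incident_review: bool          # phase 6


# Intervals and review choices are assumed values, not recommendations.
POLICIES = {
    "SEV-1": ResponsePolicy(("IC", "OL", "CL"), 30, True),
    "SEV-2": ResponsePolicy(("designated responders",), 60, True),
    "SEV-3": ResponsePolicy(("on-call",), None, False),
}


def policy_for(severity: str) -> ResponsePolicy:
    """Look up the response policy for a declared severity."""
    try:
        return POLICIES[severity]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}") from None
```

Defining this table during Preparation means the first responder only has to pick a severity; who to page and how often to communicate follow from the lookup.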