Escalation Matrix

Severity × time elapsed → who is notified. Pin this page in the on-call channel topic.

Status: structure agreed at handover. The customer team fills in names, schedules, and contact details during week 2 of the plan in oncall-readiness.md. Every {{TBD}} is a slot the customer team owns.


1. Severity → escalation timeline

For each severity, read the table top to bottom as time elapses. If a step's success condition isn't met within its time budget, follow that row's "If not met" column, which in most cases means escalating to the next step.

P0 — full outage / data integrity / security breach

| Time elapsed | Action | Success condition | If not met |
|---|---|---|---|
| 0 min | PagerDuty page Tier 1 (primary on-call) | Tier 1 ack | escalate at 5 min |
| 0 min | Status page → red, "investigating" | Page updated | IC posts in #oncall |
| +5 min | Auto-escalate to Tier 2 senior | Tier 2 ack | escalate at 10 min |
| +10 min | Auto-escalate to Tier 3 (Evospin team) | Tier 3 ack | manual phone call to ebit-team lead |
| +15 min | Customer-team leadership notified | {{TBD: leadership contact}} acks | retry every 5 min |
| +30 min | If unresolved: vendor engagement (RNG provider, captcha, or payment processor, whichever applies) | Vendor ack | continue Tier 3 + leadership |
| +60 min | Mitigation deployed | Site recovered | continue with all parties; consider declaring emergency change-control |

P1 — significant degraded service

| Time elapsed | Action | Success condition | If not met |
|---|---|---|---|
| 0 min | PagerDuty page Tier 1 | Tier 1 ack | escalate at 15 min |
| +5 min | Status page → yellow, "investigating" | Page updated | IC posts in #oncall |
| +15 min | Auto-escalate to Tier 2 senior | Tier 2 ack | manual ping in #oncall |
| +60 min | Auto-escalate to Tier 3 (Evospin team) | Tier 3 ack | retry hourly |
| +4 hr | Resolution target | Service restored | extend with explicit IC sign-off + customer-comms update |

P2 — partial issue, non-critical

| Time elapsed | Action | Success condition | If not met |
|---|---|---|---|
| 0 min | Notify on-call channel #oncall | Tier 1 ack within 15 min | post follow-up at 30 min |
| +1 business day | Ticket assigned in tracker | Ticket has owner | Tier 2 lead reassigns |
| +3 business days | Resolution target | Issue closed | re-triage; consider promoting to P1 if user impact has grown |

P3 — cosmetic, transient, single-user, or question

| Time elapsed | Action | Success condition | If not met |
|---|---|---|---|
| 0 min | Ticket in backlog | Ticket created | n/a |
| Next backlog grooming | Triaged + prioritized | Owner + target sprint assigned | re-triage next grooming |
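
The ack chain in the P0 and P1 tables can be read as a lookup from "minutes without an ack" to the tier that should currently hold the page. The following is a minimal Python sketch of that reading, not a PagerDuty configuration. `ACK_CHAIN` and `paged_tier` are hypothetical names introduced for illustration, and only the tier-escalation rows are transcribed; status page, leadership, and vendor steps are left out.

```python
import bisect

# Minutes since the page was sent -> tier that should hold the page if
# nobody has acknowledged yet. Values are taken from the P0 and P1 tables.
ACK_CHAIN = {
    "P0": [(0, "Tier 1"), (5, "Tier 2"), (10, "Tier 3")],
    "P1": [(0, "Tier 1"), (15, "Tier 2"), (60, "Tier 3")],
}


def paged_tier(severity: str, minutes_without_ack: float) -> str:
    """Return the tier that should be paged after the given unacked time."""
    if minutes_without_ack < 0:
        raise ValueError("elapsed time cannot be negative")
    chain = ACK_CHAIN[severity]
    thresholds = [start for start, _ in chain]
    # Index of the last threshold that has already been reached.
    idx = bisect.bisect_right(thresholds, minutes_without_ack) - 1
    return chain[idx][1]
```

For example, `paged_tier("P0", 7)` returns `"Tier 2"`, because the 5-minute budget for a Tier 1 ack has expired but the 10-minute Tier 3 threshold has not been reached. P2 and P3 are omitted because they are not page-driven.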

2. Escalation paths — at a glance

              P0/P1                          P2                          P3
                │                            │                            │
                ▼                            ▼                            ▼
         Tier 1 — page                Tier 1 — channel              Backlog ticket
                │                            │                            │
       (no ack 5/15 min)             (no ack 30 min)                       │
                ▼                            ▼                            │
         Tier 2 — page              Tier 2 — channel                      │
                │                            │                            │
        (no ack 10/60 min)            (no progress 1 day)                 │
                ▼                            ▼                            │
   Tier 3 — ebit team page         Tier 3 — engaged for advisory only    │
                │                                                         │
    (no ack — phone call)                                                 │
                ▼                                                         │
   ebit-team lead direct                                                  │
                │                                                         │
              (P0)                                                        │
                ▼                                                         │
   Customer leadership +                                                  │
       vendor engagement                                                  │

Tier definitions: see support-model.md §1.


3. Contact info template

The customer team fills these in at handover. Until filled, every entry is {{TBD}} — the matrix is non-operational until at least one row in each tier is populated.

Tier 1 — Customer team first responder rotation

| Field | Value |
|---|---|
| PagerDuty schedule | {{TBD}} |
| Slack channel | {{TBD: e.g., #oncall}} |
| Primary contact (name, phone, email) | {{TBD}} |
| Backup contact (name, phone, email) | {{TBD}} |

Tier 2 — Customer team senior on-call

| Field | Value |
|---|---|
| PagerDuty schedule | {{TBD}} |
| Slack channel | {{TBD: same #oncall, with @senior-oncall group}} |
| Primary contact (name, phone, email) | {{TBD}} |
| Backup contact (name, phone, email) | {{TBD}} |

Tier 3 — ebit-team escalation

| Field | Value |
|---|---|
| PagerDuty schedule | {{TBD: ebit-team-side schedule name}} |
| Slack channel | {{TBD: shared bridge channel, e.g., #ebit-customer-bridge}} |
| Primary contact (name, phone, email) | {{TBD}} |
| Direct phone number for break-glass | {{TBD}} |
| Contractual SLA reference | {{TBD: link to signed SLA doc}} |

Customer-team leadership (P0 escalation only)

| Role | Contact |
|---|---|
| Engineering lead | {{TBD}} |
| Product lead | {{TBD}} |
| Comms / customer-comms owner | {{TBD}} |

Vendor contacts (engaged at +30 min on P0)

| Vendor | Service | Contact |
|---|---|---|
| {{TBD: RNG provider}} | Provably-fair seed source | {{TBD}} |
| {{TBD: captcha provider}} | Sign-up / sign-in captcha | {{TBD}} |
| {{TBD: payment processor}} | Deposits / withdrawals | {{TBD}} |
| {{TBD: cloud / hosting}} | Production infrastructure | {{TBD}} |
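
The rule at the top of this section (the matrix is non-operational until at least one row in each tier is populated) is easy to check mechanically. The sketch below assumes the tier tables have been loaded into a dict of field/value pairs and treats a "row" as one field whose value no longer contains a `{{TBD` placeholder. The function names and the sample values are hypothetical.

```python
def unfilled(value: str) -> bool:
    """True while a slot still holds a {{TBD}} or {{TBD: ...}} placeholder."""
    return "{{TBD" in value


def missing_tiers(matrix: dict[str, dict[str, str]]) -> list[str]:
    """Tiers in which every row is still a placeholder (or has no rows)."""
    return [
        tier
        for tier, rows in matrix.items()
        if all(unfilled(value) for value in rows.values())
    ]


def is_operational(matrix: dict[str, dict[str, str]]) -> bool:
    """The matrix is usable once no tier is entirely unfilled."""
    return not missing_tiers(matrix)


# Illustrative data only; real values come from the customer team.
matrix = {
    "Tier 1": {"PagerDuty schedule": "{{TBD}}", "Slack channel": "#oncall"},
    "Tier 2": {"PagerDuty schedule": "{{TBD}}", "Slack channel": "{{TBD}}"},
    "Tier 3": {"Direct phone number for break-glass": "placeholder-number"},
}
print(missing_tiers(matrix))   # ['Tier 2']
print(is_operational(matrix))  # False
```

A stricter policy (for example, requiring both a schedule and a primary contact per tier) would need a different predicate; the one here only mirrors the minimum stated above.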

4. Validation cadence

This matrix is tested on a fixed schedule. Failed tests block on-call rotation onboarding.

  • Monthly: end-to-end test ping of every channel (PagerDuty, Slack, video bridge, email). Each contact acknowledges receipt within their tier's SLA. Any miss is a P2 ticket.
  • Quarterly: full simulated P0 drill. Tier 1 starts; full escalation chain runs; each tier confirms response time vs. matrix.
  • Per onboarding (every new on-call rotation member): the new engineer validates the matrix as their final week-2 task (oncall-readiness.md §5). They are not added to the rotation until validation passes.
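
The monthly test reduces to comparing each contact's acknowledgement time against their tier's SLA and filing a P2 ticket for every miss. A minimal sketch of that comparison follows. The contact labels and SLA minutes are illustrative assumptions (the real SLAs are whatever each tier has agreed), and `sla_misses` is a hypothetical helper, not part of any existing tooling.

```python
from typing import Optional


def sla_misses(
    ack_minutes: dict[str, Optional[float]],
    sla_minutes: dict[str, float],
) -> list[str]:
    """Contacts that never acknowledged, or acknowledged after their SLA."""
    misses = []
    for contact, sla in sla_minutes.items():
        ack = ack_minutes.get(contact)  # None: no acknowledgement recorded
        if ack is None or ack > sla:
            misses.append(contact)
    return misses


# Illustrative values only.
sla = {"Tier 1 primary": 5, "Tier 2 primary": 5, "Tier 3 primary": 5}
acks = {"Tier 1 primary": 2.0, "Tier 2 primary": 9.5}
print(sla_misses(acks, sla))  # ['Tier 2 primary', 'Tier 3 primary']
```

Each name in the returned list would become one P2 ticket under the monthly rule.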

5. References