Support Model
The tiered service model the customer team operates after handover, plus the channel matrix and business-hours coverage map.
Status: this document captures the structure agreed at handover. Names, contact details, and contractual SLA wording are filled in by the customer team during week 2 of oncall-readiness.md. Every {{TBD}} below is a slot the customer team owns.
1. Service tiers
Three tiers. Tier 1 and Tier 2 are the customer team. Tier 3 is the Evospin team. The escalation flows up; the answer flows back down.
Tier 1: Customer team first responder
| Field | Value |
|---|---|
| Who | Customer-team operations engineer on rotation. {{TBD: rotation owner / PagerDuty schedule name}} |
| Scope | All player-facing symptoms, all admin-panel symptoms, all observability gaps. Acknowledge, triage, mitigate where a runbook exists. |
| Response SLA | < 30 min (business hours), < 2 hr (after hours) to acknowledge. Mitigation by end of shift for P2/P3. |
| Authority | Restart services, drain queues, run any documented runbook step, update the status page, post to customer comms channels. Cannot modify production code. |
| Escalation path | If runbook doesn't apply or mitigation requires code change → Tier 2. If P0/P1 and Tier 2 doesn't ack within 5 min → Tier 3 (Evospin team). |
| Knowledge prerequisites | Completed ../onboarding/curriculum.md checklist; signed off on shadow shift. |
Tier 2: Customer team senior on-call
| Field | Value |
|---|---|
| Who | Customer-team senior engineer on rotation. {{TBD: rotation owner / PagerDuty schedule name}} |
| Scope | Take over IC from Tier 1 when escalated. Author and ship hotfixes. Make architecture-affecting calls during incidents. |
| Response SLA | < 1 hr always (business hours and after hours). |
| Authority | Everything Tier 1 can do, plus: deploy hotfixes, modify infrastructure, change feature flags, take destructive remediation actions (truncate stuck queue, force-restart prod DB, etc.). Cannot modify ebit-team-owned shared infrastructure. |
| Escalation path | If incident requires ebit-team-owned subsystem (RNG provider integration, fairness seed mechanism, third-party broker) → Tier 3. |
| Knowledge prerequisites | Completed oncall-readiness.md checklist; signed off on reverse-shadow; first solo on-call shift complete. |
Tier 3: ebit-team escalation
| Field | Value |
|---|---|
| Who | ebit-team engineer on escalation rotation. {{TBD: ebit-team contact / PagerDuty schedule name}} |
| Scope | Subsystems the Evospin team retains ownership of post-handover: RNG / provably-fair seeds, the bj/speed-roulette state machines, the OTel collector pipeline, any vendor integration the customer team isn't credentialed for. |
| Response SLA | Per SLA contract — {{TBD: contractual SLA — typically <2 hr P0, <8 hr P1; customer-team to fill in from signed agreement}}. |
| Authority | Everything Tier 1 + 2 can do, plus: change ebit-team-owned shared services, rotate ebit-team-held secrets, engage vendor support on behalf of the customer team. |
| Escalation path | None — Tier 3 is the top. If Tier 3 cannot resolve, the incident becomes a vendor / business-level issue handled outside this runbook. |
| Knowledge prerequisites | ebit-team engineer with full repo access and production credentials. |
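
The escalation paths in the three tier tables can be read as a small decision function. The following is a minimal sketch, not an implementation used by either team: the `Incident` fields and the `next_tier` name are hypothetical, and the only values taken from the tables are the 5-minute Tier 2 acknowledgement window and the conditions that trigger each hop.

```python
from dataclasses import dataclass
from typing import Optional

# From the Tier 1 escalation path: P0/P1 goes to Tier 3 if Tier 2 has not acked in 5 min.
TIER2_ACK_TIMEOUT_MIN = 5


@dataclass
class Incident:
    # All field names are illustrative assumptions, not part of any real tooling.
    severity: str                                        # "P0", "P1", "P2", or "P3"
    current_tier: int                                    # 1, 2, or 3
    runbook_applies: bool = True
    needs_code_change: bool = False
    needs_tier3_subsystem: bool = False                  # e.g. RNG provider integration
    minutes_since_tier2_paged: Optional[float] = None    # None means Tier 2 was not paged
    tier2_acked: bool = False


def next_tier(incident: Incident) -> Optional[int]:
    """Return the tier that should be engaged next, or None to stay put."""
    if incident.current_tier == 1:
        tier2_silent = (
            incident.minutes_since_tier2_paged is not None
            and not incident.tier2_acked
            and incident.minutes_since_tier2_paged >= TIER2_ACK_TIMEOUT_MIN
        )
        # P0/P1 with an unacknowledged Tier 2 page skips straight to Tier 3.
        if incident.severity in ("P0", "P1") and tier2_silent:
            return 3
        # No applicable runbook, or the fix needs a code change: hand to Tier 2.
        if not incident.runbook_applies or incident.needs_code_change:
            return 2
        return None
    if incident.current_tier == 2:
        # Tier 2 escalates only for subsystems it does not own.
        return 3 if incident.needs_tier3_subsystem else None
    # Tier 3 is the top; anything further is handled outside this runbook.
    return None
```

For example, `next_tier(Incident("P1", 1, runbook_applies=False, minutes_since_tier2_paged=6))` returns `3`, while the same incident at P2 returns `2`, because the Tier 3 bypass applies only to P0/P1.
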
2. Support channel matrix
Pick the right channel for the urgency. Wrong channel wastes minutes.
| Channel | Use for | Don't use for | SLA |
|---|---|---|---|
| PagerDuty (page) | P0, P1 | P2, P3, questions | Tier 1 ack < 30 min business / < 2 hr after-hours |
| Slack #oncall (mention) | P2, ack confirmation, IC handoff | P0/P1 (page first, then post) | Ack < 1 hr |
| Slack #ebit-support (post) | P3, questions, "is this normal?" | Anything user-facing-broken | Best effort, business hours |
| Email support@{{TBD: customer-team}}.com | Account issues, billing, contractual | Anything time-sensitive | Next business day |
| Video bridge ({{TBD: zoom/meet/teams URL}}) | P0/P1 active triage, RCA retro | Anything routine | Spun up by IC at acknowledge for P0/P1 |
| Status page (customer-facing) | P0/P1 only — updates at acknowledge, at identified, at resolved | P2 and below — these are not user-visible | IC posts within 10 min of P0/P1 acknowledge |
The IC of an active incident pins the channel and bridge URL in the incident-channel topic. Anyone joining mid-incident reads the topic first.
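
The matrix above can also be expressed as a lookup from severity to an ordered list of channels. This is only a sketch under stated assumptions: `CHANNEL_PLAN` and `channels_for` are hypothetical names, the channel labels are plain strings rather than integrations with PagerDuty or Slack, and the ordering for P0/P1 follows the "page first, then post" note in the table.

```python
# Ordered channel plan per severity, transcribed from the channel matrix.
# Names and string labels are illustrative; nothing here calls a real API.
CHANNEL_PLAN = {
    "P0": ["PagerDuty page", "Slack #oncall post", "Video bridge", "Status page"],
    "P1": ["PagerDuty page", "Slack #oncall post", "Video bridge", "Status page"],
    "P2": ["Slack #oncall mention"],
    "P3": ["Slack #ebit-support post"],
}


def channels_for(severity: str) -> list:
    """Return the channels to use, in order, for the given severity."""
    try:
        # Copy so callers cannot mutate the shared plan.
        return list(CHANNEL_PLAN[severity])
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}") from None
```

Email is left out deliberately: the matrix reserves it for account, billing, and contractual matters, which are not severity-driven.
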
3. Business-hours coverage map
Who is on call, and when. The customer team owns this rotation post-handover; the table below is the structure to fill in.
| Time block | Tier 1 (first responder) | Tier 2 (senior) | Tier 3 (Evospin team) |
|---|---|---|---|
| Mon–Fri 09:00–18:00 ({{TBD: customer time zone}}) | Primary on-call | Available within 1 hr | Available within contractual SLA |
| Mon–Fri 18:00–09:00 | Primary on-call (after-hours response SLA applies) | Available within 1 hr | Available within contractual SLA |
| Sat–Sun, all hours | Primary on-call (after-hours response SLA applies) | Available within 1 hr | Available within contractual SLA |
| Public holidays | Primary on-call ({{TBD: holiday calendar}}) | Available within 1 hr | Reduced — see contractual SLA |
Rotation cadence: {{TBD: typically weekly handoff, customer team to confirm}}. Override / handoff happens via PagerDuty schedule edits.
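
As an illustration of how the coverage map combines with the Tier 1 response SLA, the sketch below computes an acknowledgement deadline from the time a page fires. It assumes the timestamp is already expressed in the customer time zone (still a {{TBD}} above) and does not model public holidays, since the holiday calendar is not yet defined. The function names are hypothetical.

```python
from datetime import datetime, time, timedelta

# Business hours from the coverage map: Mon-Fri 09:00-18:00, customer local time.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)


def is_business_hours(local_dt: datetime) -> bool:
    """True for Mon-Fri between 09:00 (inclusive) and 18:00 (exclusive)."""
    is_weekday = local_dt.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and BUSINESS_START <= local_dt.time() < BUSINESS_END


def tier1_ack_deadline(paged_at: datetime) -> datetime:
    """Latest acceptable Tier 1 acknowledgement time for a page at paged_at."""
    # Tier 1 response SLA: 30 min in business hours, 2 hr after hours.
    if is_business_hours(paged_at):
        return paged_at + timedelta(minutes=30)
    return paged_at + timedelta(hours=2)
```

A page at 10:00 on a Wednesday gets a 10:30 deadline; the same page on a Saturday, or at 18:00 on a weekday, gets the two-hour after-hours window.
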
4. Knowledge prerequisites: links
The tier definitions above reference onboarding completion. The actual checklists:
- Tier 1 readiness: ../onboarding/curriculum.md end-of-week-1 checklist.
- Tier 2 readiness: oncall-readiness.md end-of-week-2 (on-call go/no-go) checklist.
- Tier 3: ebit-team-internal; not covered in the customer handover kit.
A team member cannot take a tier without the corresponding checklist signed off. No exceptions during the first 90 days post-handover.
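
The sign-off gate could be checked mechanically before someone is added to a rotation. The sketch below is an assumption-laden illustration: the sign-off keys are hypothetical labels for the prerequisites listed in the Tier 1 and Tier 2 tables, and Tier 3 is excluded because its checklist is not part of the customer handover kit.

```python
# Hypothetical sign-off labels, one per prerequisite named in the tier tables.
REQUIRED_SIGNOFFS = {
    1: {"curriculum_checklist", "shadow_shift"},
    2: {"oncall_readiness_checklist", "reverse_shadow", "first_solo_shift"},
}


def missing_signoffs(tier: int, completed: set) -> set:
    """Return the sign-offs still outstanding for the given tier."""
    if tier not in REQUIRED_SIGNOFFS:
        raise ValueError(f"tier {tier} is not gated by the customer handover kit")
    return REQUIRED_SIGNOFFS[tier] - set(completed)


def can_take_tier(tier: int, completed: set) -> bool:
    """True only when every required sign-off for the tier is recorded."""
    return not missing_signoffs(tier, completed)
```

Returning the missing set, rather than only a boolean, lets a rotation owner see exactly which checklist item blocks the assignment.
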
5. Drift management
The support model isn't static. Three review cadences keep it accurate:
| Cadence | What gets reviewed | Owner |
|---|---|---|
| Weekly (during the first month post-handover) | Incident counts per tier; whether tier definitions match reality | Tier 2 lead |
| Monthly (after first month) | SLA achievement vs. agreement; channel matrix accuracy; rotation health | Customer-team operations lead |
| Quarterly | Tier 3 contractual SLA review with Evospin team | Customer-team + ebit-team joint review |
Any change to this document outside those cadences requires sign-off from both the Tier 2 lead and the ebit-team escalation lead. Record the change in the doc's change log (when added: {{TBD: a "Changes" section at the bottom of this file once the first edit lands post-handover}}).
6. References
- escalation-matrix.md: severity × time elapsed → who's notified
- oncall-runbook.md: first-response procedure used by every tier
- ../onboarding/curriculum.md: Tier 1 readiness gate
- oncall-readiness.md: Tier 2 readiness gate
- README.md: handover-kit entrypoint