Customer Communications Templates
Reusable copy that the customer team's support, comms, and on-call staff use during real incidents and scheduled work. The goal is to take the guesswork out of "what do we tell customers right now": pick the template, fill in the variables, send.
Audience: customer-team Tier 1 / Tier 2 first responders, support, customer-comms owner. Templates are role-shaped, not channel-shaped — every template carries both a public version (status page / customer email / Twitter) and an internal version (Slack / Teams / PagerDuty incident comms).
Cross-links: ../oncall-runbook.md, ../support-model.md, ../escalation-matrix.md.
Decision tree: which template, when
The severity of the incident (set per ../oncall-runbook.md §1) drives both which template to use and which approval gate applies. Use this tree:
Is it an incident?
├── No → Is it scheduled work?
│   ├── Yes → scheduled-maintenance.md (T-7d / T-24h / T-0 / done)
│   └── No → not a customer comm; nothing to send
└── Yes → What's the severity?
    ├── P0 (full outage / data integrity / breach)
    │     1. acknowledgement (within 5 min)
    │     2. progress-update (every 30 min)
    │     3. resolved (immediate)
    ├── P1 (significant degraded service)
    │     1. acknowledgement (within 10 min)
    │     2. progress-update (every 60 min)
    │     3. resolved (within 1 hr of recovery)
    ├── P2 (partial issue)
    │     → service-degradation.md (single notice; update only if scope grows)
    └── P3 (cosmetic / single-user)
          → no public comm; ticket-tracker reply only
The four incident-flow templates (incident-acknowledgement.md, incident-progress-update.md, incident-resolved.md, service-degradation.md) carry the full template bodies. scheduled-maintenance.md carries the four maintenance-window stages.
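The tree can also be written as a small lookup, which may help if the selection is ever scripted. This is a minimal sketch, not existing tooling: `CommsPlan`, `PLANS`, and `pick_plan` are hypothetical names, and the timings are copied from the tree above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CommsPlan:
    templates: Tuple[str, ...]        # template files, in send order
    ack_within_min: Optional[int]     # deadline for the acknowledgement
    update_every_min: Optional[int]   # progress-update cadence
    note: str = ""


# The three-step flow shared by P0 and P1.
INCIDENT_FLOW = (
    "incident-acknowledgement.md",
    "incident-progress-update.md",
    "incident-resolved.md",
)

# One entry per severity branch of the tree (hypothetical table name).
PLANS = {
    "P0": CommsPlan(INCIDENT_FLOW, 5, 30, "resolved: immediate"),
    "P1": CommsPlan(INCIDENT_FLOW, 10, 60, "resolved: within 1 hr of recovery"),
    "P2": CommsPlan(("service-degradation.md",), None, None,
                    "single notice; update only if scope grows"),
    "P3": CommsPlan((), None, None, "no public comm; ticket-tracker reply only"),
}
MAINTENANCE = CommsPlan(("scheduled-maintenance.md",), None, None,
                        "T-7d / T-24h / T-0 / done")


def pick_plan(is_incident: bool,
              severity: Optional[str] = None,
              is_scheduled_work: bool = False) -> Optional[CommsPlan]:
    """Walk the decision tree; None means there is nothing to send."""
    if is_incident:
        if severity not in PLANS:
            raise ValueError(f"unknown severity: {severity!r}")
        return PLANS[severity]
    if is_scheduled_work:
        return MAINTENANCE
    return None


plan = pick_plan(True, "P1")
print(plan.ack_within_min, plan.update_every_min)  # 10 60
```

Returning `None` for "not an incident, not scheduled work" keeps the "nothing to send" branch explicit instead of hiding it behind an empty plan.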
Approval workflow
Approvals are designed for speed at high severity, safety at low — the more public the audience and the higher the impact, the more eyes on the message before it goes out.
| Severity | Public message (status page / email) | Internal message (Slack / PagerDuty) |
|---|---|---|
| P0 | IC drafts; on-call lead signs off before send. If lead unreachable: send the templated acknowledgement as-is, escalate sign-off to leadership for the next update. | Templated send OK — IC posts directly. |
| P1 | IC drafts; Tier 2 senior signs off before send. | Templated send OK. |
| P2 | Templated send OK (use service-degradation.md as-is). | Templated send OK. |
| P3 | No public comm. | Optional internal note. |
| Scheduled maintenance | Drafts go through customer-comms owner; T-7d notice requires leadership sign-off, T-24h / T-0 / done can use the templated send. | n/a — Slack reminder is internal only, no approval needed. |
Sign-off doesn't mean "wait for a meeting" — it means a single message in #oncall from the named role saying "approved, send." The expected turnaround is < 2 minutes for P0, < 5 for P1.
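The public-message gate for incidents can be sketched as a single check. This is an illustration under assumptions, not existing tooling: `Gate`, `PUBLIC_GATES`, and `may_send_public` are hypothetical names, role strings are matched literally, and scheduled maintenance is left out because its gate depends on the stage rather than on severity.

```python
from typing import NamedTuple, Optional


class Gate(NamedTuple):
    approver: Optional[str]        # role that must say "approved, send"; None = templated send OK
    turnaround_min: Optional[int]  # expected sign-off turnaround is under this many minutes


# Public-message column of the approval table (hypothetical table name).
PUBLIC_GATES = {
    "P0": Gate("on-call lead", 2),
    "P1": Gate("Tier 2 senior", 5),
    "P2": Gate(None, None),
}


def may_send_public(severity: str,
                    approved_by: Optional[str] = None,
                    *,
                    lead_unreachable: bool = False,
                    is_first_ack: bool = False) -> bool:
    """Return True if the public message may go out now."""
    if severity == "P3":
        return False  # P3 never gets a public comm
    gate = PUBLIC_GATES[severity]
    if gate.approver is None:
        return True  # templated send, no sign-off needed
    if approved_by == gate.approver:
        return True
    # P0 fallback: the templated acknowledgement goes out as-is when the
    # lead is unreachable; later updates still need a sign-off.
    return severity == "P0" and lead_unreachable and is_first_ack
```

The fallback is limited to the first acknowledgement on purpose: the table escalates sign-off to leadership for the next update rather than waiving it.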
Channels
Pick channels by audience. A single incident often hits multiple channels in parallel — the IC owns sequencing.
| Channel | Use for | Don't use for |
|---|---|---|
| Status page ({{TBD: customer-team-owned URL}}) | Every P0 / P1 / scheduled maintenance — both public-facing and the canonical timeline of record | P3, internal-only investigations |
| Customer email | P0 / P1 customers under SLA contract; scheduled-maintenance T-7d notice | Routine P2/P3 |
| Twitter / public social ({{TBD: customer-team-owned handle}}) | P0 only, mirrored from status-page wording | Anything below P0 |
| Slack #oncall | All severities: internal play-by-play | Customer comms |
| Customer Slack bridge ({{TBD: shared channel name}}) | P0 / P1 active triage with the customer's engineering counterparts | One-off questions (use email instead) |
| PagerDuty incident comms | P0 / P1 — automatic timeline of every status-page update | Manual updates already in Slack |
| Video bridge ({{TBD: zoom/meet/teams URL}}) | P0 / P1 active triage | Anything routine |
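The "Use for" column reduces to a lookup from event kind to channels. The sketch below is hypothetical (`CHANNELS` and `channels_for` are illustrative names) and flattens two qualifiers from the table into comments: customer email applies only to customers under SLA contract, and for maintenance only to the T-7d notice.

```python
# Channel -> event kinds it is used for, in the same order as the table.
CHANNELS = {
    "status page": {"P0", "P1", "scheduled-maintenance"},
    # Email: P0 / P1 for customers under SLA contract; maintenance T-7d notice only.
    "customer email": {"P0", "P1", "scheduled-maintenance"},
    "public social": {"P0"},  # mirrored from status-page wording
    "Slack #oncall": {"P0", "P1", "P2", "P3"},
    "customer Slack bridge": {"P0", "P1"},
    "PagerDuty incident comms": {"P0", "P1"},
    "video bridge": {"P0", "P1"},
}


def channels_for(kind: str) -> list:
    """Return the channels for a severity or 'scheduled-maintenance', in table order."""
    return [name for name, kinds in CHANNELS.items() if kind in kinds]


print(channels_for("P3"))  # ['Slack #oncall']
```

The list only says which channels are in play; sequencing across them stays with the IC.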
Variable conventions
Every template uses {NAMED_LIKE_THIS} placeholders for find/replace. The most common variables across templates:
- {INCIDENT_ID}: internal ID (e.g., INC-2026-0042)
- {TIME_DETECTED}: ISO 8601 UTC, e.g., 2026-04-25T14:32:00Z
- {IMPACTED_SERVICES}: short list, customer language: dropbet sign-in, bet placement
- {NEXT_UPDATE_BY}: ISO 8601 UTC, the "we'll update by" promise
- {IC_NAME}: incident commander
- {CUSTOMER_NAME}: operator / partner brand (kept generic in templates)
- {ROOT_CAUSE_SHORT}: one customer-language sentence; used in resolved + RCA templates
- {ETA_RANGE}: never a single time; always a range with a hard upper bound (e.g., "30–60 minutes")
Stick to these names — the customer team's eventual incident-comms automation will rely on them.
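Until that automation exists, a small fill helper can catch the most common mistake: sending a message with a placeholder still in it. The following is a sketch under assumptions: `fill` and `iso_utc` are hypothetical helpers, and the one-line template is an invented example, not the body of any real template file.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches {NAMED_LIKE_THIS}; {{TBD: ...}} markers are deliberately not matched.
PLACEHOLDER = re.compile(r"\{([A-Z][A-Z0-9_]*)\}")


def fill(template: str, values: dict) -> str:
    """Substitute every placeholder; refuse to return a half-filled message."""
    missing = sorted({n for n in PLACEHOLDER.findall(template) if n not in values})
    if missing:
        raise KeyError(f"unfilled placeholders: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


def iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 UTC with a trailing Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


detected = datetime(2026, 4, 25, 14, 32, tzinfo=timezone.utc)
message = fill(
    # Invented example text, for illustration only.
    "[{INCIDENT_ID}] Investigating an issue affecting {IMPACTED_SERVICES}. "
    "Detected {TIME_DETECTED}; next update by {NEXT_UPDATE_BY}.",
    {
        "INCIDENT_ID": "INC-2026-0042",
        "IMPACTED_SERVICES": "sign-in, bet placement",
        "TIME_DETECTED": iso_utc(detected),
        # P0 cadence: next update 30 minutes after detection.
        "NEXT_UPDATE_BY": iso_utc(detected + timedelta(minutes=30)),
    },
)
print(message)
```

Raising on a missing value, rather than leaving the braces in place, means a forgotten variable fails before the message reaches a customer.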
Tone
Factual, no marketing, no over-apologizing. The customer trusts honest measurement more than "we deeply regret any inconvenience."
- Reference: Atlassian's incident communication principles and Google SRE workbook ch. 9 — incident response — both align with this tone.
- Don't speculate on root cause before you're sure. "We are investigating" is fine; "We believe this is caused by …" is not, until §3 of the incident-runbook confirms it.
- Don't promise a fix time. Use ranges ({ETA_RANGE}), and always with an explicit re-confirmation of the next-update window.
- Do name what users can / cannot do right now. "Sign-in is unavailable; existing sessions are unaffected" is more useful than "service is degraded."
See also
- ../oncall-runbook.md: severity classification, first-response checklist, common-pattern triage
- ../support-model.md: tiers, SLAs, channel matrix, hours of coverage
- ../escalation-matrix.md: severity × time → who's notified
- ../../incidents/0000-template.md: RCA template (referenced from incident-resolved.md)