
Customer Communications Templates

Reusable copy that the customer team's support, comms, and on-call staff use during real incidents and scheduled work. The goal is to take the guesswork out of "what do we tell customers right now": pick the template, fill the variables, send.

Audience: customer-team Tier 1 / Tier 2 first responders, support, customer-comms owner. Templates are role-shaped, not channel-shaped — every template carries both a public version (status page / customer email / Twitter) and an internal version (Slack / Teams / PagerDuty incident comms).

Cross-links: ../oncall-runbook.md, ../support-model.md, ../escalation-matrix.md.


Decision tree — which template, when

The severity of the incident (set per ../oncall-runbook.md §1) drives both which template to use and which approval gate applies. Use this tree:

Is it an incident?
├── No  →  Is it scheduled work?
│           ├── Yes →  scheduled-maintenance.md (T-7d / T-24h / T-0 / done)
│           └── No  →  not a customer comm; nothing to send
└── Yes →  What's the severity?
            ├── P0 (full outage / data integrity / breach)
            │   1. acknowledgement (within 5 min)
            │   2. progress-update (every 30 min)
            │   3. resolved (immediate)
            ├── P1 (significant degraded service)
            │   1. acknowledgement (within 10 min)
            │   2. progress-update (every 60 min)
            │   3. resolved (within 1 hr of recovery)
            ├── P2 (partial issue)
            │   →  service-degradation.md (single notice; update only if scope grows)
            └── P3 (cosmetic / single-user)
                →  no public comm; ticket-tracker reply only

The four incident-flow templates (incident-acknowledgement.md, incident-progress-update.md, incident-resolved.md, service-degradation.md) carry the full template bodies. scheduled-maintenance.md carries the four maintenance-window stages.
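
The tree above can also be read as a lookup. The following is a minimal sketch, not existing tooling: `CommsPlan` and `pick_plan` are hypothetical names, and the template file names and timings are copied from the tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommsPlan:
    """Hypothetical container for one branch of the decision tree."""
    templates: tuple                        # template files, in send order
    ack_within_min: Optional[int] = None    # deadline for the acknowledgement
    update_every_min: Optional[int] = None  # progress-update cadence
    note: str = ""


# The three-step flow shared by P0 and P1.
INCIDENT_FLOW = (
    "incident-acknowledgement.md",
    "incident-progress-update.md",
    "incident-resolved.md",
)


def pick_plan(is_incident: bool,
              severity: Optional[str] = None,
              is_scheduled_work: bool = False) -> CommsPlan:
    """Walk the decision tree and return the matching plan."""
    if not is_incident:
        if is_scheduled_work:
            return CommsPlan(("scheduled-maintenance.md",),
                             note="stages: T-7d / T-24h / T-0 / done")
        return CommsPlan((), note="not a customer comm; nothing to send")
    if severity == "P0":
        return CommsPlan(INCIDENT_FLOW, 5, 30, "resolved: immediate")
    if severity == "P1":
        return CommsPlan(INCIDENT_FLOW, 10, 60,
                         "resolved: within 1 hr of recovery")
    if severity == "P2":
        return CommsPlan(("service-degradation.md",),
                         note="single notice; update only if scope grows")
    if severity == "P3":
        return CommsPlan((), note="no public comm; ticket-tracker reply only")
    raise ValueError(f"unknown severity: {severity!r}")
```

Raising on an unknown severity, rather than defaulting to a branch, keeps a typo from silently suppressing a customer message.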


Approval workflow

Approvals are designed for speed at high severity, safety at low — the more public the audience and the higher the impact, the more eyes on the message before it goes out.

| Severity | Public message (status page / email) | Internal message (Slack / PagerDuty) |
|---|---|---|
| P0 | IC drafts; on-call lead signs off before send. If lead unreachable: send the templated acknowledgement as-is, escalate sign-off to leadership for the next update. | Templated send OK; IC posts directly. |
| P1 | IC drafts; Tier 2 senior signs off before send. | Templated send OK. |
| P2 | Templated send OK (use service-degradation.md as-is). | Templated send OK. |
| P3 | No public comm. | Optional internal note. |
| Scheduled maintenance | Drafts go through customer-comms owner; T-7d notice requires leadership sign-off, T-24h / T-0 / done can use the templated send. | n/a: Slack reminder is internal only, no approval needed. |

Sign-off doesn't mean "wait for a meeting"; it means a single message in #oncall from the named role saying "approved, send." The expected turnaround is under 2 minutes for P0 and under 5 minutes for P1.
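
The public-message gate for incidents could be sketched as a small check. This is an illustration only: `may_send_public` and its parameters are hypothetical, the role strings are taken from the approval rules above, and the scheduled-maintenance path is left out.

```python
from typing import Optional

# Role whose "approved, send" is required before a public message goes out.
REQUIRED_SIGNOFF = {"P0": "on-call lead", "P1": "Tier 2 senior"}


def may_send_public(severity: str,
                    approved_by: Optional[str] = None,
                    lead_reachable: bool = True,
                    unmodified_template_ack: bool = False) -> bool:
    """Return True if a public message may go out under the approval rules."""
    if severity == "P3":
        return False  # no public comm at P3
    if severity == "P2":
        return True   # templated send OK
    required = REQUIRED_SIGNOFF.get(severity)
    if required is None:
        raise ValueError(f"unknown severity: {severity!r}")
    if approved_by == required:
        return True
    # P0 fallback: lead unreachable, and the message is the templated
    # acknowledgement sent as-is. Sign-off for the next update then
    # escalates to leadership (not modelled here).
    return (severity == "P0"
            and not lead_reachable
            and unmodified_template_ack)
```

For example, `may_send_public("P0", lead_reachable=False, unmodified_template_ack=True)` is allowed, while the same call with `"P1"` is not, because the unreachable-approver fallback is defined only for P0.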


Channels

Pick channels by audience. A single incident often hits multiple channels in parallel — the IC owns sequencing.

| Channel | Use for | Don't use for |
|---|---|---|
| Status page ({{TBD: customer-team-owned URL}}) | Every P0 / P1 / scheduled maintenance; both public-facing and the canonical timeline of record | P3, internal-only investigations |
| Customer email | P0 / P1 customers under SLA contract; scheduled-maintenance T-7d notice | Routine P2/P3 |
| Twitter / public social ({{TBD: customer-team-owned handle}}) | P0 only, mirrored from status-page wording | Anything below P0 |
| Slack #oncall | All severities; internal play-by-play | Customer comms |
| Customer Slack bridge ({{TBD: shared channel name}}) | P0 / P1 active triage with the customer's engineering counterparts | One-off questions (use email instead) |
| PagerDuty incident comms | P0 / P1; automatic timeline of every status-page update | Manual updates already in Slack |
| Video bridge ({{TBD: zoom/meet/teams URL}}) | P0 / P1 active triage | Anything routine |
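
As a rough sketch, the "Use for" column can be encoded as a map from channel to the event kinds it serves. The dictionary keys and the `channels_for` helper are hypothetical. The sketch follows the channel list literally, so P2 resolves to Slack only: the list does not name a public channel for the P2 notice, and this code does not guess one.

```python
# Event kinds each channel is used for, per the "Use for" column.
# "maintenance" stands for scheduled maintenance. Customer email applies to
# customers under SLA contract and, for maintenance, to the T-7d notice only.
CHANNEL_USES = {
    "status page": {"P0", "P1", "maintenance"},
    "customer email": {"P0", "P1", "maintenance"},
    "public social": {"P0"},
    "slack #oncall": {"P0", "P1", "P2", "P3"},
    "customer slack bridge": {"P0", "P1"},
    "pagerduty incident comms": {"P0", "P1"},
    "video bridge": {"P0", "P1"},
}


def channels_for(kind: str) -> list:
    """Return the channels (sorted by name) that list `kind` under 'Use for'."""
    return sorted(name for name, kinds in CHANNEL_USES.items() if kind in kinds)
```

Sequencing across the returned channels stays with the IC; the lookup only answers which channels are in scope.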

Variable conventions

Every template uses {NAMED_LIKE_THIS} placeholders for find/replace. The most common variables across templates:

  • {INCIDENT_ID} — internal ID (e.g., INC-2026-0042)
  • {TIME_DETECTED} — ISO 8601 UTC, e.g., 2026-04-25T14:32:00Z
  • {IMPACTED_SERVICES} — short list, customer language: dropbet sign-in, bet placement
  • {NEXT_UPDATE_BY} — ISO 8601 UTC, the "we'll update by" promise
  • {IC_NAME} — incident commander
  • {CUSTOMER_NAME} — operator / partner brand (kept generic in templates)
  • {ROOT_CAUSE_SHORT} — one customer-language sentence; used in resolved + RCA templates
  • {ETA_RANGE} — never a single time; always a range with a hard upper bound (e.g., "30–60 minutes")

Stick to these names — the customer team's eventual incident-comms automation will rely on them.
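
Until that automation exists, the find/replace step can be approximated in a few lines. The sketch below is an assumption-laden illustration: `fill` is a hypothetical helper, and the two format checks (timestamps in the `2026-04-25T14:32:00Z` shape, `{ETA_RANGE}` as "N–M minutes" or "N–M hours") are one possible reading of the conventions above, not an agreed schema.

```python
import re

# {NAMED_LIKE_THIS}: uppercase letters, digits, underscores.
PLACEHOLDER = re.compile(r"\{([A-Z][A-Z0-9_]*)\}")
# ISO 8601 UTC with a trailing Z, matching the example format above.
ISO_UTC = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")
# A range with an upper bound; accepts an en dash or a hyphen.
ETA_RANGE = re.compile(r"^\d+\s*[–-]\s*\d+ (minutes|hours)$")

TIMESTAMP_VARS = ("TIME_DETECTED", "NEXT_UPDATE_BY")


def fill(template: str, values: dict) -> str:
    """Replace every placeholder; fail loudly rather than send a raw {NAME}."""
    names = set(PLACEHOLDER.findall(template))
    missing = sorted(names - set(values))
    if missing:
        raise KeyError(f"missing variables: {missing}")
    for name in TIMESTAMP_VARS:
        if name in names and not ISO_UTC.match(values[name]):
            raise ValueError(f"{name} must be ISO 8601 UTC")
    if "ETA_RANGE" in names and not ETA_RANGE.match(values["ETA_RANGE"]):
        raise ValueError("ETA_RANGE must be a range, never a single time")
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)
```

Usage, with a made-up one-line template:

```python
text = fill(
    "{INCIDENT_ID}: next update by {NEXT_UPDATE_BY}, ETA {ETA_RANGE}.",
    {
        "INCIDENT_ID": "INC-2026-0042",
        "NEXT_UPDATE_BY": "2026-04-25T15:02:00Z",
        "ETA_RANGE": "30–60 minutes",
    },
)
```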


Tone

Factual, no marketing, no over-apologizing. The customer trusts honest measurement more than "we deeply regret any inconvenience."

  • Reference: Atlassian's incident communication principles and Google SRE workbook ch. 9 — incident response — both align with this tone.
  • Don't speculate on root cause before you're sure. "We are investigating" is fine; "We believe this is caused by …" is not, until §3 of the incident-runbook confirms it.
  • Don't promise a fix time. Use ranges ({ETA_RANGE}), and every time you give one, restate when the next update will arrive ({NEXT_UPDATE_BY}).
  • Do name what users can / cannot do right now. "Sign-in is unavailable; existing sessions are unaffected" is more useful than "service is degraded."

See also