Launch Checklist¶

Final pre-GA sweep. Run this top-to-bottom in the week before Phase 8 — GA. Tick every box, capture the verification output, file a customer-side ticket for any failure.

Conventions: each item carries a one-line verify with a concrete command, query, or file path. {{TBD}} indicates a value the customer must fill in before running.

Infrastructure¶

[ ] Terraform applied cleanly to production — verify: terraform -chdir=terraform/perf plan after apply prints No changes.
[ ] State stored in S3 backend with locking — verify: terraform/perf/versions.tf has backend "s3" with dynamodb_table set; per terraform/perf/README.md §state.
[ ] EBS volumes encrypted — verify: aws ec2 describe-volumes --filters Name=tag:Name,Values=ebit-* --query 'Volumes[].{Id:VolumeId,Encrypted:Encrypted}' --output table shows all Encrypted=true.
[ ] IAM least-privilege — verify: SUT instance profile carries only AmazonECRReadOnly-equivalent + SSM read; no *:*.
[ ] No 0.0.0.0/0 ingress on app SGs — verify: aws ec2 describe-security-groups --filters Name=tag:Name,Values=ebit-* --query 'SecurityGroups[].IpPermissions[?contains(IpRanges[].CidrIp, \0.0.0.0/0`)]'` returns empty.
[ ] Admin SSH CIDR locked to a customer-owned range — verify: terraform.tfvars sets admin_ssh_cidrs explicitly; not relying on data.http.my_ip.
[ ] Internal datastore ports not exposed externally — verify: Postgres 5432, Redis 6379/6380, RabbitMQ 5672 not in any public SG inbound rule.
[ ] Egress allows ECR + apt + OTel exporter only — verify: egress rules whitelist customer-required outbound; document any wildcard egress decision.
[ ] DNS records propagated globally — verify: dig +short {{TBD: dropbet.example.com}} resolves from at least 3 geographic checkers.
[ ] TLS certificate valid + auto-renewing — verify: ACM cert in ISSUED state; renewal automation tested or noted.
[ ] ALB / ingress in front of admin-fe (if applicable) — verify: not relying on network_mode: host for production admin-fe (see Risks #5).
[ ] Postgres pool sized for expected load — verify: max_connections ≥ 300 (or PgBouncer in transaction mode); per-app ?connection_limit=N set per Risks #15.
[ ] Postgres backups configured — verify: AWS RDS automated backup retention ≥ 7d, or equivalent for self-hosted Postgres.
[ ] Redis persistence policy decided — verify: cache Redis appendonly/save policy matches RPO requirement; bot Redis can be ephemeral.
[ ] OTel collector resource limits set — verify: observability/otel-collector.yml memory_limiter configured; container has CPU/memory limits in compose.

Secrets¶

[ ] Doppler prd config audited end-to-end — verify: run docs/audits/doppler-perf-audit.md playbook against prd; commit the output diff.
[ ] No dev_* configs referenced from production paths — verify: aws ec2 describe-instances ... user-data and docker-compose.yml reference prd only.
[ ] NODE_ENV=production in prd — verify: doppler secrets get NODE_ENV --project ebit-api --config prd --plain returns production.
[ ] DEFAULT_LOG_LEVEL=info in prd — verify: same pattern as above.
[ ] DEBUG_LOGS_PRETTY=false in prd — verify: same pattern.
[ ] DISABLE_RATE_LIMMITING=false in prd — verify: same pattern.
[ ] FASTTRACK_JWT_* populated (stub or real) — verify: env validator does not block boot (Risks #1).
[ ] No ebit-personal Doppler tokens in prd — verify: doppler service-tokens list shows only customer-owned tokens for prd configs.
[ ] All vendor production keys in place — verify per vendor:
Sumsub: SUMSUB_APP_TOKEN, SUMSUB_SECRET_KEY
Payment provider: CCPAYMENT_* and/or NOWPAYMENTS_*
Captcha: RECAPTCHA_* or GEETEST_* (production keys, not sandbox)
Email: SENDGRID_API_KEY or SMTP credentials
Sentry: SENTRY_DSN for each app
[ ] JWT signing keys rotated for production — verify: JWT_SECRET, JWT_VERIFICATION_TOKEN_SECRET are not the .example.env defaults.
[ ] Admin default password changed — verify: ADMIN_DEFAULT_PASSWORD ≠ admin; first admin login forced password change documented.
[ ] Webhook secret per provider — verify: payment / KYC webhook signature secret matches provider config and Doppler.
[ ] Secret rotation runbook drafted — verify: a runbook exists under docs/runbooks/ covering Doppler key rotation and post-rotation app reload.

Observability¶

[ ] Grafana alerts firing on synthetic breach — verify: temporarily inject a 500 on a non-critical route; alert reaches the on-call channel within configured threshold.
[ ] service-overview dashboard populated — verify: grafana_url shows non-zero traffic on every panel after a smoke bet.
[ ] bullmq dashboard populated — verify: bullmq_queue_jobs panels show non-zero gauges for at least one queue.
[ ] redis dashboard populated — verify: ioredis ops + latency panels split cache vs bot Redis.
[ ] prisma-postgres dashboard populated — verify: Prisma model/method heatmap + Postgres slow queries + pool saturation visible.
[ ] browser-rum dashboard populated — verify: Web Vitals p75/p95 from ebit-fe present (requires @vercel/otel browser export).
[ ] logs-trace-pivot derivedFields working — verify: query Loki, click "View trace" link, lands on the matching Jaeger trace.
[ ] Loki retention configured — verify: Loki config sets retention to match data retention policy (Risks #16).
[ ] Jaeger Badger TTL configured — verify: Jaeger storage TTL set to a finite value to bound disk growth.
[ ] Sentry DSNs live — verify: trigger a synthetic error in each app (ebit-api, ebit-fe, ebit-admin-fe); event lands in Sentry within 30 s.
[ ] Source-maps uploaded — verify: a synthetic JS error in ebit-fe shows minified-to-source mapping in Sentry.
[ ] Custom OTel attributes captured — verify: a recent trace contains user.id (where applicable) and bet.id for a bet-place flow.
[ ] Trace blind spots documented — verify: AF-2 trace propagation gap noted in customer wiki (Risks #4).

Performance¶

[ ] Perf test executed within last 7 days against prod-equivalent — verify: timestamped k6 output and Grafana snapshot in the perf report.
[ ] Perf report signed — verify: every {{TBD}} in performance-test-report.md replaced; signed by customer-sre.
[ ] Auto-abort thresholds verified — verify: synthetic 2x-budget breach was injected and k6 aborted as expected during a dry-run.
[ ] Dice-bet SLO decision live — verify: documented re-baseline (SLO=150 ms) OR optimization ticket merged (Risks #7).
[ ] Queue stability holds at peak load — verify: PromQL bullmq_queue_jobs{state="wait"} shows zero sustained positive slope during the 27-minute ramp.
[ ] Kernel tuning applied on load-gen — verify: sysctl snapshot matches the table in docs/performance-testing.md §9.

Security¶

[ ] Security review completed — verify: signed review document attached; one row per Critical/High in docs/security-register.md.
[ ] Zero open Critical or High findings without signed mitigation — verify: register state is fixed or accepted-with-mitigation.
[ ] Dependency audit clean — verify: cd ebit-api && npm audit --production and cd ebit-fe && pnpm audit --production and cd ebit-admin-fe && pnpm audit --production show zero high/critical vulnerabilities, or each is documented.
[ ] No exposed admin endpoints — verify: curl -i https://{{TBD: api-host}}/admin/user/all without auth returns 401, not 200.
[ ] Captcha enforced on sign-up — verify: with production captcha key, a sign-up request without x-captcha-token is rejected; with 'pass' is rejected (NODE_ENV=production blocks the bypass per Risks #8).
[ ] Rate limiting active — verify: 30 rapid sign-in attempts to one email triggers the user-lockout cooldown (user_lockout:<email> key in cache Redis).
[ ] CORS configured — verify: cross-origin from a non-customer-owned domain is rejected on /auth/*.
[ ] CSP headers set — verify: curl -I https://{{TBD: dropbet-host}} shows content-security-policy with no unsafe-inline on scripts.
[ ] HSTS active — verify: response carries strict-transport-security ≥ 6 months.
[ ] Sensitive log fields redacted — verify: pino formatter strips password, accessToken, refreshToken; spot-check a sign-in failure log line.
[ ] 2FA enforced for admin SuperAdmin — verify: SF-029 mitigation in place — seeded admin@admin.com no longer bypasses MFA, or this account has been deleted from prd.

Operational¶

[ ] All runbooks reviewed by on-call team — verify: each file in docs/runbooks/ has been read and dry-run by the on-call team this calendar quarter.
[ ] On-call rotation set — verify: PagerDuty (or equivalent) rotation document; coverage matches the chosen SLO (24/7 vs business-hours).
[ ] Escalation matrix tested — verify: a test page reached L1 → L2 → L3 within the documented threshold; after-action attached.
[ ] Incident drill executed solo by customer team — verify: drill report exists; Evospin observer-only.
[ ] RCA template ready — verify: customer wiki has an RCA template; first incident slot reserved.
[ ] Backup restore drill executed — verify: a Postgres backup was restored to a scratch instance and a sanity query ran successfully within the last 30 days.
[ ] Disaster recovery plan documented — verify: customer ops doc covers Postgres + Redis + ECR loss scenarios; recovery time objectives stated.
[ ] Cost monitoring active — verify: AWS budget alerts on EC2, ECR, RDS, data egress; thresholds match expected spend.
[ ] Container resource limits set — verify: every service in docker-compose.yml (or production orchestrator config) has cpus and mem_limit set.

Compliance¶

[ ] Terms of Service live + linked from footer — verify: curl -s https://{{TBD: dropbet-host}}/terms returns 200 with the customer's legal text.
[ ] Privacy Policy live + linked from footer — verify: same pattern.
[ ] KYC flow tested in production credentials — verify: a test user submits production Sumsub flow; level transitions correctly.
[ ] AML policy attached + signed — verify: customer compliance has signed AML policy referenced in the production wiki.
[ ] Self-exclusion / responsible-gaming surface live — verify: a logged-in user can self-exclude via the user settings flow; subsequent sign-in attempts are blocked.
[ ] Data retention policy implemented — verify: Loki retention matches policy; PII tables have documented retention; Sentry retention set.
[ ] GDPR / CCPA data subject access workflow — verify: SOP exists for handling subject access requests; admin endpoint to export user data tested.
[ ] Gaming license active for target jurisdiction — verify: license document on file; geo-IP ban list in country/ module reflects unauthorised jurisdictions (Risks #18).
[ ] Legal review of in-product copy — verify: customer legal has signed off on bonus terms, withdrawal terms, dispute resolution copy.

Customer-side¶

[ ] DNS pointed to production — verify: as Infrastructure §DNS above; resolves to production load balancer or SUT.
[ ] Brand assets in place — verify: logo, favicon, palette match the brand pack; OG image set for social sharing.
[ ] Support email configured — verify: support@{{TBD: customer-domain}} reaches the customer support inbox; auto-reply on.
[ ] In-product help points to customer's help center — verify: footer "Help" link goes to customer-owned URL, not Evospin defaults.
[ ] Onboarding email sequence configured — verify: post-sign-up email cadence (welcome, KYC reminder, first-deposit) tested end-to-end.
[ ] Marketing analytics wired — verify: customer's GTM / Segment / equivalent receives events from dropbet (registration, deposit, first bet).
[ ] Status page live — verify: customer-owned status page exists at https://status.{{TBD: customer-domain}}; default state is operational.
[ ] Pricing / fees schedule public — verify: deposit / withdraw fees and limits page lives at the public URL agreed in Discovery.
[ ] Public communication plan staged — verify: customer marketing has draft press release + customer email + in-app banner ready for GA day.

Final go / no-go¶

The customer PM, customer SRE, customer compliance, and Evospin delivery lead each sign here:

Role	Name	Signed	Date
Customer PM	{{TBD}}
Customer SRE	{{TBD}}
Customer compliance	{{TBD}}
Evospin delivery	{{TBD}}

If any item above is unchecked at sign-time, it goes in the post-launch backlog with an explicit owner and date. Items in the Critical category (zero open Critical security findings, license active, captcha enforced, error rate < 0.1% in pilot) are non-negotiable — failing one of those rolls Phase 8 back to Phase 7.