Architecture Decision Records — Index
Pointer page. Twelve existing ADRs live in ../adr/. Each is a self-contained doc following the Context · Decision · Alternatives · Consequences · References template (../adr/README.md). The table below is the engineering-track index, annotated with status and a one-line "what to take away".
Existing ADRs
| ADR | Title | Status | Take-away |
|---|---|---|---|
| 0001 | Pino (framework) + Winston/EvoLogger (app-code) coexistence | Accepted | Both loggers run; both are trace-correlated; pino lands in Loki via OTLP, EvoLogger lands via filelog. Don't migrate EvoLogger sites — see ADR-0007. |
| 0002 | Spanmetrics connector over Prisma native /metrics | Accepted | Use traces_spanmetrics_* derived metrics for RED on span-only instrumentations (Prisma, ioredis, BullMQ). Don't reference prisma_* or db_client_* directly in dashboards. |
| 0003 | BullMQ for production async; RabbitMQ kept but stubbed | Accepted | All production queues ride BullMQ on cache Redis. RabbitMQ runs in compose (vhost ft) but only the Fast Track module is wired and its producer returns disabled = true. |
| 0004 | @vercel/otel pinned to 1.x on Next.js | Accepted | 2.x removed propagateContextUrls which is essential for FE→API trace propagation. Don't bump without finding a replacement. |
| 0005 | OTel traceparent not propagated on Redis pub/sub RPC | Accepted | Three-hop @ExternalControllerClient flows (e.g., speed-roulette) surface as three uncorrelated Jaeger traces. Documented limitation, not a bug. |
| 0006 | Prisma schema split across 3 files | Accepted | api.prisma / blackjack.prisma / speed_roulette.prisma map to three Postgres schemas in one DB. multiSchema preview feature required. |
| 0007 | EvoLogger facade kept instead of migrating to pino | Accepted | ~40 call sites stay on EvoLogger.log/debug/error. Records still trace-tagged; no migration scheduled. |
| 0008 | fakeUserOnline inflates the online counter | Accepted | The "online users" number visible to players includes a deterministic synthetic uplift; documented openly here, callable as a feature flag. |
| 0009 | Jaeger v2 + Badger over Tempo / managed OpenSearch | Accepted | Single-container Jaeger v2.17 with Badger LSM on 50 GB gp3 EBS. GOMEMLIMIT=1500MiB, 72 h TTL. Ends the v1 in-memory OOM (19.2 GB observed). Tempo is the documented fallback if high-cardinality writes overwhelm Badger. |
| 0010 | Doppler over HashiCorp Vault and AWS SSM Parameter Store | Accepted | Workspace ebit-devops with three projects (ebit-api, ebit-fe, ebit-admin-fe) and per-project + per-config service tokens. Vault rejected on ops cost; SSM rejected on UX + scoping; Secrets Manager rejected on cost (~$56/mo at our secret count). |
| 0011 | NestJS monorepo with 5 apps + 11 libs (over polyrepo) | Accepted | Single git tree, 5 independently-deployed apps (api, rt, bj, bo, speed-roulette), 11 shared libs. Atomic refactors and a single Prisma schema win over per-app polyrepo overhead. The cost is the cross-app trace gap codified in ADR-0005. |
| 0012 | Tail-sampling policy: 100 % errors + 100 % slow + 10 % OK | Accepted | OTel collector keeps every error trace, every trace where any span exceeds 500 ms, and 10 % of the rest. Drops ~90 % of OK traces — Jaeger min-duration=0 search returns fewer hits than were served; spanmetrics-derived RED metrics remain authoritative for aggregate behaviour. |
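
The ADR-0012 take-away can be read as a three-step decision per completed trace. The sketch below is only an illustration of that rule as worded in the table, not the collector configuration itself (that lives in the Terraform template referenced by the ADR). The `SpanSummary` shape, the `Decision` labels, and the `decideTrace` function are hypothetical names introduced here, and checking errors before slowness is an assumption.

```typescript
// Hypothetical shapes for illustration; not the collector's real data model.
interface SpanSummary {
  durationMs: number;
  isError: boolean;
}

type Decision = "keep:error" | "keep:slow" | "keep:sampled" | "drop";

const SLOW_SPAN_MS = 500; // "any span exceeds 500 ms" from the ADR-0012 row
const OK_KEEP_RATIO = 0.1; // 10 % of the remaining (OK, fast) traces

// `roll` is a number in [0, 1). It is injectable so the decision can be
// exercised deterministically; by default it is drawn at random.
function decideTrace(
  spans: SpanSummary[],
  roll: number = Math.random(),
): Decision {
  // 1. Every trace containing an error span is kept.
  if (spans.some((s) => s.isError)) return "keep:error";
  // 2. Every trace where any single span is slower than the threshold is kept.
  if (spans.some((s) => s.durationMs > SLOW_SPAN_MS)) return "keep:slow";
  // 3. Everything else is kept with probability OK_KEEP_RATIO.
  return roll < OK_KEEP_RATIO ? "keep:sampled" : "drop";
}
```

Because step 3 drops roughly nine in ten healthy traces, counting traces in Jaeger under-reports traffic, which is why the row points to spanmetrics-derived RED metrics for aggregate behaviour.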
Status legend
Following ../adr/README.md:
- Accepted — current rule.
- Superseded by ADR-NNNN — old decision; the linked ADR replaces it.
- Deprecated — no longer applies; not necessarily replaced.
- Proposed — being discussed; not yet binding (we don't have any of these today).
Authored 2026-04-25 (was "Proposed new ADRs")
ADRs 0009–0012 were authored on 2026-04-25, and ADR-0003 was expanded from its sparse 36-line form to a full ~200-line treatment. The previous "Proposed" entries are now reflected in the index above.
Summary of what landed:
- ADR-0009: Jaeger v2 + Badger, anchored in ../audits/jaeger-storage-research.md. Records the v1 EOL constraint, the OOM forcing function, and the rejection of Tempo / managed OpenSearch / Keyspaces on cost. References Terraform config at terraform/modules/monitoring/jaeger-v2-config.yaml.tftpl.
- ADR-0010: Doppler over Vault / SSM / Secrets Manager / SOPS, codifying the workspace-ebit-devops decision recorded in memory reference_doppler_workspace. Vault rejected on ops cost (Raft, KMS auto-unseal); SSM rejected on per-token scoping + UX; Secrets Manager rejected on cost; SOPS rejected on workflow.
- ADR-0011: Five-app monorepo over polyrepo / single-app / Nx / pnpm-workspaces. Codifies the per-app deploy + per-app resource shape + atomic-refactor argument. Documents the trade-off (cross-app trace gap, history pollution) and the revisit triggers (rt bottleneck, second backend stack).
- ADR-0012: Tail-sampling 100 % errors + 100 % slow (>500 ms) + 10 % OK, anchored in terraform/modules/monitoring/user-data.sh.tftpl:144-167. Replaces the previous unwritten policy. Documents the storage budget math underlying the rejection of head-only and 100 % retention.
- ADR-0003 (expanded): BullMQ over RabbitMQ. Was 36 lines, now ~200 lines. Adds the full async-work inventory (10 BullMQ queues, 11 RabbitMQ stub call sites), considered alternatives D (migrate stub to BullMQ) and E (third broker), and revisit triggers tied to the Fast Track product decision.
How to add a new ADR
From ../adr/README.md:
- Use the next available number, e.g., ADR-0013.
- Create docs/adr/0013-short-title.md.
- Fill in all sections, especially Alternatives considered. The point of an ADR is not just recording what we chose, but why we rejected the alternatives.
- Set status to Accepted.
- If the new ADR supersedes an old one, edit the old one's status to Superseded by ADR-0013.
- Add a row to ../adr/README.md and to this page.
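
Picking the next available number by hand is easy to get wrong once several ADRs land on the same day. As a minimal sketch, assuming ADR files follow the NNNN-short-title.md naming shown above, the next number and file name could be derived from a directory listing. The helper names `nextAdrNumber` and `adrFileName` and the sample file names are hypothetical; no such script is known to exist in the repository.

```typescript
// Assumes ADR files are named NNNN-short-title.md; anything else
// (for example README.md) is ignored.
const ADR_FILE = /^(\d{4})-[a-z0-9-]+\.md$/;

// Returns the next free ADR number as a zero-padded four-digit string.
function nextAdrNumber(fileNames: string[]): string {
  const used = fileNames
    .map((name) => ADR_FILE.exec(name))
    .filter((m): m is RegExpExecArray => m !== null)
    .map((m) => Number(m[1]));
  const next = (used.length === 0 ? 0 : Math.max(...used)) + 1;
  // Left-pad to four digits without relying on newer string helpers.
  return ("0000" + next).slice(-4);
}

// Builds the file name for a new ADR from a human-readable short title.
function adrFileName(fileNames: string[], shortTitle: string): string {
  const slug = shortTitle
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse non-alphanumerics into hyphens
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  return `${nextAdrNumber(fileNames)}-${slug}.md`;
}

// Hypothetical listing of docs/adr/ for illustration only.
const listing = ["0011-monorepo.md", "0012-tail-sampling.md", "README.md"];
console.log(adrFileName(listing, "Short title"));
```

The remaining steps (filling in Alternatives considered, updating a superseded ADR's status, adding the index rows) stay manual, since they need judgment rather than bookkeeping.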
See also
- ../adr/: the ADR directory itself.
- stack.md: versions and rationale at a higher level than per-decision ADRs.
- architecture.md §Known structural debt: pointers from systemic issues back to the ADRs that document them.