Runbook: My trace isn't in Jaeger
Symptom
You triggered a request (sign-in, bet, page load), but the trace doesn't appear in the Jaeger UI at http://localhost:16686. Either the service is missing from the service dropdown, or the specific trace_id cannot be found.
Likely causes
- OTel SDK not initialized (missing pre-otel.main.ts import)
- OTEL_EXPORTER_OTLP_ENDPOINT not set or pointing at wrong host
- OTel Collector unhealthy or not running
- Browser traces blocked by CORS (ebit-fe-browser service)
- Cross-service trace broken by Redis pub/sub transport (no traceparent propagation)
- Sentry gate conflict — Sentry's own trace SDK runs alongside OTel
Diagnosis
1. Check the NestJS app is exporting traces
# Verify env vars are set
sudo docker exec ebit-api env | grep OTEL
# Expected:
# OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
# OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
# OTEL_SERVICE_NAME=ebit-api
# Check pre-otel.main.ts is imported FIRST in main.ts
grep -n "pre-otel\|pre-sentry" ebit-api/apps/api/src/main.ts
# pre-otel MUST come before pre-sentry (Sentry wraps OTel if loaded second)
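The import-order rule above can also be checked mechanically. The following is a minimal sketch, assuming single-line `import` statements in `main.ts`; the function name `check_bootstrap_order` is hypothetical, and multi-line imports are not handled.

```python
import re

# Matches single-line ES imports, with or without a "from" clause, e.g.
#   import './pre-otel.main';
#   import { NestFactory } from '@nestjs/core';
IMPORT_RE = re.compile(r"""^\s*import\s+(?:.+?\s+from\s+)?["']([^"']+)["']""")


def check_bootstrap_order(main_ts: str) -> list[str]:
    """Return a list of problems with the pre-otel / pre-sentry import order."""
    imports = [
        m.group(1)
        for line in main_ts.splitlines()
        if (m := IMPORT_RE.match(line))
    ]
    # Index of the first import whose path mentions each bootstrap module.
    otel = next((i for i, p in enumerate(imports) if "pre-otel" in p), None)
    sentry = next((i for i, p in enumerate(imports) if "pre-sentry" in p), None)

    problems = []
    if otel is None:
        problems.append("pre-otel is not imported")
    elif otel != 0:
        problems.append("pre-otel is not the first import")
    if otel is not None and sentry is not None and sentry < otel:
        problems.append("pre-sentry is imported before pre-otel")
    return problems
```

Pass it the contents of `ebit-api/apps/api/src/main.ts`; an empty list means the order looks right.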
2. Check the OTel Collector is healthy
curl -sf http://localhost:13133/ && echo "healthy" || echo "UNHEALTHY"
# Check collector logs for errors
sudo docker logs --tail 30 ebit-otel-collector 2>&1 | grep -i error
# Verify Jaeger is receiving from the collector
sudo docker logs --tail 10 ebit-jaeger 2>&1 | grep -i "span"
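If the health check needs to run from a script rather than a shell, the `curl -sf` probe above can be approximated with the standard library. This is a sketch only; the function name `collector_is_healthy` and the 2-second timeout are assumptions, not part of the existing tooling.

```python
import urllib.error
import urllib.request


def collector_is_healthy(url: str = "http://localhost:13133/", timeout: float = 2.0) -> bool:
    """Return True when the health endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a 4xx/5xx response
        # (urlopen raises HTTPError, a URLError subclass, for those).
        return False
```

Like `curl -sf`, this treats an HTTP error status the same as an unreachable collector.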
3. Check browser traces (ebit-fe-browser)
# Verify CORS allows browser origin
grep -A10 "cors:" observability/otel-collector.yml
# Must include http://localhost:3000 in allowed_origins
# Open browser DevTools → Network tab, filter for "v1/traces"
# Look for POST to http://localhost:4318/v1/traces
# Check response: 200 = OK, 0/CORS = blocked
# Verify browser OTel initialized
# Console should show: "[otel-client] Browser OTel initialized"
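A common cause of a blocked request is an `allowed_origins` entry that almost matches the browser origin (wrong port, trailing slash). The sketch below illustrates the comparison, assuming that `*` in an entry matches any run of characters; the collector's real matching is done by its own HTTP server code, so treat this as an approximation. The function name `origin_allowed` is hypothetical.

```python
import re


def origin_allowed(origin: str, allowed_origins: list[str]) -> bool:
    """Return True if `origin` matches an entry; `*` stands for any characters."""
    for pattern in allowed_origins:
        # Escape everything literally, then turn each `*` into `.*`.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, origin):
            return True
    return False
```

For example, `origin_allowed("http://localhost:3000", ["http://localhost:3000/"])` is false under this sketch: the browser sends the origin without a trailing slash, so the configured entry should omit it too.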
4. Check cross-service trace propagation
# Redis pub/sub RPC (@ExternalControllerClient) does NOT propagate traceparent.
# If your trace starts in speed-roulette and calls ebit-api via wallet RPC,
# the callee creates a new trace root. This is a known gap.
#
# Workaround: search Jaeger by operation name or time range, not by trace_id.
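For context on what is lost: propagation means carrying a W3C `traceparent` value from caller to callee so both sides share one trace_id. The sketch below shows the header format and how a message envelope could carry it. It is an illustration under assumptions, not the project's transport: the `_traceparent` key and the `inject`/`extract` helpers are hypothetical, and wiring this into `@ExternalControllerClient` would still require a custom transport.

```python
import re

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header: str):
    """Parse a traceparent value into a dict, or return None if it is invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are invalid per the spec
    return {
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": bool(int(flags, 16) & 1),  # lowest flag bit = sampled
    }


def inject(payload: dict, traceparent: str) -> dict:
    """Caller side: copy the payload and attach the header (hypothetical key)."""
    return {**payload, "_traceparent": traceparent}


def extract(message: dict):
    """Callee side: recover the parent context, or None to start a new root."""
    header = message.get("_traceparent")
    return parse_traceparent(header) if isinstance(header, str) else None
```

When `extract` returns None, the callee has nothing to attach to and starts a new trace root, which is the behaviour described above.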
5. Verify the trace landed in Jaeger
# Search by service name
curl -s "http://localhost:16686/api/traces?service=ebit-api&limit=5" | python3 -c "
import json,sys
data = json.load(sys.stdin)['data']
for t in data[:3]:
    print(t['traceID'], len(t['spans']), 'spans')
"
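To look for one specific trace_id instead of eyeballing the list, the same response can be filtered in a short script. This is a sketch that assumes the response shape used above (a `data` list whose items carry `traceID` and `spans`); the helper names are hypothetical.

```python
def summarize_traces(payload: dict) -> list[tuple[str, int]]:
    """Return (trace ID, span count) pairs from a Jaeger /api/traces response."""
    return [(t["traceID"], len(t["spans"])) for t in payload.get("data") or []]


def has_trace(payload: dict, trace_id: str) -> bool:
    """Return True if the response contains the given trace ID (case-insensitive)."""
    wanted = trace_id.lower()
    return any(tid.lower() == wanted for tid, _ in summarize_traces(payload))
```

Feed it the parsed JSON from `http://localhost:16686/api/traces?service=ebit-api&limit=5`; raise `limit` if the trace may be older than the five most recent.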
Fix
| Cause | Fix |
|---|---|
| OTEL env vars missing | Add to docker-compose.yml service environment block |
| pre-otel.main.ts not imported | Add import '../../../libs/shared/src/basic/pre/pre-otel.main'; as first line in the app's main.ts. For bj/bo, use NODE_OPTIONS: "--require @opentelemetry/auto-instrumentations-node/register" |
| Collector unhealthy | sudo docker compose up -d --force-recreate otel-collector |
| Browser CORS blocked | Add origin to observability/otel-collector.yml cors.allowed_origins, restart collector |
| Redis RPC gap | Known limitation — see docs/architecture.md. No fix available without a custom NestJS transport |
Prevention
- All new NestJS apps must import pre-otel.main.ts before any other import in main.ts
- Run curl -s localhost:16686/api/services | jq after adding a new service to verify it appears in Jaeger
- Browser RUM verification: open any page → DevTools Network → filter v1/traces → confirm 200 response
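The service check can be turned into a pass/fail step, for example in a smoke test. A minimal sketch, assuming the `/api/services` response holds the service names in a `data` list; the function name `missing_services` is hypothetical.

```python
def missing_services(payload: dict, expected: list[str]) -> list[str]:
    """Return the expected service names absent from a Jaeger /api/services response."""
    present = set(payload.get("data") or [])
    return [name for name in expected if name not in present]
```

Calling it with the parsed response and `["ebit-api", "ebit-fe-browser"]` returns an empty list once both services have reported at least one span.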