Add a Grafana Dashboard

Canonical example: observability/grafana/provisioning/dashboards/prisma-postgres.json.

1. Create the dashboard JSON

Add a new file under observability/grafana/provisioning/dashboards/. The provisioning config at dashboards.yml:12-13 auto-loads all JSON files from that directory every 30 seconds.

Start from this skeleton (mirrors prisma-postgres.json:1-9):

{
  "uid": "ebit-your-dashboard",
  "title": "ebit · Your Dashboard",
  "tags": ["ebit", "spanmetrics", "auto-provisioned"],
  "schemaVersion": 39,
  "editable": true,
  "refresh": "15s",
  "time": { "from": "now-1h", "to": "now" },
  "timezone": "browser",
  "templating": {
    "list": [
      {
        "name": "datasource",
        "type": "datasource",
        "query": "prometheus",
        "current": { "text": "Prometheus", "value": "prometheus" }
      },
      {
        "name": "service",
        "label": "Service",
        "type": "query",
        "datasource": { "type": "prometheus", "uid": "${datasource}" },
        "query": "label_values(calls_total, service_name)",
        "refresh": 2,
        "includeAll": true,
        "multi": true,
        "current": { "text": "All", "value": "$__all" }
      }
    ]
  },
  "panels": []
}

Conventions from the existing dashboards:

  • uid — prefix with ebit- for namespacing (e.g., ebit-prisma-postgres).
  • tags — always include ebit and auto-provisioned.
  • ${datasource} — the first template variable selects the Prometheus instance.
  • ${service} — populated from label_values(..., service_name).
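The conventions above are easy to check mechanically before committing. The following is a minimal sketch, not part of the repository: the function name check_conventions and the idea of running it as a pre-commit check are assumptions made for illustration.

```python
import json

# Tags every dashboard is expected to carry, per the conventions above.
REQUIRED_TAGS = {"ebit", "auto-provisioned"}


def check_conventions(dashboard: dict) -> list[str]:
    """Return a list of convention violations (empty when the dashboard passes)."""
    problems = []

    uid = dashboard.get("uid", "")
    if not uid.startswith("ebit-"):
        problems.append(f"uid {uid!r} should start with 'ebit-'")

    missing = REQUIRED_TAGS - set(dashboard.get("tags", []))
    if missing:
        problems.append(f"missing tags: {sorted(missing)}")

    # The first template variable should be the datasource selector.
    variables = dashboard.get("templating", {}).get("list", [])
    names = [v.get("name") for v in variables]
    if not names or names[0] != "datasource":
        problems.append("first template variable should be 'datasource'")
    if "service" not in names:
        problems.append("missing 'service' template variable")

    return problems


# A trimmed copy of the skeleton from step 1.
skeleton = json.loads("""
{
  "uid": "ebit-your-dashboard",
  "tags": ["ebit", "spanmetrics", "auto-provisioned"],
  "templating": {"list": [{"name": "datasource"}, {"name": "service"}]},
  "panels": []
}
""")
print(check_conventions(skeleton))  # prints []
```

To check a real file, load it with json.load and pass the result to check_conventions.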

2. Datasource UIDs

Three datasources are provisioned in datasources/datasources.yml:1-35. Reference them by UID in panel targets:

Name        UID         Type        URL
Prometheus  prometheus  prometheus  http://prometheus:9090
Loki        loki        loki        http://loki:3100
Jaeger      jaeger      jaeger      http://jaeger:16686

Always use the template variable ${datasource} for Prometheus panels so users can switch instances if needed.
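A panel that hardcodes the Prometheus UID still renders, so the mistake is easy to miss in review. The sketch below lists such panels. The helper name hardcoded_prometheus_panels is hypothetical, and it only inspects top-level panels (panels nested inside rows are not visited).

```python
def hardcoded_prometheus_panels(dashboard: dict) -> list[str]:
    """Return titles of Prometheus panels whose datasource uid is not ${datasource}."""
    offenders = []
    for panel in dashboard.get("panels", []):
        ds = panel.get("datasource")
        if not isinstance(ds, dict):
            continue  # no datasource set, or a legacy string reference
        if ds.get("type") == "prometheus" and ds.get("uid") != "${datasource}":
            offenders.append(panel.get("title", "<untitled>"))
    return offenders


example = {
    "panels": [
        {"title": "Requests / sec",
         "datasource": {"type": "prometheus", "uid": "${datasource}"}},
        {"title": "Errors",
         "datasource": {"type": "prometheus", "uid": "prometheus"}},
        {"title": "Logs",
         "datasource": {"type": "loki", "uid": "loki"}},
    ]
}
print(hardcoded_prometheus_panels(example))  # prints ['Errors']
```

Loki and Jaeger panels are left alone because the table above gives them fixed UIDs.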

3. Write PromQL for spanmetrics

All metrics come from the spanmetrics connector in the OTel Collector (observability/otel-collector.yml:74-83). It converts trace spans into Prometheus metrics.

Available metrics (note: no prefix, not traces_spanmetrics_*):

Metric                        Type       Description
calls_total                   counter    Total span invocations
duration_milliseconds_bucket  histogram  Latency distribution
duration_milliseconds_sum     histogram  Cumulative latency
duration_milliseconds_count   histogram  Same as calls_total

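Available labels: service_name, span_name, status_code, span_kind, plus dimensions declared in the collector config: db.system, db.operation, db.sql.table, prisma.model, prisma.method. Prometheus label names cannot contain dots by default, so expect these dimensions to appear with underscores (db_system, db_operation, db_sql_table, prisma_model, prisma_method), the same way service.name becomes service_name. Confirm the exact names in Prometheus before filtering on them.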

Common PromQL patterns

Rate (ops/sec) — from prisma-postgres.json:52:

sum(rate(calls_total{service_name=~"$service", span_name="prisma:client:operation"}[1m]))

p95 latency — from prisma-postgres.json:66-68:

histogram_quantile(
  0.95,
  sum by (le) (rate(duration_milliseconds_bucket{service_name=~"$service", span_name=~"prisma:.*"}[5m]))
)

Error rate — from prisma-postgres.json:96-97:

sum(rate(calls_total{service_name=~"$service", span_name=~"prisma:.*", status_code="STATUS_CODE_ERROR"}[1m]))

Common span_name patterns: GET /path / POST /path (HTTP), prisma:client:operation (DB), redis-GET / redis-EVALSHA (cache + BullMQ).
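To make the p95 pattern less of a black box, here is a simplified sketch of the estimate that histogram_quantile produces from cumulative buckets: find the bucket containing the target rank, then interpolate linearly inside it. It is an illustration of the idea, not Prometheus's implementation, and the bucket boundaries and counts are made-up numbers rather than values from the collector config.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (le, count) buckets that end in +Inf."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket holds the total count
    if total == 0:
        return math.nan

    rank = q * total
    prev_le, prev_count = 0.0, 0.0
    for le, count in buckets:
        if count >= rank:
            if math.isinf(le):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return prev_le
            # Linear interpolation between the bucket's lower and upper bound.
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count
    return prev_le


# Hypothetical cumulative counts for le = 10, 50, 100, +Inf (milliseconds).
sample = [(10.0, 50.0), (50.0, 90.0), (100.0, 98.0), (math.inf, 100.0)]
print(histogram_quantile(0.95, sample))  # prints 81.25
```

The 95th observation out of 100 lands in the 50 to 100 ms bucket, which holds observations 91 through 98, so the estimate is 50 + 50 × 5/8 = 81.25 ms. This is why the sum by (le) in the query matters: without the le label there are no buckets to interpolate between.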

4. Add panels

Each panel is an object in the panels array. The example below pairs a stat panel with a timeseries panel. Place both objects, separated by the comma shown, inside the array:

{
  "id": 1,
  "type": "stat",
  "title": "Requests / sec",
  "datasource": { "type": "prometheus", "uid": "${datasource}" },
  "gridPos": { "x": 0, "y": 0, "w": 6, "h": 4 },
  "targets": [
    {
      "expr": "sum(rate(calls_total{service_name=~\"$service\", span_name=~\"GET.*|POST.*\"}[1m]))",
      "legendFormat": "req/s"
    }
  ],
  "fieldConfig": { "defaults": { "unit": "ops", "decimals": 2 } }
},
{
  "id": 2,
  "type": "timeseries",
  "title": "p95 Latency",
  "datasource": { "type": "prometheus", "uid": "${datasource}" },
  "gridPos": { "x": 6, "y": 0, "w": 18, "h": 8 },
  "targets": [
    {
      "expr": "histogram_quantile(0.95, sum by (le) (rate(duration_milliseconds_bucket{service_name=~\"$service\", span_name=~\"GET.*|POST.*\"}[5m])))",
      "legendFormat": "p95"
    }
  ],
  "fieldConfig": { "defaults": { "unit": "ms" } }
}
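Hand-escaping the quotes inside expr is the most common source of invalid dashboard JSON. One way around it, sketched here under the assumption that you are comfortable generating panels from a script, is to build the panel as a Python dict and let json.dumps do the escaping. The helper name timeseries_panel and the p99 panel are illustrative only.

```python
import json


def timeseries_panel(panel_id: int, title: str, expr: str,
                     legend: str, grid_pos: dict, unit: str = "ms") -> dict:
    """Build a timeseries panel dict shaped like the example above."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "datasource": {"type": "prometheus", "uid": "${datasource}"},
        "gridPos": grid_pos,
        "targets": [{"expr": expr, "legendFormat": legend}],
        "fieldConfig": {"defaults": {"unit": unit}},
    }


# PromQL written with plain quotes; json.dumps escapes them on output.
expr = (
    'histogram_quantile(0.99, sum by (le) (rate('
    'duration_milliseconds_bucket{service_name=~"$service", '
    'span_name=~"GET.*|POST.*"}[5m])))'
)
panel = timeseries_panel(3, "p99 Latency", expr, "p99",
                         {"x": 0, "y": 8, "w": 24, "h": 8})
print(json.dumps(panel, indent=2))
```

The printed object can be pasted into the panels array as-is.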

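Use gridPos for layout: Grafana uses a 24-column grid. x and y set the panel's top-left corner (column and row), w is the width, and h is the height, all in grid units. The two panels above sit side by side because 6 + 18 fills the 24 columns.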
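Layout mistakes such as overlapping panels or a panel running past column 24 only show up once the dashboard loads. The following sketch checks both conditions up front. The function name layout_problems is hypothetical, and the overlap test is a plain rectangle intersection on the gridPos values.

```python
GRID_COLUMNS = 24  # width of Grafana's dashboard grid


def layout_problems(panels: list[dict]) -> list[str]:
    """Report panels that leave the 24-column grid or overlap each other."""
    problems = []
    for p in panels:
        g = p["gridPos"]
        if g["x"] < 0 or g["x"] + g["w"] > GRID_COLUMNS:
            problems.append(f"panel {p['id']} exceeds the {GRID_COLUMNS}-column grid")

    # Compare every pair of panels once.
    for i, a in enumerate(panels):
        for b in panels[i + 1:]:
            ga, gb = a["gridPos"], b["gridPos"]
            overlap_x = ga["x"] < gb["x"] + gb["w"] and gb["x"] < ga["x"] + ga["w"]
            overlap_y = ga["y"] < gb["y"] + gb["h"] and gb["y"] < ga["y"] + ga["h"]
            if overlap_x and overlap_y:
                problems.append(f"panels {a['id']} and {b['id']} overlap")
    return problems


# gridPos values taken from the stat + timeseries example.
panels = [
    {"id": 1, "gridPos": {"x": 0, "y": 0, "w": 6, "h": 4}},
    {"id": 2, "gridPos": {"x": 6, "y": 0, "w": 18, "h": 8}},
]
print(layout_problems(panels))  # prints []
```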

5. Commit and reload

# Grafana auto-reloads dashboards every 30 seconds (dashboards.yml:10)
# Or restart to force immediate load:
docker compose restart grafana

The provisioning config at dashboards.yml:11 has allowUiUpdates: true, so you can also edit in the Grafana UI first, export the JSON, then commit it.
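If you take the edit-in-UI route, the exported JSON may carry fields that are specific to the Grafana instance it came from. The sketch below assumes you want to drop the top-level id and version fields before committing, which matches the skeleton in step 1 (it has neither). That choice and the helper name normalize_export are assumptions, not something the provisioning config requires.

```python
import json


def normalize_export(raw: str) -> str:
    """Return exported dashboard JSON without instance-specific top-level fields."""
    dashboard = json.loads(raw)
    # Assumption: top-level "id" and "version" should not be committed.
    # Panel-level "id" values are nested deeper and are left untouched.
    dashboard.pop("id", None)
    dashboard.pop("version", None)
    # Stable two-space indentation keeps diffs readable.
    return json.dumps(dashboard, indent=2, ensure_ascii=False) + "\n"


# A made-up, heavily trimmed export for illustration.
exported = (
    '{"id": 42, "uid": "ebit-your-dashboard", "version": 7, '
    '"title": "ebit · Your Dashboard", "panels": [{"id": 1}]}'
)
print(normalize_export(exported))
```

Write the result to a file under observability/grafana/provisioning/dashboards/ and commit that file.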

Existing dashboards for reference

File                   Focus
service-overview.json  RED metrics across all HTTP spans
prisma-postgres.json   Prisma query rate, latency, errors, top-N
bullmq.json            Queue depth, processing rate, failed jobs
redis.json             Redis command rate, latency
browser-rum.json       Frontend Web Vitals (LCP, FID, CLS)
logs-trace-pivot.json  Loki → Jaeger trace correlation

You're done. To test the new dashboard:

# Open http://localhost:3003 (Grafana, login: admin/grafana)
# Navigate to Dashboards → find "ebit · Your Dashboard"
# Verify panels render data from the running ebit-api
# If panels show "No data", check:
#   1. ebit-api is running with OTEL_EXPORTER_OTLP_ENDPOINT set
#   2. OTel collector is running (docker compose logs otel-collector)
#   3. Prometheus is scraping the collector (http://localhost:9090/targets)
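When a panel shows "No data", it helps to confirm whether the spanmetrics series exist in Prometheus at all before debugging the dashboard itself. This sketch uses Prometheus's standard instant-query endpoint (/api/v1/query). The helper names are hypothetical, and the base URL is assumed from the localhost:9090 address in the checklist.

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url: str, expr: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": expr})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def series_count(payload: dict) -> int:
    """Count the series in a decoded /api/v1/query response."""
    if payload.get("status") != "success":
        return 0
    return len(payload.get("data", {}).get("result", []))


def count_series(base_url: str, expr: str, timeout: float = 5.0) -> int:
    """Run expr against Prometheus and return how many series came back."""
    with urllib.request.urlopen(query_url(base_url, expr), timeout=timeout) as resp:
        return series_count(json.load(resp))


# Shape of a successful instant-query response, with made-up values.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "calls_total", "service_name": "ebit-api"},
             "value": [1700000000, "12"]},
        ],
    },
}
print(series_count(sample_response))  # prints 1
```

With the stack running, count_series("http://localhost:9090", "calls_total") returning 0 points at the collector or the scrape config rather than at the dashboard JSON.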