A practical setup for tracing API latency, server actions, and upstream dependencies in Next.js applications using OpenTelemetry.
Why teams miss incidents
Without distributed traces, teams see symptoms (slow responses, elevated error rates) but not the causal path a request takes across API routes, databases, and third-party providers.
Minimal instrumentation setup
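```ts
import { NodeSDK } from "@opentelemetry/sdk-node"
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node"
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http"

const sdk = new NodeSDK({
  // With no explicit url, the exporter reads OTEL_EXPORTER_OTLP_ENDPOINT and appends /v1/traces.
  // A url passed in code is used verbatim, so a base endpoint there would miss the traces path.
  traceExporter: new OTLPTraceExporter(),
  // Auto-instruments common Node.js libraries such as http and popular database clients.
  instrumentations: [getNodeAutoInstrumentations()],
})

// Start the SDK before application code loads so instrumented modules can be patched.
sdk.start()
```
Alerting thresholds we recommend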
- p95 API latency over 1200ms for 5 minutes
- error rate above 1.5% on checkout and payment flows
- sustained saturation above 75% on critical database pools
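
The thresholds above can be sketched as a small evaluation function. This is a minimal illustration, not a replacement for an alerting backend: the `WindowStats` shape, the field names, and the idea of approximating "for 5 minutes" with five consecutive one-minute windows are all assumptions made for the example.

```ts
// Hypothetical summary of one evaluation window; all field names are assumptions.
interface WindowStats {
  latenciesMs: number[]; // API request latencies observed in the window
  requests: number;      // total requests on checkout and payment flows
  errors: number;        // failed requests on those flows
  poolInUse: number;     // busy connections in a critical database pool
  poolSize: number;      // configured capacity of that pool
}

// Threshold values taken from the list above.
const THRESHOLDS = { p95LatencyMs: 1200, errorRate: 0.015, poolSaturation: 0.75 };

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Returns the names of the thresholds breached in a single window.
function breachedThresholds(w: WindowStats): string[] {
  const breaches: string[] = [];
  if (percentile(w.latenciesMs, 95) > THRESHOLDS.p95LatencyMs) {
    breaches.push("p95-latency");
  }
  if (w.requests > 0 && w.errors / w.requests > THRESHOLDS.errorRate) {
    breaches.push("error-rate");
  }
  if (w.poolSize > 0 && w.poolInUse / w.poolSize > THRESHOLDS.poolSaturation) {
    breaches.push("db-pool-saturation");
  }
  return breaches;
}

// True only when the condition held in each of the last `windows` evaluations,
// which approximates a duration condition such as "for 5 minutes".
function sustained(history: boolean[], windows: number): boolean {
  return history.length >= windows && history.slice(-windows).every(Boolean);
}
```

Keeping the per-window check separate from the `sustained` check means a single slow minute does not page anyone, while a breach that persists across every recent window does.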
Incident triage flow
Diagram (Mermaid)
Final takeaway
Observability is most useful when tied to business-critical journeys, not vanity dashboards.