An in-depth guide to monitoring Next.js apps with OpenTelemetry

This guide goes into the fundamentals, practical applications and tips & tricks of using OpenTelemetry (OTel) to monitor your Next.js application.

OpenTelemetry is gaining a lot of momentum outside of its historical niche of distributed, microservices-based application stacks. As it turns out, you can use it just as well for more traditional, three-tiered web applications, and it comes with a host of benefits.

A brief intro to OpenTelemetry

There are many great resources explaining what OTel is, how it came to be and what its purposes are. Here is a quick recap:

OpenTelemetry is an open-source observability (o11y) framework for cloud-native software. OTel gives you a collection of tools, APIs, and SDKs to instrument, generate, collect, and export telemetry data — metrics, logs, and traces — for analysis in order to understand your apps’ and infrastructure’s performance and behavior.

Basic OpenTelemetry architecture

Three key features of OpenTelemetry are:

  1. Language-agnostic implementation. Note that not all SDKs support all of the telemetry types yet.
  2. Support for distributed tracing, e.g. from your browser, to your API backend to databases, queues etc.
  3. Automatic instrumentation for many popular frameworks and libraries, Next.js being one of them.

Installing the OpenTelemetry SDK in your app

The most foolproof way of installing OTel in a Next app is using the fairly recent @vercel/otel wrapper package which has some ✨ magic ✨ in dealing with the following:

  1. It recognises whether your Next app is running in a Node.js environment or an Edge environment. This matters because Node.js standard modules like stream are not available in the Edge environment. It is pretty specific to Vercel, so you might not need it when deploying your app to a standard Node server.
  2. It replaces the standard OTel exporter package with a custom rewrite that is compatible with Vercel’s Edge runtime and is also smaller in size.

Note that all code examples here use Next.js 14.x. OTel instrumentation is available in Next.js version 13.4 and higher.

1. Enable the instrumentation hook

First, you need to enable the Next.js-specific OTel instrumentation by setting the experimental instrumentationHook property to true in your next.config.js|ts file.

next.config.js
/** @type {import('next').NextConfig} */
const nextConfig = { 
  experimental: { 
    instrumentationHook: true 
  }
}

module.exports = nextConfig

2a. Install the @vercel/otel package

Now install the relevant package.

npm install --save @vercel/otel @opentelemetry/api 

And create a file in the root of your project called instrumentation.ts|js.

instrumentation.ts
import { registerOTel } from '@vercel/otel'

export function register() {
  registerOTel('my-next-app')
}

The example above is the simplest way to get started. We will expand on it a little later to add some more customization.

What’s instrumented, and what’s not?

This configuration auto-instruments all the basic HTTP handlers for page routes and API routes and emits traces with Next.js and/or Vercel-specific tags. For instance, you might find a trace with the following properties:

  • next.route: /
  • next.span_name: render route (app) /
  • next.span_type: AppRender.getBodyResult
  • operation.name: next_js.AppRender.getBodyResult
  • vercel.runtime: nodejs

The following is NOT instrumented:

  1. The official JS/TS OTel SDK does not support logging yet as of the time of writing.
  2. This configuration only instruments the server-side. No traces or metrics are recorded for any browser-side interactions.
  3. The configuration only records traces by default. You can add metrics, though, with a custom setup.

Local workflow

OK, instrumentation code done. This is where you normally hit npm run dev and see if things work locally before pushing commits. I recommend setting the OTEL_LOG_LEVEL=debug environment variable at first so you can make sure everything is wired up correctly.

OTEL_LOG_LEVEL=debug npm run dev

> nextjs@0.1.0 dev
> next dev

Next.js 14.0.4
   - Local: http://localhost:3000
   - Experiments (use at your own risk):
     · instrumentationHook

Compiled /instrumentation in 214ms (40 modules)
@opentelemetry/api: Registered a global for diag v1.9.0.
@vercel/otel: Configure propagator: tracecontext
@vercel/otel: Configure propagator: baggage
@vercel/otel: Configure sampler:  parentbased_always_on
@vercel/otel: Configure trace exporter:  http/protobuf http://localhost:4318/v1/traces headers: <none>
@vercel/otel/otlp: onInit
@opentelemetry/api: Registered a global for trace v1.9.0.
@opentelemetry/api: Registered a global for context v1.9.0.
@opentelemetry/api: Registered a global for propagation v1.9.0.
@vercel/otel: Configure instrumentations: fetch undefined
@vercel/otel: started my-next-app nodejs

Open your app on http://localhost:3000 and click around. You will notice two things:

  1. Your console should get filled up with OTel debug messages.
  2. You’ll see a line indicating that traces are exported to http://localhost:4318. This means you can conveniently test your OTel instrumentation locally. All you have to do is spin up a local Docker container with some OTel tools so you can debug. Vercel provides the useful repo https://github.com/vercel/opentelemetry-collector-dev-setup for this. Just clone it and run it:

git clone https://github.com/vercel/opentelemetry-collector-dev-setup
cd opentelemetry-collector-dev-setup
docker-compose up

This should spin up some containers. We are mostly interested in the otel-collector and jaeger-all-in-one containers. Now open the local Jaeger web UI on http://localhost:16686/ and you should find your Next app and some collected traces.
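
If no traces show up, it can help to check the collector on its own, independent of the Next app. The sketch below builds a single-span payload in the OTLP/HTTP JSON encoding and posts it to the endpoint from the debug output above. The function names and the service name collector-smoke-test are made up for this example, and it assumes the collector accepts JSON on http://localhost:4318/v1/traces.

```typescript
// Sketch of an OTLP/HTTP JSON smoke test. buildTestSpanPayload and
// sendTestSpan are hypothetical helpers, not part of any OTel package.

function randomHex(length: number): string {
  let out = ''
  while (out.length < length) {
    out += Math.floor(Math.random() * 16).toString(16)
  }
  // Trace and span IDs must not be all zeros, so force a non-zero last digit.
  return out.slice(0, length - 1) + '1'
}

function buildTestSpanPayload(serviceName: string, spanName: string, nowMs: number) {
  return {
    resourceSpans: [{
      resource: {
        attributes: [{ key: 'service.name', value: { stringValue: serviceName } }],
      },
      scopeSpans: [{
        scope: { name: 'manual-smoke-test' },
        spans: [{
          traceId: randomHex(32), // 16 bytes, hex encoded
          spanId: randomHex(16), // 8 bytes, hex encoded
          name: spanName,
          kind: 1, // SPAN_KIND_INTERNAL
          // Nanoseconds since the epoch as strings: milliseconds plus six zeros.
          startTimeUnixNano: `${nowMs - 50}000000`,
          endTimeUnixNano: `${nowMs}000000`,
        }],
      }],
    }],
  }
}

// Posts one span to the collector and resolves with the HTTP status code.
async function sendTestSpan(endpoint = 'http://localhost:4318/v1/traces'): Promise<number> {
  const payload = buildTestSpanPayload('collector-smoke-test', 'hello-collector', Date.now())
  const res = await fetch(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  })
  return res.status
}
```

Calling sendTestSpan() while the containers are running and getting back a 2xx status suggests the collector is reachable. The span should then appear in Jaeger under the collector-smoke-test service.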

Adding more (auto) instrumentation

Your app is probably connecting to other backend services like Postgres, MySQL or Redis, or using popular libraries like Prisma or the AWS SDK. We can instrument these packages and generate traces for them using the @opentelemetry/auto-instrumentations-node package.

However, this is where we need to reconfigure our Next.js app a bit, as these instrumentations are not compatible with the Vercel Edge runtime. If you are not using Vercel to host your app, you can blissfully ignore this extra config.

First we install the packages:

npm install --save @opentelemetry/auto-instrumentations-node \
@opentelemetry/api \
@opentelemetry/exporter-trace-otlp-http \
@opentelemetry/sdk-node \
@opentelemetry/sdk-trace-node \
@opentelemetry/resources \
@opentelemetry/semantic-conventions

We then create a dedicated instrumentation.node.ts file and configure it as follows:

instrumentation.node.ts
import { NodeSDK } from '@opentelemetry/sdk-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
import { Resource } from '@opentelemetry/resources'
import { SEMRESATTRS_SERVICE_NAME } from '@opentelemetry/semantic-conventions'
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-node'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'


const sdk = new NodeSDK({
  resource: new Resource({
    [SEMRESATTRS_SERVICE_NAME]: 'my-next-app',
  }),
  instrumentations: [getNodeAutoInstrumentations()],
  spanProcessors: [new BatchSpanProcessor(new OTLPTraceExporter())],
})
sdk.start()

And we modify the instrumentation.ts file we already have to import this file only if we are running on the nodejs runtime.

instrumentation.ts
export async function register() {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    await import('./instrumentation.node')
  }
}

If you restart your Next app now and visit it on your localhost, you will see A LOT of extra spans and traces. Probably far too many to make sense of. This is because we are now instrumenting all file system reads, network socket creations and a bunch of other things. You might want to keep them, or remove them for now. Keep reading 👇

Removing noise from your instrumentation

I recommend adding the following exclusions to the auto-instrumentation configuration. It’s pretty simple. Just pass an object with the following structure to the getNodeAutoInstrumentations() function and set enabled to false for each instrumentation you want to switch off.

instrumentations: [getNodeAutoInstrumentations({
  '@opentelemetry/instrumentation-fs': {
    enabled: false,
  },
  '@opentelemetry/instrumentation-net': {
    enabled: false,
  },
  '@opentelemetry/instrumentation-dns': {
    enabled: false,
  },
  '@opentelemetry/instrumentation-http': {
    enabled: true,
  },
})],
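
If the list of exclusions grows, a small helper can build the same object from short names. This is only a sketch: disableInstrumentations is a hypothetical function, not part of any OTel package, and it assumes every package follows the @opentelemetry/instrumentation-<name> naming used above.

```typescript
// Hypothetical helper that builds the exclusion object from short names.
type InstrumentationToggles = Record<string, { enabled: boolean }>

function disableInstrumentations(names: string[]): InstrumentationToggles {
  const config: InstrumentationToggles = {}
  for (const name of names) {
    // Assumes the @opentelemetry/instrumentation-<name> naming convention.
    config[`@opentelemetry/instrumentation-${name}`] = { enabled: false }
  }
  return config
}

// Produces the same three disabled entries as the hand-written config.
const noisyInstrumentations = disableInstrumentations(['fs', 'net', 'dns'])
```

You would then pass noisyInstrumentations to getNodeAutoInstrumentations(). Depending on your TypeScript settings, a cast to the config type that function expects may be needed.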

Custom instrumentation

Automatic Instrumentation is super useful, but you might want to track some metric, function call or 3rd party API call explicitly. So let’s create a custom trace and test it locally.

Have a look at the basic page.tsx example below. It should be reminiscent of a typical async fetching of some JSON data from some backend.

page.tsx
import { trace } from '@opentelemetry/api'
export const dynamic = 'force-dynamic'

async function fetchData() {

  let res: any
  await trace
    .getTracer('nextjs-server')
    .startActiveSpan('fetchJsonPlaceholder', async (span) => {
      try {
        res = await fetch("https://jsonplaceholder.typicode.com/posts");
      } finally {
        span.addEvent('fetchJsonPlaceholder was called', {
          provider: 'jsonplaceholder',
          someKey: 'someValue',
        })
        span.end()
      }
    })


  if (!res.ok) {
    throw new Error("Failed to fetch data")
  }

  return res.json()
}

export default async function SimpleSSRComponent() {
  const data = await fetchData()
  return (
    <div>
      <ul>
        {data.map((item: any) => (
          <li style={{ marginBottom: "20px" }} key={item.id}>
            {" "}
            <b>
              {item.id}. {item.title}
            </b>
            <p>{item.body}</p>
          </li>
        ))}
      </ul>
    </div>
  )
}

There are a couple of things going on:

  1. We are fetching a list of placeholder text from a 3rd party service in an async function called fetchData().
  2. We want to explicitly trace this call to this 3rd party service. We do this by importing the global trace object and starting and ending a custom span.
  3. The startActiveSpan() function wraps the actual fetching of the 3rd party JSON.
  4. We are also adding an Event to the span. Events are standard “markers” on OpenTelemetry traces to indicate some event happening during the recording of a span.
  5. We are rendering the resulting JSON to a simple list.

Running the above code should result in a trace and span similar to this. Note that events are called "logs" in Jaeger 🤷

Jaeger showing a trace, span and its events
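
If you trace more than one call this way, the try/finally pattern can be pulled into a reusable wrapper that also marks the span as failed when the wrapped call throws. The following is a minimal sketch, not an official API: withSpan, SpanLike and TracerLike are names made up for this example. The two interfaces mirror only the parts of the @opentelemetry/api Span and Tracer types that the helper uses, so the snippet stands on its own.

```typescript
// Minimal structural types so the sketch is self-contained. They mirror only
// the methods of the @opentelemetry/api Span and Tracer used below.
interface SpanLike {
  recordException(error: Error): void
  setStatus(status: { code: number; message?: string }): void
  end(): void
}

interface TracerLike {
  startActiveSpan<T>(name: string, fn: (span: SpanLike) => T): T
}

// Numeric value of SpanStatusCode.ERROR in @opentelemetry/api.
const SPAN_STATUS_ERROR = 2

// Hypothetical helper: runs `work` inside a span, records failures on the
// span, rethrows them, and always ends the span.
async function withSpan<T>(
  tracer: TracerLike,
  name: string,
  work: (span: SpanLike) => Promise<T>,
): Promise<T> {
  return tracer.startActiveSpan(name, async (span) => {
    try {
      return await work(span)
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err))
      span.recordException(error)
      span.setStatus({ code: SPAN_STATUS_ERROR, message: error.message })
      throw err
    } finally {
      span.end()
    }
  })
}
```

In fetchData() this would look roughly like withSpan(trace.getTracer('nextjs-server'), 'fetchJsonPlaceholder', () => fetch(url)). In a real project you would probably replace the two interfaces with the Span and Tracer types exported by @opentelemetry/api.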

Deploying to production

We’ve been working locally up till now, and one of the great benefits of OTel is that you actually CAN fully configure, instrument and debug it completely on your own machine.

Getting this into production typically requires at least these two steps.

1. Keeping cost down by head sampling on prod

Sending and storing OTel data (or any o11y data really) can get very expensive if your app has a lot of traffic or you are just generating a lot of traces. It can also get really noisy.

However, in our local development environment we actually DO want to keep all traces. We can do this by combining two sampling strategies.

  1. Always on: just sample everything.
  2. Ratio based: only sample a percentage of traces.

Modify your instrumentation.ts as follows. Note that we import the AlwaysOnSampler and the TraceIdRatioBasedSampler. We are using the ratio 0.1 here, so 10% of traces are sampled.

Using the default Vercel wrapper, this looks as follows:

instrumentation.ts
import { registerOTel } from '@vercel/otel'
import { AlwaysOnSampler, TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-node'
export function register() {
  registerOTel({
    serviceName: 'my-next-app',
    traceSampler: process.env.NODE_ENV === 'development'
      ? new AlwaysOnSampler()
      : new TraceIdRatioBasedSampler(0.1)
  })
}

If you added more customization in the instrumentation.node.ts file it looks as follows:

instrumentation.node.ts
...
import { BatchSpanProcessor, AlwaysOnSampler, TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-node'
...

const sdk = new NodeSDK({
  ...
  sampler: process.env.NODE_ENV === 'development'
    ? new AlwaysOnSampler()
    : new TraceIdRatioBasedSampler(0.1)
})
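
To build some intuition for what a ratio-based head sampler does, here is a simplified sketch. It is not the SDK’s actual implementation: shouldSample and the way it maps a trace ID to a number are assumptions made for illustration. The idea it shows is that the keep-or-drop decision is derived from the trace ID alone.

```typescript
// Simplified illustration of ratio-based head sampling. This is NOT the code
// of TraceIdRatioBasedSampler; the bucketing scheme is an assumption.
function shouldSample(traceId: string, ratio: number): boolean {
  if (ratio >= 1) return true
  if (ratio <= 0) return false
  // Interpret the first 8 hex characters as a number in [0, 2^32).
  const bucket = parseInt(traceId.slice(0, 8), 16)
  // Keep the trace when it falls into the lowest `ratio` share of that range.
  return bucket < ratio * 0x100000000
}

// Every span in a trace carries the same trace ID, so the decision is the
// same for all of them and a trace is kept or dropped as a whole.
const kept = shouldSample('0af7651916cd43dd8448eb211c80319c', 0.1)
const dropped = shouldSample('f0f7651916cd43dd8448eb211c80319c', 0.1)
console.log(kept, dropped) // true false
```

Because trace IDs are random, roughly 10% of them land below the threshold at a ratio of 0.1, which is where the cost saving comes from.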

2. Sending data to a 3rd party backend

As you deploy your app to production, you will need an OTel compatible backend to export your data to. There are many players out there on the market, but they all have two things in common:

  1. You send your data to some URL, typically on the standard OTel ports.
  2. You need to configure an API key in some header.

You can configure this in your config file, but “the OTel way of doing things” is to use environment variables. I prefer environment variables because we explicitly want to leverage the defaults in our local environment. So after picking a vendor, or maybe a self-hosted solution, export the following environment variables:

export OTEL_EXPORTER_OTLP_ENDPOINT="<endpoint>"
export OTEL_EXPORTER_OTLP_HEADERS="<header>=<your-api-key>"
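
The headers variable takes a comma-separated list of key=value pairs, so several headers can be passed at once. The exporter reads it for you. The sketch below only illustrates the format, for example to check at startup that the variable is well formed without printing the secret. parseOtlpHeaders is a hypothetical helper, and it deliberately ignores details such as URL-encoded values.

```typescript
// Hypothetical helper illustrating the OTEL_EXPORTER_OTLP_HEADERS format.
// It is not part of any OTel package; the exporter does its own parsing.
function parseOtlpHeaders(raw: string | undefined): Record<string, string> {
  const headers: Record<string, string> = {}
  if (!raw) return headers
  for (const pair of raw.split(',')) {
    // Split on the first "=" only, since values may contain "=" themselves.
    const idx = pair.indexOf('=')
    if (idx <= 0) continue // skip malformed entries without a key
    const key = pair.slice(0, idx).trim()
    const value = pair.slice(idx + 1).trim()
    headers[key] = value
  }
  return headers
}

// Log only the header names, never the values, to keep API keys out of logs.
const parsedHeaders = parseOtlpHeaders('x-api-key=abc123,x-team=web')
console.log(Object.keys(parsedHeaders))
```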

Bonus: using Checkly Tracing

You can get started with tracing and OpenTelemetry on Checkly and get backend traces for all your synthetic monitoring checks. This works out-of-the-box with any Next.js project. With Checkly Traces, you have access to traces in all the places where it matters, so you can resolve issues more quickly:

  • Check results: resolve production outages faster by correlating failing checks with backend traces.
  • Test sessions: understand any failures during test session execution.
  • Check editors: get a live trace while building, editing and debugging check code.

Checkly traces result screen

  1. Sign up for a free Checkly account at https://app.checklyhq.com.
  2. Check our docs on how to instrument your app and you should be tracing in 2 minutes.
