Building a custom Prometheus exporter

Marcin Budny – Head of R&D

Architect and developer, over 10 years in IT. He works with intelligent transport systems, researches new technology and explores ways to apply it. Passionate about building software, always looking to learn something new. Focused on .NET, yet curious about the rest of the dev world.

In this post I’d like to guide you through the process of creating a custom Prometheus exporter in Go. Throughout the article I assume that you are building an open source exporter that you want to share with others.

When do I need a custom exporter?

A lot of components and services we use to build our systems can expose some kind of metrics.

  • If you use a message broker, it can probably tell you how many messages are waiting to be processed.
  • If you use a database, it probably exposes information about the current I/O load on the disk.

All of this is highly useful information when tracking the status of a distributed system or resolving problems.

If you would like to put this information into Prometheus and consume it with Grafana or some other visualization tool, you will run into a problem: most of the components you use do not natively speak the Prometheus protocol.

This is where Prometheus exporters come in. An exporter is essentially an adapter that allows Prometheus to understand metrics exposed by things like databases, network appliances, message brokers, etc.

If you have a need for such an adapter, you should first check the list of existing exporters. Fortunately, it is quite extensive!

  • The official list of the exporters can be found in the documentation.
  • There is also a semi-official list of port allocations for the exporters on GitHub. It even contains information on some of the exporters that are currently a work-in-progress.
  • Then of course there is the good old Google search for exporters that are not included on either of the two lists.

Please note that you may not necessarily need an exporter specific to your component.

  • Some exporters take a more generic approach, like the JSON exporter, which allows you to scrape any JSON API and extract useful information with JSONPath.
  • Another example is the SNMP exporter.

If you still haven’t found the tool you need, it’s time to think about building a custom one!

Do I have to build my Prometheus exporter in Go?

The short answer is: no, you don’t. As a matter of fact, you can use any technology you like to create a custom Prometheus exporter.

Every significant platform has tools and libraries that support the development of Prometheus-compatible applications. Even if no libraries are available, the protocol is text-based and quite simple. It should be easy enough to expose an HTTP endpoint that returns the text representation of the metrics.
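
To illustrate, here is a minimal sketch of that approach in Go, using nothing but the standard library (the metric name and port are made up for the example):

package main

import (
    "fmt"
    "net/http"
)

func main() {
    // expose a single gauge in the Prometheus text exposition format,
    // without using any client library
    http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "# HELP myapp_up Whether the last check of the target component succeeded")
        fmt.Fprintln(w, "# TYPE myapp_up gauge")
        fmt.Fprintln(w, "myapp_up 1")
    })
    http.ListenAndServe(":9580", nil)
}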

I’ve seen Prometheus exporters written in Node.js and Python. I myself am a .NET developer and I’ve built exporters using Microsoft’s technology.

So why should you consider using Go?

Here are several reasons:

  • Exporters are supporting tools that you deploy as services into your infrastructure. You want them to be efficient and consume as few resources as possible. Go’s small, self-contained binaries fit this requirement very well.
  • There is very good tooling support and a large community around Prometheus and Go. Many existing exporters are written in Go, so you can use them as examples. Prometheus itself is built with the language.
  • It is always fun to learn something new :)

What is the overall idea?

It’s quite simple, really: Prometheus periodically scrapes the exporter’s /metrics endpoint; on each scrape, the exporter queries the target component, translates the results into Prometheus metric structures and renders them in the text format that Prometheus understands.

Where do I start?

To build a custom Prometheus exporter, follow these steps.

First, you need to know what data you want to export. Explore the diagnostic API of your target component to see what metrics you could possibly extract. Try to think about a general use case, not only your specific needs. This way the Prometheus exporter you build will be useful for other people.

Then prepare the initial structure of the project. I will base my description on the Azure Service Bus exporter I built some time ago. I won’t be explaining the basics of the language or package management; you can find plenty of tutorials on that online.

Disclaimer: I have already confessed that I’m not using Go in my day-to-day work. I may not be following some best practices from the Go ecosystem. If you spot something that can be improved, be sure to let me know!

My exporter consists of the following packages:

├ client
│  └ client.go
├ collector
│  └ collector.go
└ servicebus_exporter.go

The client package (client.go) is responsible for grabbing metrics from the target component (here: Azure Service Bus). This package is specific to your use case and you need to implement it accordingly. It should return the metrics in a structure that can be easily processed by other parts of the exporter (see the Stats structure inside the file).
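
To give you an idea of the shape of such a package, here is a heavily simplified sketch; the field names and the New function signature are illustrative, not the actual code from the repository:

package client

// Stats groups the metrics read from the target component in a form
// that the collector can easily consume
type Stats struct {
    Queues []QueueStats
}

type QueueStats struct {
    Name               string
    ActiveMessageCount int64
    DeadLetterCount    int64
}

type ServiceBusClient struct {
    connectionString string
}

func New(connectionString string) *ServiceBusClient {
    return &ServiceBusClient{connectionString: connectionString}
}

// GetServiceBusStats queries the management API of the target component
// and maps the response to the Stats structure
func (c *ServiceBusClient) GetServiceBusStats() (*Stats, error) {
    // ... call the management API of the target component here ...
    return &Stats{}, nil
}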

The collector package (collector.go) does the actual exporting of the metrics in the Prometheus format. It takes the metrics provided by the client package and puts them into the structures defined by the Prometheus client library for Go.

The collector is called whenever Prometheus scrapes the metrics endpoint of the exporter.

The servicebus_exporter.go file puts everything together. It reads the configuration, runs the HTTP server and serves as the composition root.

As the last step of the preparation, you may want to go to the previously mentioned list of port allocations and grab the next free port number for yourself.

The idea behind this list is to avoid port number conflicts for scenarios where multiple exporters are deployed to a single host machine.

How do I structure my collector?

Let’s start with how you should NOT structure it.

First, we need to understand how the exporter scenario is different from the “normal” scenario (when you just instrument an application you own).

Here’s the flow of scraping your own, instrumented application:

  1. When the application starts, you create instances of the metrics that you will be collecting.
  2. The application does its work, for example it serves HTTP requests.
  3. Upon completing each request, you collect some metrics about it and record them in the metric instances. This is not synchronized in any way with the scraping process.
  4. At the moment the scrape happens, a snapshot of the current values of the metrics is rendered to the text format.

How is exporter flow different?

  1. Metrics collection is usually initiated when scraping starts.
  2. The collection may take some time if the target component is slow to respond.
  3. Metrics are rendered when the collection is complete.

In an instrumented application you can use globally shared metric instances. They are created once at the start of the application and updated at various moments during its lifetime.

The scrape process just peeks at their current values. For the exporter, you want to create metric instances for each scrape so that simultaneous scrapes do not interfere with each other.
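
To make the contrast concrete, here is what the “normal”, instrumented-application approach looks like (a minimal sketch using the official Go client library; the metric and endpoint names are made up):

package main

import (
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// a single, globally shared metric instance, created once at startup
var requestsServed = promauto.NewCounter(prometheus.CounterOpts{
    Name: "myapp_requests_served_total",
    Help: "Total number of HTTP requests served",
})

func main() {
    http.HandleFunc("/work", func(w http.ResponseWriter, r *http.Request) {
        requestsServed.Inc() // updated as the application does its work
        w.Write([]byte("done"))
    })

    // the scrape just renders a snapshot of the current values
    http.Handle("/metrics", promhttp.Handler())
    http.ListenAndServe(":8080", nil)
}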

I have to admit that I made this very mistake in another exporter I built (hopefully, by the time you are reading this, I will have fixed it already). One of the consequences was that removed EventStore projections were still popping up in the metrics output.

So what you should do is implement the Collector interface. It has two methods:

  • Describe – called to get descriptors of the metrics provided by the collector. A descriptor contains metadata about the metric, but not the actual value.
  • Collect – called to get the actual metric values

Here is a simplified skeleton of my collector:
package collector

import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/sirupsen/logrus"

    // illustrative import path for the client package described earlier
    sb "github.com/marcinbudny/servicebus_exporter/client"
)

type Collector struct {
    client *sb.ServiceBusClient
    log    *logrus.Logger

    up *prometheus.Desc

    // ... declare some more descriptors here ...
}

func New(client *sb.ServiceBusClient, log *logrus.Logger) *Collector {
    return &Collector{
        client: client,
        log:    log,

        up: prometheus.NewDesc("servicebus_up", "Whether the Azure ServiceBus scrape was successful", nil, nil),

        // ... initialize rest of the descriptors ...
        // ... do other initialization ...
    }
}

func (c *Collector) Describe(ch chan<- *prometheus.Desc) {
    ch <- c.up
    // ... describe other metrics ...
}

func (c *Collector) Collect(ch chan<- prometheus.Metric) {
    stats, err := c.client.GetServiceBusStats()
    if err != nil {
        // client call failed, set the up metric value to 0
        ch <- prometheus.MustNewConstMetric(c.up, prometheus.GaugeValue, 0)
        return
    }

    // client call succeeded, set the up metric value to 1
    ch <- prometheus.MustNewConstMetric(c.up, prometheus.GaugeValue, 1)

    // ... collect other metrics from stats here ...
    _ = stats
}

Notice the use of the MustNewConstMetric function to create a new, disposable instance of the metric. This way we ensure that metric instances are never shared between separate scrapes. Of course, MustNewConstMetric and NewDesc both accept label definitions for metrics that need labels.
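
For example, a per-queue gauge could be declared and emitted like this (the metric name, label name and variables are illustrative, not necessarily what the actual exporter uses):

queueSize := prometheus.NewDesc(
    "servicebus_queue_active_messages",
    "Number of active messages in the queue",
    []string{"queue"}, // variable label names
    nil)               // constant labels

ch <- prometheus.MustNewConstMetric(
    queueSize, prometheus.GaugeValue, float64(activeCount), queueName)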

After implementing the collector, you need to register it with the Prometheus client library. To do that, in the main package you do something like this:

coll := collector.New(client, log)
prometheus.MustRegister(coll)
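
Putting it all together, the composition root could look more or less like this. This is a sketch that assumes the client and collector packages shown earlier (the import paths and the environment variable name are illustrative), with promhttp coming from the official Go client library:

package main

import (
    "net/http"
    "os"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
    "github.com/sirupsen/logrus"

    sb "github.com/marcinbudny/servicebus_exporter/client"
    "github.com/marcinbudny/servicebus_exporter/collector"
)

func main() {
    log := logrus.New()

    // in a real exporter these values come from proper configuration handling
    client := sb.New(os.Getenv("CONNECTION_STRING"))
    coll := collector.New(client, log)
    prometheus.MustRegister(coll)

    // serve the registered metrics with the standard handler
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":9580", nil))
}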

What is the best way to dockerize the exporter?

Remember the self-contained binaries that Go creates? It turns out this enables us to use an empty base image (scratch) and keep the resulting image small.

FROM golang:1.13-alpine as build

WORKDIR /build
COPY go.mod ./
COPY go.sum ./
RUN go mod download

COPY . ./
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -tags netgo -o app

FROM alpine:latest as certs
RUN apk --update add ca-certificates

FROM scratch
COPY --from=build /build/app /
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
EXPOSE 9580
ENTRYPOINT [ "/app" ]

We assume here that CGO is not used by the exporter or any of the libraries it depends on. The certificates part is optional. The resulting image is less than 20 MB in size.

Should I make my exporter configurable?

The answer is quite obvious, but what settings should be configurable and how?

Again, try to think about a generic scenario. You probably need connection settings for the target component (support different ways of connecting if it makes sense in your scenario), timeout configuration and logging verbosity.

When it comes to how the settings should be passed to the exporter, the bare minimum is support for environment variables. Command line flags are a nice addition.
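
Here is a minimal sketch of that combination, using only the standard library; the setting names, environment variables and defaults are illustrative:

package main

import (
    "flag"
    "fmt"
    "os"
    "time"
)

type config struct {
    connectionString string
    timeout          time.Duration
    listenAddress    string
}

func loadConfig() config {
    // environment variables (where set) provide the defaults ...
    cfg := config{
        connectionString: os.Getenv("CONNECTION_STRING"),
        timeout:          30 * time.Second,
        listenAddress:    ":9580",
    }
    if t, err := time.ParseDuration(os.Getenv("TIMEOUT")); err == nil {
        cfg.timeout = t
    }

    // ... and command line flags can override them
    flag.StringVar(&cfg.connectionString, "connection-string", cfg.connectionString, "connection string for the target component")
    flag.DurationVar(&cfg.timeout, "timeout", cfg.timeout, "timeout for talking to the target component")
    flag.StringVar(&cfg.listenAddress, "listen-address", cfg.listenAddress, "address to serve the metrics on")
    flag.Parse()

    return cfg
}

func main() {
    cfg := loadConfig()
    fmt.Printf("%+v\n", cfg)
}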

What if it takes a long time to get statistics from the target component?

One example of such a case is when your metric value is the result of an expensive SQL query. You may be tempted to update the metrics outside of the scrape process, e.g. on a timer, and serve the ready-made values when the /metrics endpoint is called.

The guide on creating exporters explicitly states that all scraping should happen synchronously. To deal with such scenarios, you may consider strategies like caching (sketched below) or using the Pushgateway.
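
If you do go for caching, the idea can be as simple as the following sketch; the cachedFetcher type, its fields and the freshness window are all made up for the example:

package collector

import (
    "sync"
    "time"
)

// cachedFetcher wraps an expensive fetch operation and reuses its last
// result until it becomes older than maxAge
type cachedFetcher struct {
    mu      sync.Mutex
    fetch   func() (float64, error)
    maxAge  time.Duration
    value   float64
    fetched time.Time
}

func (f *cachedFetcher) get() (float64, error) {
    f.mu.Lock()
    defer f.mu.Unlock()

    // serve the cached value while it is still fresh
    if !f.fetched.IsZero() && time.Since(f.fetched) < f.maxAge {
        return f.value, nil
    }

    v, err := f.fetch()
    if err != nil {
        return 0, err
    }

    f.value, f.fetched = v, time.Now()
    return v, nil
}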

What else do I need to remember?

Play by the rules. See the documentation on writing exporters to get some advice on metric naming, labeling and other aspects.

Make sure to provide reasonable documentation that specifies configuration settings and lists exported metrics.

Don’t forget to add a permissive license (like MIT) to your exporter, so that others can actually use it.

Feel free to share your thoughts on this article below in the comment section!

