SPIRE Servers and Agents can emit telemetry to track activities such as Server and Agent startup time, registration entry creation, X.509 SVID creation, and successful authentication. This Docker Compose-based tutorial demonstrates how to configure the SPIRE Server and the SPIRE Agent to emit telemetry to supported metrics collectors. The tutorial shows two metrics collectors, StatsD and Prometheus, receiving metrics from a SPIRE deployment. Metrics collected by StatsD are displayed using Graphite.
In this tutorial you will learn how to:
- Configure SPIRE Server and SPIRE Agent to emit telemetry
- Configure StatsD as a metrics collector
- Configure Prometheus as a metrics collector
Required files for this tutorial can be found in the docker-compose/metrics
directory in https://github.com/spiffe/spire-tutorials. If you didn't already clone the repository please do so now.
Before proceeding, review the following system requirements:
- A 64-bit Linux or macOS environment
- Docker and Docker Compose installed (Docker Compose is included in macOS Docker Desktop)
The SPIRE Server and Agent can be configured to emit telemetry by using a dedicated telemetry { ... }
section in their configuration files. Currently, SPIRE supports Prometheus, StatsD, DogStatsD, M3 and In-Memory as metrics collectors. In this tutorial we'll show how to configure Prometheus and StatsD but simple configuration examples for the other collectors can be found in the telemetry docs of the SPIRE project.
Note: The configuration changes needed to enable Prometheus and StatsD metrics collection from SPIRE are shown as snippets in this tutorial. However, all of these settings have already been configured. You don't have to edit any configuration files.
The telemetry
section supports the configuration of multiple collectors and for some collectors it is also possible to declare multiple instances.
The following snippet is from the SPIRE Server configuration file. The SPIRE Agent configuration file is configured the same way.
telemetry {
Prometheus {
host = "spire-server"
port = 8088
}
Statsd = [
{
address = "graphite-statsd:8125"
},
]
}
The first collector configured is Prometheus. Its configuration accepts two properties, the Prometheus server host which defaults to localhost
and the Prometheus server port. These values are used by SPIRE to expose an endpoint which will be used by Prometheus to pull the metrics.
For the purpose of this tutorial we configured the host property using the hostname of the SPIRE Server (and Agent) but be aware that this configuration, which allows the SPIRE Server and Agent to listen for remote network connections, creates a security risk to SPIRE due to the open port. When applying a configuration like this in a production environment, the access to the endpoint should be tightly controlled.
This configuration generates a warning message in the logs to alert the operator about this risk:
level=warning msg="Agent is now configured to accept remote network connections for Prometheus stats collection. Please ensure access to this port is tightly controlled." subsystem_name=telemetry
The second collector configured is StatsD, which is one of the collectors that supports the configuration of multiple instances. For that reason, the configuration object expects a list of addresses. For this tutorial we define only one instance. The address configured matches the StatsD instance running on the environment. We will see the details about this instance in a following section but for now it's worth noting that the address is formed by the hostname of the service and the default port for StatsD.
By configuring the address, SPIRE will be pushing metrics to the StatsD collector.
We use the official Docker image for Graphite and StatsD. This image already contains all the services necessary to collect and display metrics. For this tutorial we map the port 80
that belongs to the nginx proxy that reverse proxies the Graphite dashboard and the port 8125
where StatsD listens by default to the same external ports of 80 and 8125, respectively.
The graphite-statsd
service definition is:
graphite-statsd:
image: graphiteapp/graphite-statsd:1.1.7-6
container_name: graphite
hostname: graphite-statsd
restart: always
ports:
- "80:80"
- "8125:8125/udp"
The StatsD service will be available at graphite-statsd:8125
as configured for SPIRE Server in the previous section.
Due to the pull nature of Prometheus we need to configure the HTTP endpoint where it will scrape the metrics. We've already configured SPIRE to expose the HTTP endpoint via the telemetry configuration so now we need to indicate to Prometheus that it should collect metrics from that endpoint. We achieve this by setting the target
option to the hostname of the SPIRE Server (or SPIRE Agent) and the correct port number (e.g. 8088 for the SPIRE Server and 8089 for the SPIRE Agent).
By default the HTTP resource path to fetch metrics from targets is /metrics
but SPIRE does not expose metrics on that path. Instead, it does on the /
path. These configurations are part of the prometheus.yml configuration file.
scrape_configs:
- job_name: 'spire-server'
metrics_path: '/'
static_configs:
- targets: ['spire-server:8088']
- job_name: 'spire-agent'
metrics_path: '/'
static_configs:
- targets: ['spire-agent:8089']
To run Prometheus we use the official Docker image and we mount the local directory prometheus
to make the prometheus.yml configuration file available at the container.
prometheus:
image: prom/prometheus:v2.20.1
container_name: prometheus
hostname: prometheus
restart: always
volumes:
- ./prometheus:/etc/prometheus
ports:
- "9090:9090"
Use the set-env.sh
script to run all the services that make up the scenario. The script starts the SPIRE Server, SPIRE Agent, Graphite-StatsD and Prometheus services.
Ensure that the current working directory is .../spire-tutorials/docker-compose/metrics/
and run:
$ bash scripts/set-env.sh
Once the script is completed, in another terminal run the following command to review the logs from all the services:
$ docker-compose logs -f -t
Let's see some real data. Open your browser and navigate to http://localhost/
to see the Graphite web UI and, on a different tab, navigate to http://localhost:9090/
to access the Prometheus web UI.
To generate some data, let's create a workload registration entry using the following script:
$ bash scripts/create-workload-registration-entry.sh
And with this other script we perform requests to fetch an SVID for that new workload. These requests will serve to generate some metrics.
$ bash scripts/fetch_svid.sh
Wait a couple of minutes while metrics are collected and then you can create graphs to review them.
There are different metrics exported by SPIRE that can be analyzed. As an example, the following images show a graph of the remaining TTL of each SVID fetched.
The graph using Graphite
The same metric but this time shown using Prometheus UI
When you are finished running this tutorial, you can use the following Bash script to stop all the containers:
$ bash scripts/clean-env.sh