This repository has been archived by the owner on Dec 20, 2024. It is now read-only.
Merge pull request #669 from yeya24/feature/add-metrics
feature: add support for prometheus metrics
Showing 19 changed files with 411 additions and 15 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# Prometheus Metrics | ||
|
||
This document lists all the metrics that Dragonfly components currently export. Only dfdaemon and supernode are covered for now; dfget metrics will be added in the future. For both dfdaemon and supernode, the metrics path is fixed to `/metrics`. The following metrics are exported. | ||
|
||
## Supernode | ||
|
||
- dragonfly_supernode_build_info{version, revision, goversion, arch, os} - build and version information of supernode | ||
- dragonfly_supernode_http_requests_total{code, handler, method} - total number of http requests | ||
- dragonfly_supernode_http_request_duration_seconds{code, handler, method} - http request latency in seconds | ||
- dragonfly_supernode_http_request_size_bytes{code, handler, method} - http request size in bytes | ||
- dragonfly_supernode_http_response_size_bytes{code, handler, method} - http response size in bytes | ||
|
||
## Dfdaemon | ||
|
||
- dragonfly_dfdaemon_build_info{version, revision, goversion, arch, os} - build and version information of dfdaemon | ||
|
||
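The metric names listed above appear to follow a `namespace_subsystem_name` pattern. The sketch below only illustrates that composition using the standard library. The values `dragonfly`, `supernode`, and `dfdaemon` are inferred from the names in this document, and `fqName` is a hypothetical helper, not part of Dragonfly or of the Prometheus client.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName joins the non-empty parts with underscores.
// Hypothetical helper: a simplified stand-in for how a metrics
// client may compose a fully qualified metric name.
func fqName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// Namespace and subsystem values are assumptions inferred from the docs.
	fmt.Println(fqName("dragonfly", "supernode", "http_requests_total"))
	fmt.Println(fqName("dragonfly", "dfdaemon", "build_info"))
}
```

Keeping the namespace and subsystem separate from the short name makes it easy to give every component a consistent prefix.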
## Dfget | ||
|
||
TODO |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,114 @@ | ||
# Monitor Dragonfly with Prometheus | ||
|
||
Metrics have become an important part of observability. To monitor Dragonfly, we recommend using Prometheus. | ||
|
||
The Dragonfly project has two long-running processes: supernode and dfdaemon. Each component exposes its metrics via the `/metrics` endpoint, so Prometheus can scrape metrics from each of them. Support for dfget metrics will be added in the future. For the metrics currently available, see [metrics](metrics.md). | ||
|
||
## How to set up Prometheus | ||
|
||
### Setup Dragonfly Environment | ||
|
||
First, make sure you know how to set up a Dragonfly environment. If you don't, read the [quick_start](https://github.com/dragonflyoss/Dragonfly/blob/master/docs/quick_start/README.md) doc first. Alternatively, you can build Dragonfly from source: | ||
|
||
``` bash | ||
make build | ||
# start supernode and dfdaemon | ||
bin/linux_amd64/supernode --advertise-ip 127.0.0.1 | ||
bin/linux_amd64/dfdaemon | ||
``` | ||
|
||
Once supernode and dfdaemon are running normally, you can check their metrics from the command line. | ||
|
||
Check dfdaemon metrics: | ||
|
||
``` bash | ||
➜ ~ curl localhost:65001/metrics | ||
# HELP go_gc_duration_seconds A summary of the GC invocation durations. | ||
# TYPE go_gc_duration_seconds summary | ||
go_gc_duration_seconds{quantile="0"} 0 | ||
go_gc_duration_seconds{quantile="0.25"} 0 | ||
go_gc_duration_seconds{quantile="0.5"} 0 | ||
go_gc_duration_seconds{quantile="0.75"} 0 | ||
go_gc_duration_seconds{quantile="1"} 0 | ||
go_gc_duration_seconds_sum 0 | ||
go_gc_duration_seconds_count 0 | ||
# HELP go_goroutines Number of goroutines that currently exist. | ||
# TYPE go_goroutines gauge | ||
go_goroutines 10 | ||
``` | ||
|
||
Check supernode metrics: | ||
|
||
``` bash | ||
➜ ~ curl localhost:8002/metrics | ||
# HELP go_gc_duration_seconds A summary of the GC invocation durations. | ||
# TYPE go_gc_duration_seconds summary | ||
go_gc_duration_seconds{quantile="0"} 2.2854e-05 | ||
go_gc_duration_seconds{quantile="0.25"} 0.000150952 | ||
go_gc_duration_seconds{quantile="0.5"} 0.000155267 | ||
go_gc_duration_seconds{quantile="0.75"} 0.000171251 | ||
go_gc_duration_seconds{quantile="1"} 0.00018524 | ||
go_gc_duration_seconds_sum 0.000685564 | ||
go_gc_duration_seconds_count 5 | ||
# HELP go_goroutines Number of goroutines that currently exist. | ||
# TYPE go_goroutines gauge | ||
go_goroutines 8 | ||
``` | ||
|
||
If you get results like those above, your Dragonfly components are working and exposing metrics. Next, we will set up Prometheus. | ||
|
||
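The `curl` output shown earlier is in the Prometheus text exposition format: `# HELP` and `# TYPE` comment lines followed by `name{labels} value` samples. As a rough illustration of how that output can be read programmatically, here is a minimal sketch using only the Go standard library. `parseSamples` is a hypothetical helper, and it assumes samples carry no timestamps and label values contain no spaces.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSamples reads text exposition output and returns a map from
// sample name (including any label set) to value.
// Simplification: no timestamps, no spaces inside label values.
func parseSamples(text string) (map[string]float64, error) {
	samples := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip blank lines and "# HELP" / "# TYPE" comments.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		i := strings.LastIndex(line, " ")
		if i < 0 {
			return nil, fmt.Errorf("malformed sample line: %q", line)
		}
		v, err := strconv.ParseFloat(line[i+1:], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", line, err)
		}
		samples[strings.TrimSpace(line[:i])] = v
	}
	return samples, sc.Err()
}

func main() {
	out := `# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 10
go_gc_duration_seconds{quantile="0.5"} 0.000155267
`
	samples, err := parseSamples(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(samples["go_goroutines"])
	fmt.Println(samples[`go_gc_duration_seconds{quantile="0.5"}`])
}
```

For anything beyond a quick check, a full parser for the exposition format is a better choice than this line-splitting approach.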
### Download Prometheus | ||
|
||
[Download the Prometheus release](https://prometheus.io/download/) for your platform, then extract it. Here we use the Linux version as an example: | ||
|
||
``` bash | ||
wget https://github.com/prometheus/prometheus/releases/download/v2.11.1/prometheus-2.11.1.linux-amd64.tar.gz | ||
tar -xvf prometheus-2.11.1.linux-amd64.tar.gz | ||
cd prometheus-2.11.1.linux-amd64 | ||
``` | ||
|
||
Before starting Prometheus, we need to configure it. | ||
|
||
### Configure Prometheus | ||
|
||
Below is a minimal configuration for monitoring Dragonfly. For more detailed configuration options, see [Prometheus Configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/). | ||
|
||
``` | ||
global: | ||
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute. | ||
alerting: | ||
  alertmanagers: | ||
  - static_configs: | ||
    - targets: | ||
      # - alertmanager:9093 | ||
rule_files: | ||
  # - "first_rules.yml" | ||
  # - "second_rules.yml" | ||
scrape_configs: | ||
  - job_name: 'dragonfly' | ||
    static_configs: | ||
    - targets: ['localhost:8002', 'localhost:65001'] | ||
``` | ||
|
||
If you are not familiar with Prometheus, you can replace the contents of `prometheus.yml` with the configuration above. We don't use any alert rules or Alertmanager here, so those parts are left unset. After modifying the file, you can validate it with `promtool`. | ||
|
||
``` bash | ||
./promtool check config prometheus.yml | ||
Checking prometheus.yml | ||
SUCCESS: 0 rule files found | ||
``` | ||
|
||
Finally, start Prometheus from the same directory. If Prometheus starts successfully, you can open `localhost:9090` in your browser to see the Prometheus web UI. | ||
|
||
``` bash | ||
./prometheus | ||
``` | ||
|
||
### Get Dragonfly Metrics Using Prometheus | ||
|
||
In the Prometheus web UI, you can search for Dragonfly metrics as shown below. To learn more about the Prometheus query language, see [promql](https://prometheus.io/docs/prometheus/latest/querying/basics/). | ||
|
||
![dragonfly_metrics.png](../images/dragonfly_metrics.png) |
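
Besides the web UI, the same metrics can be fetched through the Prometheus HTTP API, which serves instant queries at `/api/v1/query`. The following is a minimal sketch, assuming Prometheus listens on `localhost:9090` as in this guide. `queryURL`, `decodeQuery`, and the `queryResponse` type are hypothetical helpers that model only the response fields used here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// queryResponse models the subset of an instant-query response used here.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  [2]interface{}    `json:"value"` // [unix timestamp, value as string]
		} `json:"result"`
	} `json:"data"`
}

// queryURL builds the instant-query URL for a PromQL expression.
func queryURL(baseURL, promQL string) string {
	return baseURL + "/api/v1/query?query=" + url.QueryEscape(promQL)
}

// decodeQuery parses a response body and rejects non-success statuses.
func decodeQuery(body []byte) (*queryResponse, error) {
	var qr queryResponse
	if err := json.Unmarshal(body, &qr); err != nil {
		return nil, err
	}
	if qr.Status != "success" {
		return nil, fmt.Errorf("query status: %q", qr.Status)
	}
	return &qr, nil
}

func main() {
	// Assumption: Prometheus runs locally on its default port.
	resp, err := http.Get(queryURL("http://localhost:9090", "dragonfly_supernode_http_requests_total"))
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	qr, err := decodeQuery(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, r := range qr.Data.Result {
		fmt.Println(r.Metric["handler"], r.Metric["code"], r.Value[1])
	}
}
```

Separating URL construction and decoding from the network call keeps both pieces easy to check without a running Prometheus server.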
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
package server | ||
|
||
import ( | ||
"net/http" | ||
|
||
"github.com/dragonflyoss/Dragonfly/supernode/config" | ||
|
||
"github.com/prometheus/client_golang/prometheus" | ||
"github.com/prometheus/client_golang/prometheus/promauto" | ||
"github.com/prometheus/client_golang/prometheus/promhttp" | ||
) | ||
|
||
// metrics defines four prometheus metrics for monitoring http handler status | ||
type metrics struct { | ||
requestCounter *prometheus.CounterVec | ||
requestDuration *prometheus.HistogramVec | ||
requestSize *prometheus.HistogramVec | ||
responseSize *prometheus.HistogramVec | ||
} | ||
|
||
func newMetrics() *metrics { | ||
m := &metrics{ | ||
requestCounter: promauto.NewCounterVec( | ||
prometheus.CounterOpts{ | ||
Namespace: config.Namespace, | ||
Subsystem: config.Subsystem, | ||
Name: "http_requests_total", | ||
Help: "Counter of HTTP requests.", | ||
}, | ||
[]string{"code", "handler", "method"}, | ||
), | ||
requestDuration: promauto.NewHistogramVec( | ||
prometheus.HistogramOpts{ | ||
Namespace: config.Namespace, | ||
Subsystem: config.Subsystem, | ||
Name: "http_request_duration_seconds", | ||
Help: "Histogram of latencies for HTTP requests.", | ||
Buckets: []float64{.1, .2, .4, 1, 3, 8, 20, 60, 120}, | ||
}, | ||
[]string{"code", "handler", "method"}, | ||
), | ||
requestSize: promauto.NewHistogramVec( | ||
prometheus.HistogramOpts{ | ||
Namespace: config.Namespace, | ||
Subsystem: config.Subsystem, | ||
Name: "http_request_size_bytes", | ||
Help: "Histogram of request size for HTTP requests.", | ||
Buckets: prometheus.ExponentialBuckets(100, 10, 8), | ||
}, | ||
[]string{"code", "handler", "method"}, | ||
), | ||
responseSize: promauto.NewHistogramVec( | ||
prometheus.HistogramOpts{ | ||
Namespace: config.Namespace, | ||
Subsystem: config.Subsystem, | ||
Name: "http_response_size_bytes", | ||
Help: "Histogram of response size for HTTP requests.", | ||
Buckets: prometheus.ExponentialBuckets(100, 10, 8), | ||
}, | ||
[]string{"code", "handler", "method"}, | ||
), | ||
} | ||
|
||
return m | ||
} | ||
|
||
// instrumentHandler will update metrics for every http request | ||
func (m *metrics) instrumentHandler(handlerName string, handler http.HandlerFunc) http.HandlerFunc { | ||
return promhttp.InstrumentHandlerDuration( | ||
m.requestDuration.MustCurryWith(prometheus.Labels{"handler": handlerName}), | ||
promhttp.InstrumentHandlerCounter( | ||
m.requestCounter.MustCurryWith(prometheus.Labels{"handler": handlerName}), | ||
promhttp.InstrumentHandlerRequestSize( | ||
m.requestSize.MustCurryWith(prometheus.Labels{"handler": handlerName}), | ||
promhttp.InstrumentHandlerResponseSize( | ||
m.responseSize.MustCurryWith(prometheus.Labels{"handler": handlerName}), | ||
handler, | ||
), | ||
), | ||
), | ||
) | ||
} |