This repository has been archived by the owner on Dec 20, 2024. It is now read-only.

Commit

Merge pull request #669 from yeya24/feature/add-metrics
feature: add support for prometheus metrics
allencloud authored Jul 18, 2019
2 parents a75da32 + 09594c5 commit ae9e9a3
Showing 19 changed files with 411 additions and 15 deletions.
7 changes: 7 additions & 0 deletions dfdaemon/constant/constant.go
@@ -44,3 +44,10 @@ const (
	// DefaultConfigPath the default path of dfdaemon configuration file.
	DefaultConfigPath = "/etc/dragonfly/dfdaemon.yml"
)

const (
	// Namespace is the prefix of the metrics' name of dragonfly
	Namespace = "dragonfly"
	// Subsystem represents metrics for dfdaemon
	Subsystem = "dfdaemon"
)
3 changes: 3 additions & 0 deletions dfdaemon/handler/root_handler.go
@@ -22,6 +22,8 @@ import (
	_ "net/http/pprof"

	"github.com/dragonflyoss/Dragonfly/version"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// New returns a new http mux for dfdaemon
@@ -30,5 +32,6 @@ func New() *http.ServeMux {
	s.HandleFunc("/args", getArgs)
	s.HandleFunc("/env", getEnv)
	s.HandleFunc("/debug/version", version.Handler)
	s.HandleFunc("/metrics", promhttp.Handler().ServeHTTP)
	return s
}
5 changes: 4 additions & 1 deletion dfdaemon/server.go
@@ -9,9 +9,10 @@ import (
	"github.com/dragonflyoss/Dragonfly/dfdaemon/config"
	"github.com/dragonflyoss/Dragonfly/dfdaemon/handler"
	"github.com/dragonflyoss/Dragonfly/dfdaemon/proxy"
-	"github.com/sirupsen/logrus"
+	"github.com/dragonflyoss/Dragonfly/version"

	"github.com/pkg/errors"
+	"github.com/sirupsen/logrus"
)

// Server represents the dfdaemon server
@@ -104,6 +105,8 @@ func (s *Server) Start() error {
	} else {
		logrus.Infof("start dfdaemon http server on %s", s.server.Addr)
	}
+	// register dfdaemon build information
+	version.NewBuildInfo("dfdaemon")
	return s.server.ListenAndServe()
}
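
The `version.NewBuildInfo("dfdaemon")` call registers the build-information metric; its implementation lives in the `version` package and is not part of this excerpt. As a rough illustration of what such a metric looks like in the Prometheus text format (a constant gauge of value 1 whose labels carry the build details), here is a standard-library-only sketch. The helper `buildInfoLine` and the version and revision strings are hypothetical and are not taken from Dragonfly.

```go
package main

import (
	"fmt"
	"runtime"
)

// buildInfoLine is a hypothetical helper, not part of Dragonfly. It renders a
// build_info sample in the Prometheus text exposition format: the value is
// always 1 and the useful information is carried by the labels.
// %q uses Go string quoting, which is adequate for plain ASCII label values.
func buildInfoLine(component, version, revision string) string {
	return fmt.Sprintf(
		"dragonfly_%s_build_info{version=%q,revision=%q,goversion=%q,arch=%q,os=%q} 1",
		component, version, revision, runtime.Version(), runtime.GOARCH, runtime.GOOS,
	)
}

func main() {
	// "0.4.0" and "abc1234" are placeholder values for illustration only.
	fmt.Println(buildInfoLine("dfdaemon", "0.4.0", "abc1234"))
}
```

Because the value never changes, a metric of this shape is mainly useful for joining version labels onto other series or for counting how many instances run each build.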

Binary file added docs/images/dragonfly_metrics.png
19 changes: 19 additions & 0 deletions docs/user_guide/metrics.md
@@ -0,0 +1,19 @@
# Prometheus Metrics

This document lists the metrics that Dragonfly components currently export. At present only dfdaemon and supernode expose metrics; dfget metrics will be supported in the future. Both dfdaemon and supernode serve their metrics at the fixed path `/metrics`. The following metrics are exported.

## Supernode

- dragonfly_supernode_build_info{version, revision, goversion, arch, os} - build and version information of supernode
- dragonfly_supernode_http_requests_total{code, handler, method} - total number of http requests
- dragonfly_supernode_http_request_duration_seconds{code, handler, method} - http request latency in seconds
- dragonfly_supernode_http_request_size_bytes{code, handler, method} - http request size in bytes
- dragonfly_supernode_http_response_size_bytes{code, handler, method} - http response size in bytes

## Dfdaemon

- dragonfly_dfdaemon_build_info{version, revision, goversion, arch, os} - build and version information of dfdaemon
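
Each metric name above follows the pattern `<namespace>_<subsystem>_<name>`, where the namespace is `dragonfly` and the subsystem is `supernode` or `dfdaemon`. The sketch below shows that composition using only the standard library; `fqName` is a hypothetical helper written for illustration (the Prometheus Go client offers `prometheus.BuildFQName` for the same purpose).

```go
package main

import (
	"fmt"
	"strings"
)

// fqName is a hypothetical helper that joins the non-empty parts of a metric
// name with underscores, e.g. "dragonfly" + "supernode" + "http_requests_total".
func fqName(namespace, subsystem, name string) string {
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	fmt.Println(fqName("dragonfly", "supernode", "http_requests_total"))
	fmt.Println(fqName("dragonfly", "dfdaemon", "build_info"))
}
```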

## Dfget

TODO
114 changes: 114 additions & 0 deletions docs/user_guide/monitoring.md
@@ -0,0 +1,114 @@
# Monitor Dragonfly with Prometheus

Metrics are an important part of observability. To monitor Dragonfly, we recommend using Prometheus.

The Dragonfly project has two long-running processes: supernode and dfdaemon. Each component exposes its metrics via the `/metrics` endpoint, so Prometheus can scrape metrics from each of them. We will also support dfget metrics in the future. For the list of current metrics, see [metrics](metrics.md).

## How to set up Prometheus

### Set Up Dragonfly Environment

First, make sure you know how to set up a Dragonfly environment. If you don't, read the [quick_start](https://github.com/dragonflyoss/Dragonfly/blob/master/docs/quick_start/README.md) docs first. Alternatively, you can build Dragonfly from source:

``` bash
make build
# start supernode and dfdaemon
bin/linux_amd64/supernode --advertise-ip 127.0.0.1
bin/linux_amd64/dfdaemon
```

When supernode and dfdaemon are running normally, you can check their metrics from the command line.

Check dfdaemon metrics:

``` bash
~ curl localhost:65001/metrics
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 10
```

Check supernode metrics:

``` bash
~ curl localhost:8002/metrics
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.2854e-05
go_gc_duration_seconds{quantile="0.25"} 0.000150952
go_gc_duration_seconds{quantile="0.5"} 0.000155267
go_gc_duration_seconds{quantile="0.75"} 0.000171251
go_gc_duration_seconds{quantile="1"} 0.00018524
go_gc_duration_seconds_sum 0.000685564
go_gc_duration_seconds_count 5
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 8
```

If you get results like those above, your Dragonfly components are working correctly. Next, we will set up Prometheus.

### Download Prometheus

[Download the release of Prometheus](https://prometheus.io/download/) for your platform, then extract and run it. Here we use the Linux version as an example:

``` bash
wget https://github.com/prometheus/prometheus/releases/download/v2.11.1/prometheus-2.11.1.linux-amd64.tar.gz
tar -xvf prometheus-2.11.1.linux-amd64.tar.gz
cd prometheus-2.11.1.linux-amd64
```

Before starting Prometheus, we need to configure it.

### Configure Prometheus

Below is a minimal configuration for monitoring Dragonfly. For more detailed configuration options, see [Prometheus Configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/).

``` yaml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

scrape_configs:
  - job_name: 'dragonfly'
    static_configs:
    - targets: ['localhost:8002', 'localhost:65001']
```

If you are not familiar with Prometheus, you can replace the contents of `prometheus.yml` with the configuration above. We don't use any alert rules or Alertmanager here, so those sections are left unset. After modifying the file, you can validate it with `promtool`.

``` bash
./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: 0 rule files found
```

Finally, start Prometheus in the same directory. If Prometheus is working correctly, you can open localhost:9090 in your browser to see the Prometheus web UI.

``` bash
./prometheus
```

### Get Dragonfly Metrics Using Prometheus

In the Prometheus web UI, you can search for Dragonfly metrics as shown below. To learn more about the Prometheus query language, see [promql](https://prometheus.io/docs/prometheus/latest/querying/basics/).

![dragonfly_metrics.png](../images/dragonfly_metrics.png)
1 change: 1 addition & 0 deletions go.mod
@@ -28,6 +28,7 @@ require (
	github.com/pborman/uuid v0.0.0-20180122190007-c65b2f87fee3
	github.com/pkg/errors v0.8.0
	github.com/prashantv/gostub v1.0.0
	github.com/prometheus/client_golang v0.9.3
	github.com/russross/blackfriday v0.0.0-20171011182219-6d1ef893fcb0 // indirect
	github.com/sirupsen/logrus v1.2.0
	github.com/spf13/afero v1.2.2
7 changes: 7 additions & 0 deletions go.sum
@@ -12,6 +12,7 @@ github.com/armon/consul-api v0.0.0-20180202201655-eb2c6b5be1b6/go.mod h1:grANhF5
github.com/asaskevich/govalidator v0.0.0-20170903095215-73945b6115bf h1:wXq5VXJjLole37O6oWZwqBRbKZw6VmC+wuAe8j/w2ZA=
github.com/asaskevich/govalidator v0.0.0-20170903095215-73945b6115bf/go.mod h1:lB+ZfQJz7igIIfQNfa7Ml4HSf2uFQQRzpGGRXenZAgY=
github.com/beorn7/perks v0.0.0-20180321164747-3a771d992973/go.mod h1:Dwedo/Wpr24TaqPxmxbtue+5NUziq4I4S80YR8gNf3Q=
github.com/beorn7/perks v1.0.0 h1:HWo1m869IqiPhD389kmkxeTalrjNbbJTC8LXupb+sl0=
github.com/beorn7/perks v1.0.0/go.mod h1:KWe93zE9D1o94FZ5RNwFwVgaQK1VOXiVxmqh+CedLV8=
github.com/cespare/xxhash v1.1.0/go.mod h1:XrSqR1VqqWfGrhpAt58auRo0WTKS1nRRg3ghfAqPWnc=
github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw=
@@ -64,6 +65,7 @@ github.com/golang/mock v1.1.1/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfb
github.com/golang/mock v1.3.1 h1:qGJ6qTW+x6xX/my+8YUVl4WNpX9B7+/l2tRsHGZ7f2s=
github.com/golang/mock v1.3.1/go.mod h1:sBzyDLLjw3U8JLTeZvSv8jJB+tU5PVekmnlKIyFUx0Y=
github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/golang/protobuf v1.3.1 h1:YF8+flBXS5eO826T4nzqPrxfhQThhXl0YzfuUPu4SBg=
github.com/golang/protobuf v1.3.1/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M=
@@ -100,6 +102,7 @@ github.com/magiconair/properties v1.8.1 h1:ZC2Vc7/ZFkGmsVC9KvOjumD+G5lXy2RtTKyzR
github.com/magiconair/properties v1.8.1/go.mod h1:PppfXfuXeibc/6YijjN8zIbojt8czPbwD3XqdrwzmxQ=
github.com/mailru/easyjson v0.0.0-20170902151237-2a92e673c9a6 h1:xhfqLjTK1g6iq92WjkfuaN6bC7Aoxb5//G8IfwyMyYA=
github.com/mailru/easyjson v0.0.0-20170902151237-2a92e673c9a6/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
github.com/matttproud/golang_protobuf_extensions v1.0.1 h1:4hp9jkHxhMHkqkrB3Ix0jegS5sx/RkqARlsWZ6pIwiU=
github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0=
github.com/mitchellh/mapstructure v1.1.2 h1:fmNYVwqnSfB9mZU6OS2O6GsXM+wcskZDuKQzvN1EDeE=
github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y=
@@ -116,12 +119,16 @@ github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZN
github.com/prashantv/gostub v1.0.0 h1:wTzvgO04xSS3gHuz6Vhuo0/kvWelyJxwNS0IRBPAwGY=
github.com/prashantv/gostub v1.0.0/go.mod h1:dP1v6T1QzyGJJKFocwAU0lSZKpfjstjH8TlhkEU0on0=
github.com/prometheus/client_golang v0.9.1/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw=
github.com/prometheus/client_golang v0.9.3 h1:9iH4JKXLzFbOAdtqv/a+j8aewx2Y8lAjAydhbaScPF8=
github.com/prometheus/client_golang v0.9.3/go.mod h1:/TN21ttK/J9q6uSwhBd54HahCDft0ttaMvbicHlPoso=
github.com/prometheus/client_model v0.0.0-20180712105110-5c3871d89910/go.mod h1:MbSGuTsp3dbXC40dX6PRTWyKYBIrTGTE9sqQNg2J8bo=
github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90 h1:S/YWwWx/RA8rT8tKFRuGUZhuA90OyIBpPCXkcbwU8DE=
github.com/prometheus/client_model v0.0.0-20190129233127-fd36f4220a90/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
github.com/prometheus/common v0.0.0-20181113130724-41aa239b4cce/go.mod h1:daVV7qP5qjZbuso7PdcryaAu0sAZbrN9i7WWcTMWvro=
github.com/prometheus/common v0.4.0 h1:7etb9YClo3a6HjLzfl6rIQaU+FDfi0VSX39io3aQ+DM=
github.com/prometheus/common v0.4.0/go.mod h1:TNfzLD0ON7rHzMJeJkieUDPYmFC7Snx/y86RQel1bk4=
github.com/prometheus/procfs v0.0.0-20181005140218-185b4288413d/go.mod h1:c3At6R/oaqEKCNdg8wHV1ftS6bRYblBhIjjI8uT2IGk=
github.com/prometheus/procfs v0.0.0-20190507164030-5867b95ac084 h1:sofwID9zm4tzrgykg80hfFph1mryUeLRsUfoocVVmRY=
github.com/prometheus/procfs v0.0.0-20190507164030-5867b95ac084/go.mod h1:TjEm7ze935MbeOT/UhFTIMYKhuLP4wbCsTZCD3I8kEA=
github.com/prometheus/tsdb v0.7.1/go.mod h1:qhTCs0VvXwvX/y3TZrWD7rabWM+ijKTux40TwIPHuXU=
github.com/rogpeppe/fastuuid v0.0.0-20150106093220-6724a57986af/go.mod h1:XWv6SoW27p1b0cqNHllgS5HIMJraePCO15w5zCzIWYg=
7 changes: 7 additions & 0 deletions supernode/config/constants.go
@@ -68,3 +68,10 @@ const (
	// CDNWriterRoutineLimit 4
	CDNWriterRoutineLimit = 4
)

const (
	// Namespace is the prefix of the metrics' name of dragonfly
	Namespace = "dragonfly"
	// Subsystem represents metrics for supernode
	Subsystem = "supernode"
)
82 changes: 82 additions & 0 deletions supernode/server/metrics.go
@@ -0,0 +1,82 @@
package server

import (
	"net/http"

	"github.com/dragonflyoss/Dragonfly/supernode/config"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// metrics defines four prometheus metrics for monitoring http handler status
type metrics struct {
	requestCounter  *prometheus.CounterVec
	requestDuration *prometheus.HistogramVec
	requestSize     *prometheus.HistogramVec
	responseSize    *prometheus.HistogramVec
}

func newMetrics() *metrics {
	m := &metrics{
		requestCounter: promauto.NewCounterVec(
			prometheus.CounterOpts{
				Namespace: config.Namespace,
				Subsystem: config.Subsystem,
				Name:      "http_requests_total",
				Help:      "Counter of HTTP requests.",
			},
			[]string{"code", "handler", "method"},
		),
		requestDuration: promauto.NewHistogramVec(
			prometheus.HistogramOpts{
				Namespace: config.Namespace,
				Subsystem: config.Subsystem,
				Name:      "http_request_duration_seconds",
				Help:      "Histogram of latencies for HTTP requests.",
				Buckets:   []float64{.1, .2, .4, 1, 3, 8, 20, 60, 120},
			},
			[]string{"code", "handler", "method"},
		),
		requestSize: promauto.NewHistogramVec(
			prometheus.HistogramOpts{
				Namespace: config.Namespace,
				Subsystem: config.Subsystem,
				Name:      "http_request_size_bytes",
				Help:      "Histogram of request size for HTTP requests.",
				Buckets:   prometheus.ExponentialBuckets(100, 10, 8),
			},
			[]string{"code", "handler", "method"},
		),
		responseSize: promauto.NewHistogramVec(
			prometheus.HistogramOpts{
				Namespace: config.Namespace,
				Subsystem: config.Subsystem,
				Name:      "http_response_size_bytes",
				Help:      "Histogram of response size for HTTP requests.",
				Buckets:   prometheus.ExponentialBuckets(100, 10, 8),
			},
			[]string{"code", "handler", "method"},
		),
	}

	return m
}

// instrumentHandler will update metrics for every http request
func (m *metrics) instrumentHandler(handlerName string, handler http.HandlerFunc) http.HandlerFunc {
	return promhttp.InstrumentHandlerDuration(
		m.requestDuration.MustCurryWith(prometheus.Labels{"handler": handlerName}),
		promhttp.InstrumentHandlerCounter(
			m.requestCounter.MustCurryWith(prometheus.Labels{"handler": handlerName}),
			promhttp.InstrumentHandlerRequestSize(
				m.requestSize.MustCurryWith(prometheus.Labels{"handler": handlerName}),
				promhttp.InstrumentHandlerResponseSize(
					m.responseSize.MustCurryWith(prometheus.Labels{"handler": handlerName}),
					handler,
				),
			),
		),
	)
}
17 changes: 13 additions & 4 deletions supernode/server/router.go
@@ -10,14 +10,16 @@ import (
	"github.com/dragonflyoss/Dragonfly/version"

	"github.com/gorilla/mux"
+	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// versionMatcher defines to parse version url path.
const versionMatcher = "/v{version:[0-9.]+}"

+var m = newMetrics()
+
func initRoute(s *Server) *mux.Router {
	r := mux.NewRouter()
-
	handlers := []*HandlerSpec{
		// system
		{Method: http.MethodGet, Path: "/_ping", HandlerFunc: s.ping},
@@ -35,13 +37,15 @@ func initRoute(s *Server) *mux.Router {
		{Method: http.MethodDelete, Path: "/peers/{id}", HandlerFunc: s.deRegisterPeer},
		{Method: http.MethodGet, Path: "/peers/{id}", HandlerFunc: s.getPeer},
		{Method: http.MethodGet, Path: "/peers", HandlerFunc: s.listPeers},
+
+		{Method: http.MethodGet, Path: "/metrics", HandlerFunc: handleMetrics},
	}

	// register API
	for _, h := range handlers {
		if h != nil {
-			r.Path(versionMatcher + h.Path).Methods(h.Method).Handler(filter(h.HandlerFunc, s))
-			r.Path(h.Path).Methods(h.Method).Handler(filter(h.HandlerFunc, s))
+			r.Path(versionMatcher + h.Path).Methods(h.Method).Handler(m.instrumentHandler(h.Path, filter(h.HandlerFunc)))
+			r.Path(h.Path).Methods(h.Method).Handler(m.instrumentHandler(h.Path, filter(h.HandlerFunc)))
		}
	}

@@ -51,7 +55,12 @@ func initRoute(s *Server) *mux.Router {
	return r
}

-func filter(handler Handler, s *Server) http.HandlerFunc {
+func handleMetrics(ctx context.Context, rw http.ResponseWriter, req *http.Request) (err error) {
+	promhttp.Handler().ServeHTTP(rw, req)
+	return nil
+}
+
+func filter(handler Handler) http.HandlerFunc {
	pctx := context.Background()

	return func(w http.ResponseWriter, req *http.Request) {