Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(metrics): Add Raft leadership metrics. #7338

Merged
merged 17 commits into from
Jan 25, 2021
Merged

Conversation

danielmai
Copy link
Contributor

@danielmai danielmai commented Jan 19, 2021

This adds Raft leadership metrics for both Alphas and Zeros.

  • dgraph_raft_has_leader
  • dgraph_raft_is_leader
  • dgraph_raft_leader_changes_total

Changes:

  • Add group label to each Raft metric. For example, this makes it easy to see from the metric dgraph_raft_is_leader{group="1",instance="alpha1:8180"} 1 that alpha1 is the leader of group 1.

This change is Reviewable

Copy link
Contributor Author

@danielmai danielmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 4 unresolved discussions (waiting on @manishrjain and @vvbalaji-dgraph)


dgraph/cmd/zero/raft.go, line 870 at r2 (raw file):

				}
				if rd.SoftState.Lead != raft.None {
					ostats.Record(context.Background(), x.RaftHasLeader.M(1))

Add group="0" tags for Zero too, so PromQL queries can find Raft metrics based on group 0 easily. e.g.,

dgraph_raft_leader_changes_seen_total{group="0"}
dgraph_raft_leader_changes_seen_total{group="1"}
dgraph_raft_leader_changes_seen_total{group="2"}
...

worker/draft.go, line 1063 at r2 (raw file):

	// lastLead is for detecting leadership changes
	lastLead := uint64(math.MaxUint64)

You can already get the current leadership information and compare that to what's received in the soft state. You can see how the leaderChanges metric is tracked in etcd raft: https://github.com/etcd-io/etcd/blob/01844fd2856016c488fd0f8974252a0070f277ae/server/etcdserver/raft.go#L178-L181


worker/draft_test.go, line 118 at r2 (raw file):

}

func TestRaftMetrics(t *testing.T) {

You can remove this test and add the metrics to TestMetrics, so that we can verify that the metrics are getting exposed properly with the correct names.


x/metrics.go, line 121 at r2 (raw file):

RaftLeaderChangesSeenTotal

Rename this to RaftLeaderChanges. And rename the metric to raft_leader_changes_total.

@karlmcguire
Copy link
Contributor

I looked around for something similar to etcd's raftReadyHandler.getLeader() function that stores the "current" or "previous" leader, but we don't have anything like that. So, I left the lastLeader variable in the function and added a note. Let me know if that's not sufficient.

Copy link
Contributor Author

@danielmai danielmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 5 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @manishrjain and @vvbalaji-dgraph)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants