Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(metrics): Add Dgraph txn metrics (commits and discards). #7339

Merged
merged 13 commits into from
Jan 25, 2021

Conversation

danielmai
Copy link
Contributor

@danielmai danielmai commented Jan 19, 2021

In #6171 the metric to track txn aborts was added. This PR adds metrics for txn commits and txn discards:

  • dgraph_txn_commits_total
  • dgraph_txn_discards_total

dgraph_txn_commits_total is incremented on successful commits.

dgraph_txn_discards_total is incremented when the client calls txn.Discard. Calling txn.Discard is also considered a txn abort, so dgraph_txn_aborts_total is also incremented (that's already the behavior even before this change). As of #7365 dgraph_txn_discards_total tracks the number of client-called Discard calls. dgraph_txn_aborts_total tracks the number of txn aborts made by the server.

Changes to dgraph_txn metrics (including the existing dgraph_txn_aborts_total metric):

When the method tag is exported it's possible to get split metrics like
the following:

dgraph_txn_aborts_total{method="",status=""} 2
dgraph_txn_aborts_total{method="Server.Mutate",status=""} 1
dgraph_txn_commits_total{method="",status=""} 3
dgraph_txn_commits_total{method="Server.Mutate",status=""} 13
dgraph_txn_discards_total{method="",status=""} 1

When setting "CommitNow" in a mutation, the txn commit and abort metrics
are tracked under method="Server.Mutate". When the mutate and commit are
separate calls, the metric is recorded under method="".

This is confusing. It's clearer to bucket these metrics together, so for
the dgraph_txn metrics we don't set those tags in the stats view:

dgraph_txn_aborts_total 3
dgraph_txn_commits_total 16
dgraph_txn_discards_total 1

Fixes DGRAPH-1137


This change is Reviewable

When the method tag is exported it's possible to get split metrics like
the following:

    dgraph_txn_aborts_total{method="",status=""} 2
    dgraph_txn_aborts_total{method="Server.Mutate",status=""} 1
    dgraph_txn_commits_total{method="",status=""} 3
    dgraph_txn_commits_total{method="Server.Mutate",status=""} 13
    dgraph_txn_discards_total{method="",status=""} 1

When setting "CommitNow" in a mutation, the txn commit and abort metrics
are tracked under method="Server.Mutate". When the mutate and commit are
separate calls, the metric is recorded under method="".

This is confusing. It's clearer to bucket these metrics together, so for
the dgraph_txn metrics we don't set those tags in the stats view.
@netlify
Copy link

netlify bot commented Jan 25, 2021

Deploy preview for dgraph-docs ready!

Built with commit 71222ad

https://deploy-preview-7339--dgraph-docs.netlify.app

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 3 files at r8, 3 of 3 files at r10, 1 of 1 files at r11.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @danielmai and @vvbalaji-dgraph)


worker/draft.go, line 1770 at r11 (raw file):

func (n *node) monitorRaftMetrics() {
	ticker := time.NewTicker(5 * time.Second)

revert.


x/metrics.go, line 293 at r11 (raw file):

			TagKeys:     allTagKeys,
		},
    // Raft metrics

the comment is not aligned.

@karlmcguire karlmcguire merged commit 34a96aa into master Jan 25, 2021
@karlmcguire karlmcguire deleted the danielmai/metrics-txn-commit branch January 25, 2021 19:34
danielmai added a commit that referenced this pull request Jan 25, 2021
#7365)

This PR is a follow-up to #7339 to update the way
dgraph_txn_discards_total and dgraph_txn_aborts_total are
tracked. When the client calls Discard, only the
dgraph_txn_discards_total metric is incremented, not
dgraph_txn_aborts_total. This makes it easy to see:

1. If the client is calling Discard
2. If the server is aborting transactions (e.g., because of
   transaction conflicts)

Additional changes:

* Update metrics description text
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants