-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/build/cmd/gerritbot: flag common mistakes in GitHub PRs like no word wrap, wrong bug format, using markdown, ... #61573
Comments
There is some potential overlap with another CL I sent (CL 509135 for #61316), but to start, I suggest we discuss the two issues separately. (Later, we can resolve whether/how to combine the message from CL 509135 with this message, vs. if it is better to have separate messages, but might be good to first establish if there is any interest in proceeding with this issue). Also, there is some overlap with what could eventually be done with Tricium, but:
|
Change https://go.dev/cl/513397 mentions this issue: |
There is now an initial implementation in https://go.dev/cl/513397 (with a few small tweaks such that the CL hit % numbers are slightly different in the CL vs. the opening comment here). |
Change https://go.dev/cl/518679 mentions this issue: |
Change https://go.dev/cl/519796 mentions this issue: |
Use the more modern "<project>~<changeNumber>" format for a change and properly specify the first revision. While we are here, we add a few comments to the gerrit package to help avoid confusion in the future. This is a follow-up to CL 513397, which was failing to post the new PR rule findings for a test PR with error: 2023/08/15 22:06:38 processPullRequest: importGerritChangeFromPR(#71, nil): could not add findings comment to CL for #71: HTTP status 404 Not Found on request to https://go-review.googlesource.com/a/changes/If1a8ae9e4b76a05b13139ddf9fda1cdf67b50b33/revisions/build~master~If1a8ae9e4b76a05b13139ddf9fda1cdf67b50b33/review; Not found: build~master~If1a8ae9e4b76a05b13139ddf9fda1cdf67b50b33 Updates golang/go#61573 Change-Id: Ib0fef88ae05b10e623c2c1af614aa932dbfcc043 Reviewed-on: https://go-review.googlesource.com/c/build/+/519796 Reviewed-by: Dmitri Shuralyov <[email protected]> Auto-Submit: Dmitri Shuralyov <[email protected]> Reviewed-by: Heschi Kreinick <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Run-TryBot: t hepudds <[email protected]>
Change https://go.dev/cl/520116 mentions this issue: |
It looks like https://go.dev/cl/520268 might have been the first real GitHub PR where the new rules were triggred. The original commit message (on patch set 1) was:
The findings posted to the CL were:
From the timestamps, it looks the the contributor fixed those two problems within ~3 minutes, and then within ~10 minutes GerritBot imported the update back into Gerrit. The update commit message was:
...which addresses the two findings that were reported. |
Thanks for that analysis @thepudds. I want to comment on the "without a period" contributing to being a possible problem. It seems better to me to consider the period or no period as fine. That is, both https://go.dev/wiki/CommitMessage includes:
I usually include periods after "For #nnn." because I consider them to be short sentences. A recent CL by Russ https://go.dev/cl/506415 is another example that included a period in the Fixes line. |
…s comment Use the GitHub username (e.g., "thepudds") to greet the CL author when we report possible problems, rather than using their email. Updates golang/go#61573 Change-Id: I3699dc4e72c6244cd5d13493e5e4bd7cb098430a Reviewed-on: https://go-review.googlesource.com/c/build/+/520116 TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Heschi Kreinick <[email protected]> Auto-Submit: Dmitri Shuralyov <[email protected]> Run-TryBot: t hepudds <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]>
Hi @dmitshur, that makes sense. (That rule already allows different aliases like I'll circle back to this shortly to stop it from complaining about a trailing period. (I might stack it behind the corpus CL, in part to serve as an example of how the summary stats and whatnot change for the corpus after a small change). |
Also, before I did anything here, I skimmed a dozen or so random-ish CLs to see how reviews would stall, what seemed to be taking up human time, what types of review feedback could potentially be automated, and so on, and during that, I thought I saw ~1-2 times where someone on the core Go team asked someone to remove a trailing period from a In any event, to help make sure everyone is on the same page, I just tweaked the CommitMessage wiki page to say that a period is acceptable. Given this is policy-ish, please let me know if that's wrong, or correct my edit if needed. |
Change https://go.dev/cl/525976 mentions this issue: |
Change https://go.dev/cl/525975 mentions this issue: |
…after bug number Some contributors (including on the core Go team) prefer to end the bug number reference with a trailing period, like 'For #nnn.', so adjust the GitHub PR rules to allow that without complaint. The Contribution Guide does not use that style in its examples, but https://go.dev/wiki/CommitMessage was updated to mention it is allowed. Updates golang/go#61573 Change-Id: Idac4cef24bff22241dfc786684fe536def12f24f Reviewed-on: https://go-review.googlesource.com/c/build/+/525975 Reviewed-by: Dmitri Shuralyov <[email protected]> Reviewed-by: Heschi Kreinick <[email protected]> Auto-Submit: Dmitri Shuralyov <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]>
CL 521157 seems to be a recent example where a GitHub PR author did not log in to Gerrit after reading our new-ish automated GerritBot message, even though there are some hints that they might have wanted to reply but did not know to do so. (For example: they posted a question on the associated GitHub PR; after I manually gave them a link to the Gerrit login page, they then posted their first reply in Gerrit with a detailed question ~4 minutes after creating their Gerrit account; they previously had contributed a change that went through the entire process without them ever replying in Gerrit). Also, watching that interaction reminds us that automated or manual exhortations to click 'Done' are potentially confusing for someone who is not logged in (and hence does not see any 'Done' button at that moment). To better help new contributors such as this one, we make the automated rule violation message even more explicit about logging in, including supplying a link to the Gerrit login page again. We also switch to say "log in" rather than "register" to better evoke that it is not some heavyweight multi-step registration. (The Contribution Guide currently says "register" a few times as well as "create an account", but perhaps that could be adjusted in the future as well to sound a bit lighter weight). We also make the commit message editing advice more explicit, partly based on watching a different contributor in CL 520936 update the commit message via git in response to our prior automated advice. (In that CL, we then manually posted a dry run of our new advice, which did get them to edit the commit message successfully). While we are here, we attempt to make "I spotted" more likely to be visible when a contributor glances at a CL on their phone (by dropping our friendly hello), and correct the bug format example used when seeing an unknown repo. We also switch over to use go.dev/wiki instead of github.com/golang/go/wiki. Updates golang/go#61573 Change-Id: I5da57f093d575e260c2afdfcd8eff3df92c0a9d3 Reviewed-on: https://go-review.googlesource.com/c/build/+/525976 Auto-Submit: Heschi Kreinick <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Commit-Queue: t hepudds <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]> Reviewed-by: Heschi Kreinick <[email protected]> Run-TryBot: t hepudds <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]>
Change https://go.dev/cl/530736 mentions this issue: |
CL 525976 included an update of several messages to use go.dev/wiki shortlinks instead of the equivalent github.com/golang/go/wiki links, including to be more future-proof given the wiki content is likely to move in the near future. This included updating the URL used in the sync messages posted on the GitHub PR to notify a contributor of a new Gerrit comment. This in turn caused GerritBot to start re-posting what were effectively duplicate sync comments to various open GitHub PRs. This is because GerritBot's sync of Gerrit comments to the GitHub PR is nicely simple, and currently relies on historical sync messages on the PR exactly matching what the current incarnation of GerritBot would have posted for old Gerrit comments. If the sync message content changes, GerritBot thinks it has never posted about even old Gerrit comments because it cannot find an exact match on the PR, and hence posts the new sync message content about old Gerrit comments. This CL restores the content of the sync messages, and adds a comment to help future contributors. The other messages modified in CL 525976 were not affected in the same way because they are posted in reaction to an event of creating a new GitHub PR or an update to the PR. Those other messages have also recently been modified by other CLs without triggering reposts. CL 525976 was noticed as misbehaving within ~10 minutes of being deployed. Also within ~10 minutes of being deployed, GerritBot encountered a 403 error and stopped posting duplicates: 403 You have exceeded a secondary rate limit and have been temporarily blocked from content creation The deployment was rolled back ~10 minutes after that. Updates golang/go#61573 Change-Id: Ibc5d299494baaa26118916799eaed023e5ccb26c Reviewed-on: https://go-review.googlesource.com/c/build/+/530736 Reviewed-by: Heschi Kreinick <[email protected]> Auto-Submit: Heschi Kreinick <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
Change https://go.dev/cl/591455 mentions this issue: |
…ches This change is intended to avoid reporting suggestions that are intended for development branches, like: The commit title should start with the primary affected package name followed by a colon, like "net/http: improve [...]" On backport CLs on release and internal branches, such as: "[release-branch.go1.22] os/exec: on Windows look for [...]" Since their possible mistakes don't quite follow the same rules. For golang/go#61573. Change-Id: I36c2317a999768eb9cfce6412278d0ed1c82da72 GitHub-Last-Rev: d4f6ba3 GitHub-Pull-Request: #97 Reviewed-on: https://go-review.googlesource.com/c/build/+/591455 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Michael Pratt <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]> Auto-Submit: Dmitri Shuralyov <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]>
#61182 starts with an enumeration of friction for the GitHub PR workflow.
We could reduce some friction by automatically flagging common PR problems to save maintainer time and give faster feedback to a potential contributor.
If a maintainer sees that someone who sent a GitHub PR has responded properly in Gerrit to automated feedback, that is a signal that more detailed feedback from a human reviewer is more likely to be acted upon, which might help a maintainer triage which CLs get their attention when. (One of the worst issues listed in #61182 is that for some percent of GitHub PRs, the author never responds to feedback in Gerrit).
I am about to send a WIP CL for discussion. It started as a way to flag improper word wrapping (#24832), but once we can flag one problem, it was a small-ish step to add a loop, define a simple rule func signature, and then define a set of small rules.
Current rule names
Most of those rule definitions are small -- the median rule body is 5 lines of code (an if statement + 2 lines).
We could add more rules in the future, such as flagging raw benchmark output (suggesting benchstat), asking about the seeming absence of a test, flagging an incorrectly formatted CL reference, possibly checking gofmt, and so on.
Sample results
Running those rules against a corpus of 1000 recent CLs that started as GitHub PRs gives the following stats:
That corpus is the 1000 most recent PRs that were ultimately merged. The results here are for the commit message in the first patch set (prior to any human feedback).
Philosophy
The rules mostly err on the side of reducing false positives, though some are unavoidable. We attempt to mitigate false positives by stating in the message we post to the CL that the findings are "potential problems" based on "simple heuristics", and we ask the contributor to reply on the CL if a finding appears wrong. Part of the intent is to engage a first time contributor and get them using Gerrit, but without annoying them too much.
We also attempt to point towards auxiliary info and advice specific to the rule violation. If a rule triggers for a potential problem related to a commit message, we also include links to the contributing guide Good commit messages section and instructions on how to edit the commit message. (A current problem is new contributors don't always know how to edit the commit message after a human asks them to change it, which leads to extra review iterations).
There is also some simple logic to be more lenient for PRs that appear to be trivial changes (e.g., don't complain about a missing bug reference if the title contains things like "typo" or "grammar"). Where we have less certainty (e.g., identifying markdown use), we attempt to express that uncertainty in the resulting message shown to the user. The rule definition also allows us to skip certain rules for some repos (e.g., don't complain in the proposal repo about a missing package).
CC @heschi, @cagedmantis
The text was updated successfully, but these errors were encountered: