Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove logic that increases delay between progress updates over time #16221

Conversation

tom-cnops
Copy link
Contributor

@tom-cnops tom-cnops commented Sep 5, 2022

Removes a bit of undocumented behavior that increases the delay between progress updates the longer the build is running.
Potentially resolves #16119

Also renamed MAXIMAL_UPDATE_DELAY_MILLIS to MINIMAL_UPDATE_INTERVAL_MILLIS because it is used as a minimum and doesn't affect the delay between messages.

@tom-cnops
Copy link
Contributor Author

tom-cnops commented Sep 5, 2022

Regarding the build failures: it appears to me that the failing //src/test/java/com/google/devtools/build/lib/remote:remote doesn't touch the code I changed (it passess locally when I put System.exit(1) in the removed if-branch). But I don't have the exact system setup available for testing, so it would be great if someone more experienced could have a look.

Merging master seems to have resolved the failing test.

@sgowroji sgowroji added team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website awaiting-user-response Awaiting a response from the author labels Sep 6, 2022
@sgowroji sgowroji added awaiting-review PR is awaiting review from an assigned reviewer and removed awaiting-user-response Awaiting a response from the author labels Sep 8, 2022
@meteorcloudy meteorcloudy added team-Performance Issues for Performance teams and removed team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website labels Sep 20, 2022
@meisterT meisterT requested a review from zhengwei143 October 14, 2022 12:57
@larsrc-google
Copy link
Contributor

There was a good reason for the increasing time feature to be added in the first place:

"In the experimental UI, increase the rate limit for updates to the progress
bar over time, if it cannot be updated in place. In this way, we can get snappy
first progress descriptions, while not overwhelming the user with too many progress
messages."

Don't you think that reasoning still applies?

@tom-cnops
Copy link
Contributor Author

tom-cnops commented Oct 17, 2022

I read that as well, but I don't see how it still applies.

There now is --show_progress_rate_limit to set a minimum delay between updates and --progress_report_interval to set the scheduled interval between updates. With these settings, users can determine "snappy" and "amount of updates" respectively.

I don't see how secretly taking the floor of the specified progress interval and 15% of the runtime improves things. (I don't think anyone would feel overwhelmed if they ask for an update every 20 seconds and get an update every 20 seconds).

@larsrc-google
Copy link
Contributor

True, but the point here is that rather than the user having to figure out what flag to set to what value, it automatically notices the lack of cursor control and adapts in a way that gives more feedback at first, but tapers off for long builds. Your change is making the UI more manual.

@tom-cnops
Copy link
Contributor Author

If no flags are provided, the default is interval is 20 seconds. So for the first ~2 minutes (20/0.15 seconds), there is no difference in output with and without this change. So initial feedback is the same for the "default" case.

It is true that users who want fewer updates than the default for their long-running jobs would need to provide a flag where they didn't before.

However, I'd like to the point to the converse situation also: users who want to have more logging than the current implementation cannot even provide a flag to change the behavior, they have an incremental delay between messages no-matter what.

I'd say that the latter case is worse than the first case, also because I find it hard to think of many contexts where a log message every 20 seconds is acceptable at first, but disturbing later.

@zhengwei143
Copy link
Contributor

I'm more inclined to agree with Tom.

The current implementation takes this decision out of the hands of the user, and if this were really necessary, should be a separate flag.

I'll test and review this tomorrow.

@larsrc-google
Copy link
Contributor

With this change, the difference between progress_report_interval and show_progress_rate_limit becomes very subtle. I also think the documentation is quite confusing, and it certainly gives no clue to how the two interact. If you're going to rip out this functionality, please update the docs to reflect reality.

…ress-rate-increase-over-time

# Conflicts:
#	src/main/java/com/google/devtools/build/lib/runtime/UiEventHandler.java
@tom-cnops
Copy link
Contributor Author

When updating the docs I noticed that progress_report_interval is actually used by the ActionExecutionStatusReporter, and not by the UiEventHandler. The UiEventHandler used a hardcoded limit (1 second if there is cursor control, 20 seconds if there is no cursor control), to write progress messages.

This meant that the UiEventHandler and ActionExecutionStatusReporter were both causing updates to be written. This usually is usually no issue (even undectable if progress_report_interval is an integer multiple of 20), but running with --progress_report_interval=25 --curses=no writes progress reports at 20, 25, 45, 50, ... seconds. I think it makes more sense to not do time-based updates if there is no cursor control (the 1-second updates if there is cursor control are expected of course).

I've also tried to make the docs a bit more explicit.

@larsrc-google
Copy link
Contributor

It's even more confusing than that. The --progress_report_interval loop ignores (AFAICT) the curses-ness, but can update with 1-second intervals as long as there are finished actions to report on. If 0, it updates after 10/30/60 seconds when no actions finish.

@tom-cnops
Copy link
Contributor Author

Yes, the ActionExecutionStatusReporter is not impacted by curses-ness, so it will always fire progress events on the progress_report_interval (or 10/30/60/60 ... if it is 0).

In addition to that, the UiEventHandler has its own hardcoded interval (1s or 20s) to write time-based updates (which I propose to remove if there is no cursor control).

Also the UiEventHandler is rate limited (by show_progress_rate_limit), which suppresses all progress messages. This has always worked as expected I believe. This is one message per second if no curses.

The original reason for this PR is that minimalDelayMillis (the internal variable for show_progress_rate_limit) was floored with 15% of the runtime if there was no cursor control.

@tom-cnops tom-cnops requested review from zhengwei143 and removed request for gregestren and fweikert October 26, 2022 16:12
@zhengwei143 zhengwei143 added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels Oct 27, 2022
@zhengwei143 zhengwei143 removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Oct 27, 2022
@zhengwei143
Copy link
Contributor

Thanks for the contribution @tom-cnops!

@brentleyjones
Copy link
Contributor

@bazel-io flag

@bazel-io bazel-io added the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Oct 27, 2022
@ShreeM01
Copy link
Contributor

@bazel-io fork 6.0.0

@bazel-io bazel-io removed the potential release blocker Flagged by community members using "@bazel-io flag". Should be added to a release blocker milestone label Oct 28, 2022
ShreeM01 added a commit that referenced this pull request Oct 28, 2022
…16606)

Removes a bit of undocumented behavior that increases the delay between progress updates the longer the build is running.
Potentially resolves #16119

Also renamed `MAXIMAL_UPDATE_DELAY_MILLIS` to `MINIMAL_UPDATE_INTERVAL_MILLIS` because it is used as a minimum and doesn't affect the delay between messages.

Fixes #16119.

Closes #16221.

RELNOTES[INC]: Bazel no longer increases the delay between progress updates when there is no cursor control.

PiperOrigin-RevId: 484203851
Change-Id: I33f7302ca85b75135744eb6093b618196199a49f

Co-authored-by: Tom Cnops <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Performance Issues for Performance teams
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Progress updates become ever more scarce without terminal cursor control
8 participants