Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Fixes PoV over-estimation #13766

Merged
merged 11 commits into from
Apr 13, 2023
Merged

Fixes PoV over-estimation #13766

merged 11 commits into from
Apr 13, 2023

Conversation

ggwpez
Copy link
Member

@ggwpez ggwpez commented Mar 30, 2023

Closes #13765

(This is almost right but not totally right, because the values we're making regression analysis on are not per-key benchmark results but originated from benchmark results for the all the keys and then adjusted for a single key, see below for details)

Yep, this is in here as follow-up #11637. Currently we dont have all the data, so it tries to extrapolate.

TODOs:

  • Update weights (running on bm2 right now)

ggwpez added 5 commits March 30, 2023 13:59
Signed-off-by: Oliver Tale-Yazdi <[email protected]>
Signed-off-by: Oliver Tale-Yazdi <[email protected]>
Signed-off-by: Oliver Tale-Yazdi <[email protected]>
Signed-off-by: Oliver Tale-Yazdi <[email protected]>
@ggwpez ggwpez added the A3-in_progress Pull request is in progress. No review needed at this stage. label Mar 30, 2023
Copy link
Contributor

@agryaznov agryaznov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fast-fix!
This should indeed remove excessive over-estimation in its current state.

(This is almost right but not totally right, because the values we're making regression analysis on are not per-key benchmark results but originated from benchmark results for the all the keys and then adjusted for a single key, see below for details)

Yep, this is in here as follow-up #11637. Currently we dont have all the data, so it tries to extrapolate.

Taking just maximum over all keys is still imprecise because it counts for the pov_overhead added by a single key (okay maybe from two: one for base other for slope), and doesn't count for those from all other keys. That being said, the number of key reads most likely correlates with complexity parameter(s), hence to make a proper approximation we'd need to benchmark per-key, which is currently not implemented. Is that right?

@ggwpez
Copy link
Member Author

ggwpez commented Mar 30, 2023

Taking just maximum over all keys is still imprecise because it counts for the pov_overhead added by a single key (okay maybe from two: one for base other for slope), and doesn't count for those from all other keys. That being said, the number of key reads most likely correlates with complexity parameter(s), hence to make a proper approximation we'd need to benchmark per-key, which is currently not implemented. Is that right?

I dont quite get how you mean the first part. We have to count overhead of every key once. So if two keys are read, then we have to count the overhead twice since in the worst case these keys dont share any common nodes in the proof.

About the second part: yes. Currently we just measure the proof size of the whole thing (per component). It would be better to reset the proof recorder after every key access to track the size of each key. But in this case we would only tweak the result downward, hence why I did not prioritize it.
In the end we want to know how often each key is read in correlation to each complexity parameter. So we need a map of proof sizes like (KEY, PARAM) => POV-SIZE.

@agryaznov
Copy link
Contributor

We have to count overhead of every key once. So if two keys are read, then we have to count the overhead twice since in the worst case these keys dont share any common nodes in the proof.

This is not what happening in the current implementation. We currently calculate the overhead for a key for each benchmark, add it to proof_size of each benchmark, then approximate over those results ending up in base and slope for a single key. Then repeat this for all other keys. Therefore we end up having maximum base among all keys, but it could be larger if it had counted overhead for more than single key. But it hadn't, it counts overhead only for a single key. Same for slopes.

@ggwpez
Copy link
Member Author

ggwpez commented Apr 6, 2023

bot bench $ all

@command-bot
Copy link

command-bot bot commented Apr 6, 2023

@ggwpez https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/2652740 was started for your command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" all. Check out https://gitlab.parity.io/parity/mirrors/substrate/-/pipelines?page=1&scope=all&username=group_605_bot to know what else is being executed currently.

Comment bot cancel 9-8930a859-91da-4b23-8637-e7fbcfa9125e to cancel this command or bot cancel to cancel all commands in this pull request.

@command-bot
Copy link

command-bot bot commented Apr 6, 2023

@ggwpez Command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" all has finished. Result: https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/2652740 has finished. If any artifacts were generated, you can download them from https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/2652740/artifacts/download.

Signed-off-by: Oliver Tale-Yazdi <[email protected]>
@ggwpez ggwpez requested a review from a team April 11, 2023 09:52
@ggwpez ggwpez requested a review from athei as a code owner April 11, 2023 09:52
@ggwpez
Copy link
Member Author

ggwpez commented Apr 11, 2023

@agryaznov the proof size comparison is here.
And downloaded as PDF SWC-Web.pdf.
The values all went down, also the most offending function now has one byte per c:
image

@ggwpez ggwpez added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Apr 11, 2023
@agryaznov
Copy link
Contributor

The values all went down, also the most offending function now has one byte per c: image

Yes, this x5 proof_size weight was the origin why we had to look into benchmarks-cli closer. Now it looks sane, thanks!

Copy link
Contributor

@agryaznov agryaznov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay as a temp hotfix until https://github.com/paritytech/substrate/issues/13808 is done.

ggwpez added 3 commits April 12, 2023 17:01
Signed-off-by: Oliver Tale-Yazdi <[email protected]>
Fucked up the merge for this file...

Signed-off-by: Oliver Tale-Yazdi <[email protected]>
Signed-off-by: Oliver Tale-Yazdi <[email protected]>
@ggwpez
Copy link
Member Author

ggwpez commented Apr 13, 2023

I had to re-bench the contracts pallet since there was some sr25519 function added, but looks fine.
image

bot merge

@paritytech-processbot paritytech-processbot bot merged commit be58e48 into master Apr 13, 2023
@paritytech-processbot paritytech-processbot bot deleted the oty-fix-pov-bench branch April 13, 2023 13:12
nathanwhit pushed a commit to nathanwhit/substrate that referenced this pull request Jul 19, 2023
* Align log

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

* Use max instead of sum

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

* Make comment ordering deterministic

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

* Dont add Pov overhead when all is ignored

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

* Update test pallet weights

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

* Re-run weights on bm2

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

* Fix test

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

* Actually use new weights

Fucked up the merge for this file...

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

* Update contract weights

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

---------

Signed-off-by: Oliver Tale-Yazdi <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit
Projects
Development

Successfully merging this pull request may close these issues.

[benchmark-cli] Wrong proof_size calculation leading to exaggerated weights
3 participants