This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Create hostperfcheck subcommand #4196

Closed
drahnr opened this issue Nov 1, 2021 · 4 comments · Fixed by #4342
Labels
J0-enhancement An additional feature request. T4-parachains_engineering This PR/Issue is related to Parachains performance, stability, maintenance.

Comments

Contributor

drahnr commented Nov 1, 2021

Problem outline:

Currently there is no way of knowing whether a machine is powerful enough to run a validator, especially in the presence of disputes, which cause additional load on top of the regular block production scheme.

PVF compilation is the most expensive component here: PVF execution can be guaranteed not to time out, whereas a compilation that times out results in disputes not being voted on.

Proposal:

Add a subcommand that executes an example Wasm compilation and measures the time, so that validator operators know whether their setup is performant enough.
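As a rough illustration of the proposal above, here is a minimal sketch of the timing logic such a subcommand could wrap. The names `host_perf_check` and `CheckResult` are hypothetical, the `workload` closure is only a stand-in for the example Wasm compilation, and the time limit is an assumption supplied by the caller. None of this is taken from the actual implementation.

```rust
use std::time::{Duration, Instant};

/// Outcome of a single host performance check (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum CheckResult {
    /// The workload finished within the limit.
    Sufficient(Duration),
    /// The workload took longer than the limit.
    Insufficient { elapsed: Duration, limit: Duration },
}

/// Runs `workload` once and compares its wall-clock time against `limit`.
///
/// Assumption: `workload` stands in for the example Wasm compilation;
/// this sketch does not compile any Wasm itself.
pub fn host_perf_check<F: FnOnce()>(workload: F, limit: Duration) -> CheckResult {
    let start = Instant::now();
    workload();
    let elapsed = start.elapsed();

    if elapsed <= limit {
        CheckResult::Sufficient(elapsed)
    } else {
        CheckResult::Insufficient { elapsed, limit }
    }
}
```

A caller would pass the real compilation step as the closure and decide what to do with an `Insufficient` result, for example printing the measured time and exiting with a non-zero status.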

Consideration:

Running this check by default on the first run, and bailing out if the result is insufficient, might make sense.

@drahnr drahnr self-assigned this Nov 1, 2021
Contributor Author

drahnr commented Nov 1, 2021

CC @ordian @sandreim

@drahnr drahnr added I7-documentation Documentation needs fixing, improving or augmenting. J0-enhancement An additional feature request. and removed I7-documentation Documentation needs fixing, improving or augmenting. labels Nov 1, 2021
@drahnr drahnr changed the title Create node_perf_check subcommand Create hostperfcheck subcommand Nov 1, 2021
@drahnr drahnr mentioned this issue Nov 1, 2021
9 tasks
Contributor

pepyakin commented Nov 4, 2021

I actually had a talk with @bkchr and we had an idea to notify the operator in case there are too many PVF preparation failures.

I think that would be a better approach than a one-off benchmark. Cloud VM performance can change over time. For example, a heavy workload may be running on the same host, or the VM may be allowed to consume "boost credits" to accommodate bursts in the workload, but after some amount of intensive work it is throttled back.

Contributor Author

drahnr commented Nov 4, 2021

Fair point. I still think it would be advantageous to get a green/orange/red indicator of the host performance at the first start or on explicit call, even if there are inaccuracies. Setting the bar for a green-light result higher than we'd expect to need could remedy operational pains.
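The green/orange/red idea could be sketched roughly as follows. The `Rating` enum, the `rate` function and both limits are hypothetical; the thread does not specify any concrete thresholds. The only point illustrated is that the green limit is deliberately stricter than the limit that actually matters.

```rust
use std::time::Duration;

/// Traffic-light rating of a measured duration (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Rating {
    Green,
    Orange,
    Red,
}

/// Classifies `elapsed` against two caller-supplied limits.
///
/// Assumption: `green_limit` is set stricter (smaller) than `hard_limit`,
/// so that a green result leaves headroom for later slowdowns.
pub fn rate(elapsed: Duration, green_limit: Duration, hard_limit: Duration) -> Rating {
    if elapsed <= green_limit {
        Rating::Green
    } else if elapsed <= hard_limit {
        Rating::Orange
    } else {
        Rating::Red
    }
}
```

With this shape, an orange result means the host passed the check but without the safety margin, which is the case where an operator may still want to investigate.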

I fully agree that we should also have a metric(?) to notify about repeated wasm compilation timeouts.

Contributor

pepyakin commented Nov 4, 2021

FWIW, there is a metric that measures preparation times (which came in useful recently). The thing we were thinking about is emitting a warning message once (or every now and then, but not too often) saying that within a recent period of time there were too many of those preparation timeouts.
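One way to picture the "warn now and then, but not too often" behaviour is a sliding window of recent timeouts combined with a cooldown between warnings. The sketch below is only an illustration under that assumption: `TimeoutMonitor`, its fields and all parameters are hypothetical and not part of the node's code.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Tracks recent preparation timeouts and rate-limits warnings (hypothetical type).
pub struct TimeoutMonitor {
    window: Duration,    // how far back timeouts are counted
    threshold: usize,    // timeouts within `window` that trigger a warning
    cooldown: Duration,  // minimum gap between two warnings
    timeouts: VecDeque<Instant>,
    last_warned: Option<Instant>,
}

impl TimeoutMonitor {
    pub fn new(window: Duration, threshold: usize, cooldown: Duration) -> Self {
        Self { window, threshold, cooldown, timeouts: VecDeque::new(), last_warned: None }
    }

    /// Records a timeout observed at `now` and returns `true` if the caller
    /// should emit a warning message.
    pub fn record_timeout(&mut self, now: Instant) -> bool {
        self.timeouts.push_back(now);

        // Drop timeouts that have fallen out of the window.
        while let Some(&oldest) = self.timeouts.front() {
            if now.duration_since(oldest) > self.window {
                self.timeouts.pop_front();
            } else {
                break;
            }
        }

        if self.timeouts.len() < self.threshold {
            return false;
        }

        // Only warn if no warning was emitted during the cooldown period.
        let cooled_down = self
            .last_warned
            .map_or(true, |t| now.duration_since(t) >= self.cooldown);
        if cooled_down {
            self.last_warned = Some(now);
        }
        cooled_down
    }
}
```

Passing `now` in explicitly keeps the logic easy to test; the caller would log the warning whenever `record_timeout` returns `true`.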

@ordian ordian added the T4-parachains_engineering This PR/Issue is related to Parachains performance, stability, maintenance. label Aug 16, 2022