investigate: getState related all regenFns's metric show empty data plus getPreState regenFn job metrics #3120

Closed
g11tech opened this issue Sep 13, 2021 · 7 comments
Assignees
Labels
prio-medium Resolve this some time soon (tm). scope-metrics All issues with regards to the exposed metrics.

Comments

@g11tech
Contributor

g11tech commented Sep 13, 2021

image

image

@g11tech g11tech self-assigned this Sep 13, 2021
@q9f q9f added mod4-api prio-medium Resolve this some time soon (tm). scope-metrics All issues with regards to the exposed metrics. and removed scopeb-ci labels Sep 14, 2021
@g11tech
Contributor Author

g11tech commented Sep 18, 2021

At a cursory glance, it seems the queued getState is never called (it is our entry point for starting any regen metric).
However, there is a getState in api/impl/beacon/state/index.ts that in turn calls resolveStateId:

resolveStateId either picks the state from stateCache or stateArchive, or constructs it from stateArchive.
It seems getState is not being called from outside.

@g11tech
Contributor Author

g11tech commented Sep 18, 2021

It seems the getPreState computation never gets queued and is resolved from the cache only.
It does get called from processBlocksInEpoch, but in regen/queued.ts it is resolved using the caches only and never goes further.
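
To illustrate why a cache-only resolution would leave the job metrics empty, here is a minimal sketch of a cache-first wrapper. Everything in it (`QueuedRegenSketch`, `queuedJobs`, the string keys, the synchronous `regen` callback) is a hypothetical stand-in for illustration and not the actual code in regen/queued.ts; a real job queue would also be asynchronous.

```typescript
// Hypothetical, simplified model of a cache-first regen wrapper.
// None of these names are taken from the real code base.
type State = {slot: number};

class QueuedRegenSketch {
  // Stand-in for a regen job metric: only incremented when work is queued.
  queuedJobs = 0;
  private readonly cache = new Map<string, State>();
  private readonly regen: (key: string) => State;

  constructor(regen: (key: string) => State) {
    this.regen = regen;
  }

  addToCache(key: string, state: State): void {
    this.cache.set(key, state);
  }

  getPreState(key: string): State {
    const cached = this.cache.get(key);
    if (cached !== undefined) {
      // Fast path: resolved from the cache, nothing is queued,
      // so the job metric never sees this request.
      return cached;
    }
    // Slow path: only here would a job be queued and measured.
    this.queuedJobs++;
    const state = this.regen(key);
    this.cache.set(key, state);
    return state;
  }
}

// Regeneration is modelled as a trivial function for the sketch.
const regenSketch = new QueuedRegenSketch(() => ({slot: 0}));
regenSketch.addToCache("blockA", {slot: 1});

regenSketch.getPreState("blockA"); // cache hit, no job queued
const jobsAfterHit = regenSketch.queuedJobs;

regenSketch.getPreState("blockB"); // cache miss, one job queued
const jobsAfterMiss = regenSketch.queuedJobs;
```

As long as every request takes the fast path, `queuedJobs` stays at zero, which would match panels that show no data.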

Here is an attempt to look up any observation across the last 30 days on a node running Prater:
image

@g11tech
Contributor Author

g11tech commented Sep 18, 2021

@dapplion should these metrics panels be removed from regenFn Stats on the metrics dashboard? Or should I aggregate all these metrics into a single widget (multiple targets), so that if something shows up someday we can dig deeper?
At least we can remove the crowding from regenFn Stats.

@dapplion
Contributor

@dapplion should these metrics panels be removed from regenFn Stats on the metrics dashboard? Or should I aggregate all these metrics into a single widget (multiple targets), so that if something shows up someday we can dig deeper?
At least we can remove the crowding from regenFn Stats.

Sounds good. Do a PR with screen captures and let's see how it looks. If you can arrange all panels in a 2-per-row format, that would be great; there are still 2 full-row-width panels.

@g11tech
Contributor Author

g11tech commented Sep 25, 2021

The getPreState graphs now show up when the node is experiencing the sync failure, which means these graphs can be retained.

image

I will collapse only the getState graphs (apart from also resizing the full-width graphs to half width).

@dapplion
Contributor

@g11tech Why is the cache hit ratio negative sometimes?

@g11tech
Contributor Author

g11tech commented Sep 27, 2021

Because it's an approximation: we skipped counting the queued requests directly and derive that count from queued processing instead. So assume that at second 1 we get 10 requests, which get queued, and at second 2 the queue processes those 10; then delta(total) - delta(queued processed) creates these artifacts.

The better solution would be to count queued in the same function scope where we count total, so that there is no drift when we compute delta(total) - delta(queued). Let me know if you would like me to make this change.
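
The 10-request example can be worked through in a small sketch. The `Sample` shape and its field names are assumptions made up for this illustration, not the actual metric names; the numbers are the ones from the example (10 requests queued at second 1, processed at second 2).

```typescript
// Hypothetical cumulative counters sampled once per second.
interface Sample {
  total: number; // all requests received so far
  queuedProcessed: number; // queued jobs that have finished processing so far
  queuedAtEntry: number; // requests counted as queued in the same scope as total
}

// Current approximation: cache hits = delta(total) - delta(queued processed).
function approxCacheHits(prev: Sample, curr: Sample): number {
  return curr.total - prev.total - (curr.queuedProcessed - prev.queuedProcessed);
}

// Proposed fix: cache hits = delta(total) - delta(queued), both counted at entry.
function driftFreeCacheHits(prev: Sample, curr: Sample): number {
  return curr.total - prev.total - (curr.queuedAtEntry - prev.queuedAtEntry);
}

const t0: Sample = {total: 0, queuedProcessed: 0, queuedAtEntry: 0};
// Second 1: 10 requests arrive and all are queued, none processed yet.
const t1: Sample = {total: 10, queuedProcessed: 0, queuedAtEntry: 10};
// Second 2: no new requests, the queue processes those 10.
const t2: Sample = {total: 10, queuedProcessed: 10, queuedAtEntry: 10};

const approxSecond1 = approxCacheHits(t0, t1); // 10: overstates cache hits
const approxSecond2 = approxCacheHits(t1, t2); // -10: the negative artifact
const fixedSecond1 = driftFreeCacheHits(t0, t1); // 0
const fixedSecond2 = driftFreeCacheHits(t1, t2); // 0
```

Over the two seconds the approximation still sums to zero cache hits, so the error is only in which interval the queued requests are attributed to; counting queued at entry removes that drift per interval.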

3 participants