Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly initialize metrics in rpcchainVM #3477

Merged
merged 1 commit into from
Oct 16, 2024

Conversation

yacovm
Copy link
Contributor

@yacovm yacovm commented Oct 16, 2024

Why this should be merged

The RPC chain VM gRPC dispatcher initializes the metrics at Initialize() but it doesn't need any input from the Initialize method's parameters. It's better to initialize the metrics at the construction of the object.

This is needed because currently, if the gRPC service is requested to dispatch a gathering of the metrics, it might be that the metrics in the backend is yet to be initialized, and this is a problem.

How this works

Just a refactoring.

How this was tested

CI

Copy link
Collaborator

@aaronbuchwald aaronbuchwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one nit, then this LGTM

Comment on lines 89 to 100
pluginMetrics := metrics.NewPrefixGatherer()

bVM, _ := vm.(block.BuildBlockWithContextChainVM)
ssVM, _ := vm.(block.StateSyncableVM)
return &VMServer{
vmSrv := &VMServer{
vm: vm,
bVM: bVM,
ssVM: ssVM,
allowShutdown: allowShutdown,
}

vmSrv.metrics = pluginMetrics
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we switch to just inlining this for the smallest diff? ie. add metrics: metrics.NewPrefixGatherer() as the next line in the struct definition?

bVM, _ := vm.(block.BuildBlockWithContextChainVM)
ssVM, _ := vm.(block.StateSyncableVM)
return &VMServer{
vmSrv := &VMServer{
metrics: pluginMetrics,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could just be metrics: metrics.NewPrefixGatherer()

@StephenButtolph StephenButtolph added this pull request to the merge queue Oct 16, 2024
Merged via the queue into ava-labs:master with commit d81292a Oct 16, 2024
23 checks passed
@aaronbuchwald
Copy link
Collaborator

This fixes a race condition while initializing an instance of the RPCChainVM that triggered a panic in AvalancheGo prior to this PR:

/go/pkg/mod/github.com/ava-labs/[email protected]/vms/rpcchainvm/vm_server.go:563 +0x1a

The path that triggered this bug is registering VMClient (reference held by the chain manager) as a gatherer here during vm.Initialize(...): https://github.com/ava-labs/avalanchego/blob/v1.11.11/vms/rpcchainvm/vm_client.go#L159

This is called before calling Initialize on the VM server with all of the required arguments from the vm client: https://github.com/ava-labs/avalanchego/blob/v1.11.11/vms/rpcchainvm/vm_client.go#L211. This is before the VM server would have initialized its own metrics field: https://github.com/ava-labs/avalanchego/blob/v1.11.11/vms/rpcchainvm/vm_server.go#L137.

This can trigger a panic because Gather can be called at any time, which can call Gather on the VMClient as soon as it registers itself and in the middle of vm.Initialize(...).

@StephenButtolph StephenButtolph added bug Something isn't working incident response labels Oct 23, 2024
@StephenButtolph StephenButtolph added this to the v1.11.13 milestone Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working incident response
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants