Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(validator): listen and get metadata only for committee validators #1787

Merged
merged 18 commits into from
Dec 17, 2024

Conversation

nkryuchkov
Copy link
Contributor

@nkryuchkov nkryuchkov commented Oct 10, 2024

Closes #1703

Note: each selfSubnets call takes approx. 300-900 µs

@nkryuchkov nkryuchkov requested a review from y0sher October 10, 2024 23:42
Copy link
Contributor

@y0sher y0sher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this pr is still immature. quoting the issue here to clarify my original intentions -

- Consider to pull metadata from beacon node only about validators that are part of your listened subnets.
- Reject/ignore all messages that are not part of your subnet validators (this might happen automatically since you won't have their status and duties)

So about first point I see you started it, but your using the ValidatorSubnet calculation which is pre-alan way to find your subnets. We now use CommiteeSubnet and have a different way of dividing our validators to subnets (according to committees).

About second point I don't see that you did anything about message validation, though if we don't have any metadata about this operator's validators than we might just naturally reject/ignore it? please make sure what happens in this case.

@nkryuchkov
Copy link
Contributor Author

nkryuchkov commented Oct 16, 2024

So about first point I see you started it, but your using the ValidatorSubnet calculation which is pre-alan way to find your subnets. We now use CommiteeSubnet and have a different way of dividing our validators to subnets (according to committees).

Added post-fork handling, please review again

About second point I don't see that you did anything about message validation, though if we don't have any metadata about this operator's validators than we might just naturally reject/ignore it? please make sure what happens in this case.

In this case the message gets ignored:

	ErrNoShareMetadata                         = Error{text: "share has no metadata"}

@nkryuchkov nkryuchkov requested a review from y0sher October 16, 2024 01:32
Copy link
Contributor

@y0sher y0sher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Remove prefork code, as I explained its not needed anymore
  • to make this PR final we need to POC that it works on stage
  • Compare cpu mem usage
  • compare how this impacts the overload on beacon node
  • We probably need to ignore messages that we don't have duties or its validator we don't see as active (because they're not ours). instead of rejecting them.
  • @MatheusFranco99 can you chime in in terms of protocol and networking, other than more well spread messages, any reason why ignoring these message will impact us badly?

@MatheusFranco99
Copy link
Contributor

@MatheusFranco99 can you chime in in terms of protocol and networking, other than more well-spread messages, any reason why ignoring these messages will impact us badly?

@y0sher
Do you mean ignoring messages from validators that we think are liquidated / not active / exited or whatever similar? I think it's not a problem to ignore instead of reject. We also have ignoring protection against an attack so we should be fine.

But we must not do this if you meant ignoring messages about validators that we are not part of the committee.

@nkryuchkov nkryuchkov marked this pull request as ready for review October 16, 2024 23:42
@nkryuchkov
Copy link
Contributor Author

  • Remove prefork code, as I explained its not needed anymore

done

  • to make this PR final we need to POC that it works on stage

I'm still testing it but so far it seems to be working well

image
  • Compare cpu mem usage

It was deployed between 23:46 and 00:28 and since 01:13

CPU dropped from ~2 to ~0.4-0.5:

image

Memory might have dropped a little bit but the difference is not that huge, also it differs between runs:

image

Receive bandwidth dropped from ~1.3 to ~0.4-0.5, transmit bandwidth just a little bit:

image

Although, for another node the transmit bandwidth difference is significant:

image
  • compare how this impacts the overload on beacon node

I don't see any difference

CPU:

image

Mem:

image
  • We probably need to ignore messages that we don't have duties or its validator we don't see as active (because they're not ours). instead of rejecting them.

Well, we ignore messages if there's no metadata, so if they're not ours, we should have no metadata for them, hence the message is ignored

@nkryuchkov nkryuchkov requested a review from y0sher October 17, 2024 01:34
@MatheusFranco99
Copy link
Contributor

Well, we ignore messages if there's no metadata, so if they're not ours, we should have no metadata for them, hence the message is ignored

@nkryuchkov
When you say "if they're not ours", you mean non SSV validators, right? Or do you mean non-committee validators? Or validators from topics we're not subscribed to? 😅
I think that's the part I got a bit confused

Great job!

@nkryuchkov
Copy link
Contributor Author

@MatheusFranco99 sorry for the confusion, I meant validators from topics we're not subscribed to

@nkryuchkov nkryuchkov requested a review from MatusKysel October 17, 2024 12:38
Copy link
Contributor

@MatusKysel MatusKysel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

for _, share := range shares {
if c.operatorDataStore.GetOperatorID() != 0 && share.BelongsToOperator(c.operatorDataStore.GetOperatorID()) {
ownShares = append(ownShares, share)
}
allPubKeys = append(allPubKeys, share.ValidatorPubKey[:])
committeeSubnet := byte(commons.CommitteeSubnet(share.CommitteeID()))
if bytes.Contains(activeSubnets, []byte{committeeSubnet}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we have SharedSubnets and some other subnets utility functions. its best to use a function across the code to make changing and bug fixing easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SharedSubnets checks intersections between two subnet sets but we have a single value here, so I think it's not a good fit here. However, there was a bug in my implementation, I don't need bytes.Contains, I can just check the index

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I think we need to refactor how we work with subnets in future PRs: Create a type for it and add methods

@@ -1139,6 +1148,12 @@ func (c *controller) UpdateValidatorMetaDataLoop() {
if share.Liquidated {
return true
}

committeeSubnet := byte(commons.CommitteeSubnet(share.CommitteeID()))
if !bytes.Contains(activeSubnets, []byte{committeeSubnet}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor

@moshe-blox moshe-blox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🔥

fixed a bug and pushed: ac49a96

@moshe-blox
Copy link
Contributor

moshe-blox commented Nov 17, 2024

@MatheusFranco99 can you chime in in terms of protocol and networking, other than more well-spread messages, any reason why ignoring these messages will impact us badly?

@y0sher Do you mean ignoring messages from validators that we think are liquidated / not active / exited or whatever similar? I think it's not a problem to ignore instead of reject. We also have ignoring protection against an attack so we should be fine.

But we must not do this if you meant ignoring messages about validators that we are not part of the committee.

@nkryuchkov @y0sher

is this the case now in message validation with this PR?

if we receive a message with committee X for validator that's actually in committee Y (where both aren't our committees), would we reject it for invalid committee or ignore for missing metadata?

which check happens first?

# Conflicts:
#	network/p2p/p2p_setup.go
#	operator/validator/mocks/controller.go
Copy link
Contributor

@y0sher y0sher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check latest comment. we need to make sure we always update all our validators' non liquidated shares

Copy link
Contributor

@y0sher y0sher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. can you please reply my question

@y0sher y0sher self-requested a review December 17, 2024 12:25
Copy link
Contributor

@y0sher y0sher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gj.
please lets check the resource change for calling selfSubnets for every iteration of the metadata loop before merging this.

Copy link
Contributor

@moshe-blox moshe-blox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@moshe-blox moshe-blox changed the title validator: listen and get metadata only for committee validators feat(validator): listen and get metadata only for committee validators Dec 17, 2024
@moshe-blox moshe-blox merged commit 9eee883 into stage Dec 17, 2024
5 checks passed
@moshe-blox moshe-blox deleted the metadata-only-committee-validators branch December 17, 2024 14:12
@olegshmuelov
Copy link
Contributor

It seems there's an issue with the current implementation of this PR. While the logic correctly focuses on listening and getting metadata only for committee validators, it does not account for the exporter node. The exporter node requires metadata for all validators, not just the committee ones.

This is a critical oversight, as it may cause incomplete data in the exporter node's operations. We need to update the logic to ensure that the exporter node retrieves metadata for all validators.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants