prioritizePeers: ensure to prune to target peers #5217

Merged
merged 2 commits into unstable from tuyen/prune_to_target_peers
Mar 3, 2023

Conversation

twoeths
Contributor

@twoeths twoeths commented Feb 28, 2023

Motivation

  • In rare cases every connected peer has few long-lived subnets, a good score, and is not grouped too heavily on any one subnet; in that case Lodestar does not prune any peers
  • As a result libp2p does not accept any new connections, so we cannot improve the long-lived subnet coverage of our peers
  • This leads to missed attestations with an "InsufficientPeers" error, see Ensure to prune to target peers #5198 (comment)

Description

  • Always prune down to target peers
    • replace peerHasDuty with dutiesByPeer, which counts the dutied subnets of each peer (modified)
    • peersEligibleForPruning: sort peers by dutied subnets, then long-lived subnets, then score (modified)
    • prune peers with no long-lived subnets (no change)
    • prune low-score peers (no change)
    • prune peers that are too grouped on a subnet (no change)
    • prune more peers if there are still peers left to remove (new)
      • based on the sorted peers above
      • ignore peersEligibleForPruning in this step, since we want to end up with exactly targetPeers
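
The sort order and the new final pruning step can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `PeerInfo` shape and the `sortPeersToPrune` and `pruneToTarget` helpers are hypothetical names, and it assumes the least valuable peers (fewest dutied subnets, then fewest long-lived subnets, then lowest score) are pruned first.

```typescript
// Hypothetical, simplified peer record; not Lodestar's actual types.
interface PeerInfo {
  id: string;
  dutiedSubnets: number; // subnets we have duties on that this peer covers
  longLivedSubnets: number; // long-lived subnets the peer advertises
  score: number; // higher is better
}

// Sort ascending so the least valuable peers come first (assumed direction):
// fewest dutied subnets, then fewest long-lived subnets, then lowest score.
function sortPeersToPrune(peers: PeerInfo[]): PeerInfo[] {
  return [...peers].sort(
    (a, b) =>
      a.dutiedSubnets - b.dutiedSubnets ||
      a.longLivedSubnets - b.longLivedSubnets ||
      a.score - b.score
  );
}

// Final step: if the earlier pruning steps did not select enough peers,
// keep taking peers from the sorted list until we are down to targetPeers.
function pruneToTarget(
  peers: PeerInfo[],
  targetPeers: number,
  alreadyPruned: Set<string>
): string[] {
  const toPrune = new Set(alreadyPruned);
  const excess = peers.length - targetPeers;
  for (const peer of sortPeersToPrune(peers)) {
    if (toPrune.size >= excess) break;
    toPrune.add(peer.id); // no-op if an earlier step already selected this peer
  }
  return Array.from(toPrune);
}

// The "stuck" case from the motivation: no earlier step selected anyone.
const examplePeers: PeerInfo[] = [
  {id: "a", dutiedSubnets: 1, longLivedSubnets: 1, score: 0},
  {id: "b", dutiedSubnets: 0, longLivedSubnets: 1, score: 5},
  {id: "c", dutiedSubnets: 0, longLivedSubnets: 2, score: 0},
  {id: "d", dutiedSubnets: 0, longLivedSubnets: 1, score: 1},
];
console.log(pruneToTarget(examplePeers, 2, new Set<string>())); // [ 'd', 'b' ]
```

Peer "a" survives because it covers a dutied subnet, and "c" survives because it has more long-lived subnets than "b" and "d", even though its score is lower.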

Closes #5198

TODO

@github-actions
Contributor

github-actions bot commented Feb 28, 2023

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: cd70d45 Previous: e98d8e9 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 590.20 us/op 753.46 us/op 0.78
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 54.474 us/op 42.766 us/op 1.27
BLS verify - blst-native 1.2766 ms/op 1.1489 ms/op 1.11
BLS verifyMultipleSignatures 3 - blst-native 2.6066 ms/op 2.3387 ms/op 1.11
BLS verifyMultipleSignatures 8 - blst-native 5.6743 ms/op 5.0125 ms/op 1.13
BLS verifyMultipleSignatures 32 - blst-native 19.778 ms/op 18.154 ms/op 1.09
BLS aggregatePubkeys 32 - blst-native 27.282 us/op 24.455 us/op 1.12
BLS aggregatePubkeys 128 - blst-native 104.87 us/op 100.32 us/op 1.05
getAttestationsForBlock 80.361 ms/op 52.030 ms/op 1.54
isKnown best case - 1 super set check 306.00 ns/op 267.00 ns/op 1.15
isKnown normal case - 2 super set checks 299.00 ns/op 251.00 ns/op 1.19
isKnown worse case - 16 super set checks 294.00 ns/op 251.00 ns/op 1.17
CheckpointStateCache - add get delete 6.5740 us/op 5.0790 us/op 1.29
validate gossip signedAggregateAndProof - struct 2.8937 ms/op 2.7351 ms/op 1.06
validate gossip attestation - struct 1.3837 ms/op 1.3104 ms/op 1.06
pickEth1Vote - no votes 1.5449 ms/op 1.2925 ms/op 1.20
pickEth1Vote - max votes 15.210 ms/op 9.3112 ms/op 1.63
pickEth1Vote - Eth1Data hashTreeRoot value x2048 10.280 ms/op 8.5506 ms/op 1.20
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 18.552 ms/op 14.632 ms/op 1.27
pickEth1Vote - Eth1Data fastSerialize value x2048 875.20 us/op 646.63 us/op 1.35
pickEth1Vote - Eth1Data fastSerialize tree x2048 8.9870 ms/op 6.9295 ms/op 1.30
bytes32 toHexString 818.00 ns/op 475.00 ns/op 1.72
bytes32 Buffer.toString(hex) 449.00 ns/op 328.00 ns/op 1.37
bytes32 Buffer.toString(hex) from Uint8Array 686.00 ns/op 529.00 ns/op 1.30
bytes32 Buffer.toString(hex) + 0x 477.00 ns/op 324.00 ns/op 1.47
Object access 1 prop 0.22200 ns/op 0.15700 ns/op 1.41
Map access 1 prop 0.19500 ns/op 0.14800 ns/op 1.32
Object get x1000 7.5710 ns/op 6.2980 ns/op 1.20
Map get x1000 0.62500 ns/op 0.59100 ns/op 1.06
Object set x1000 73.679 ns/op 49.460 ns/op 1.49
Map set x1000 56.918 ns/op 40.899 ns/op 1.39
Return object 10000 times 0.24930 ns/op 0.22920 ns/op 1.09
Throw Error 10000 times 4.3107 us/op 3.9771 us/op 1.08
fastMsgIdFn sha256 / 200 bytes 3.6820 us/op 3.2710 us/op 1.13
fastMsgIdFn h32 xxhash / 200 bytes 334.00 ns/op 271.00 ns/op 1.23
fastMsgIdFn h64 xxhash / 200 bytes 490.00 ns/op 365.00 ns/op 1.34
fastMsgIdFn sha256 / 1000 bytes 12.210 us/op 11.216 us/op 1.09
fastMsgIdFn h32 xxhash / 1000 bytes 465.00 ns/op 398.00 ns/op 1.17
fastMsgIdFn h64 xxhash / 1000 bytes 559.00 ns/op 438.00 ns/op 1.28
fastMsgIdFn sha256 / 10000 bytes 107.23 us/op 100.72 us/op 1.06
fastMsgIdFn h32 xxhash / 10000 bytes 2.0330 us/op 1.8580 us/op 1.09
fastMsgIdFn h64 xxhash / 10000 bytes 1.4950 us/op 1.3180 us/op 1.13
enrSubnets - fastDeserialize 64 bits 1.7540 us/op 1.2410 us/op 1.41
enrSubnets - ssz BitVector 64 bits 639.00 ns/op 471.00 ns/op 1.36
enrSubnets - fastDeserialize 4 bits 231.00 ns/op 166.00 ns/op 1.39
enrSubnets - ssz BitVector 4 bits 655.00 ns/op 472.00 ns/op 1.39
prioritizePeers score -10:0 att 32-0.1 sync 2-0 131.46 us/op 97.154 us/op 1.35
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 166.74 us/op 121.27 us/op 1.37
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 209.18 us/op 170.61 us/op 1.23
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 374.14 us/op 303.70 us/op 1.23
prioritizePeers score 0:0 att 64-1 sync 4-1 451.56 us/op 365.47 us/op 1.24
array of 16000 items push then shift 1.8075 us/op 1.6120 us/op 1.12
LinkedList of 16000 items push then shift 10.538 ns/op 8.8270 ns/op 1.19
array of 16000 items push then pop 129.83 ns/op 72.764 ns/op 1.78
LinkedList of 16000 items push then pop 10.038 ns/op 8.5720 ns/op 1.17
array of 24000 items push then shift 2.4977 us/op 2.2301 us/op 1.12
LinkedList of 24000 items push then shift 9.8630 ns/op 8.7860 ns/op 1.12
array of 24000 items push then pop 86.100 ns/op 72.230 ns/op 1.19
LinkedList of 24000 items push then pop 9.0430 ns/op 8.0330 ns/op 1.13
intersect bitArray bitLen 8 13.749 ns/op 12.637 ns/op 1.09
intersect array and set length 8 84.011 ns/op 73.378 ns/op 1.14
intersect bitArray bitLen 128 45.182 ns/op 41.953 ns/op 1.08
intersect array and set length 128 1.2737 us/op 1.0048 us/op 1.27
Buffer.concat 32 items 3.1680 us/op 2.8370 us/op 1.12
Uint8Array.set 32 items 2.3550 us/op 2.3860 us/op 0.99
pass gossip attestations to forkchoice per slot 2.5394 ms/op 2.3172 ms/op 1.10
computeDeltas 3.2121 ms/op 3.4715 ms/op 0.93
computeProposerBoostScoreFromBalances 1.8783 ms/op 1.7831 ms/op 1.05
altair processAttestation - 250000 vs - 7PWei normalcase 4.1711 ms/op 2.1046 ms/op 1.98
altair processAttestation - 250000 vs - 7PWei worstcase 5.3292 ms/op 3.2812 ms/op 1.62
altair processAttestation - setStatus - 1/6 committees join 161.01 us/op 137.36 us/op 1.17
altair processAttestation - setStatus - 1/3 committees join 319.03 us/op 267.17 us/op 1.19
altair processAttestation - setStatus - 1/2 committees join 389.30 us/op 358.37 us/op 1.09
altair processAttestation - setStatus - 2/3 committees join 483.20 us/op 453.84 us/op 1.06
altair processAttestation - setStatus - 4/5 committees join 701.04 us/op 646.00 us/op 1.09
altair processAttestation - setStatus - 100% committees join 817.44 us/op 761.09 us/op 1.07
altair processBlock - 250000 vs - 7PWei normalcase 22.316 ms/op 18.713 ms/op 1.19
altair processBlock - 250000 vs - 7PWei normalcase hashState 28.960 ms/op 25.798 ms/op 1.12
altair processBlock - 250000 vs - 7PWei worstcase 55.515 ms/op 54.210 ms/op 1.02
altair processBlock - 250000 vs - 7PWei worstcase hashState 76.298 ms/op 66.407 ms/op 1.15
phase0 processBlock - 250000 vs - 7PWei normalcase 2.8365 ms/op 2.0918 ms/op 1.36
phase0 processBlock - 250000 vs - 7PWei worstcase 34.301 ms/op 28.028 ms/op 1.22
altair processEth1Data - 250000 vs - 7PWei normalcase 806.10 us/op 449.83 us/op 1.79
vc - 250000 eb 1 eth1 1 we 0 wn 0 - smpl 15 10.910 us/op 6.7190 us/op 1.62
vc - 250000 eb 0.95 eth1 0.1 we 0.05 wn 0 - smpl 219 31.900 us/op 19.707 us/op 1.62
vc - 250000 eb 0.95 eth1 0.3 we 0.05 wn 0 - smpl 42 15.008 us/op 8.4620 us/op 1.77
vc - 250000 eb 0.95 eth1 0.7 we 0.05 wn 0 - smpl 18 12.368 us/op 6.4720 us/op 1.91
vc - 250000 eb 0.1 eth1 0.1 we 0 wn 0 - smpl 1020 131.59 us/op 74.985 us/op 1.75
vc - 250000 eb 0.03 eth1 0.03 we 0 wn 0 - smpl 11777 807.60 us/op 619.85 us/op 1.30
vc - 250000 eb 0.01 eth1 0.01 we 0 wn 0 - smpl 16384 948.66 us/op 890.55 us/op 1.07
vc - 250000 eb 0 eth1 0 we 0 wn 0 - smpl 16384 992.33 us/op 869.00 us/op 1.14
vc - 250000 eb 0 eth1 0 we 0 wn 0 nocache - smpl 16384 2.4860 ms/op 2.2908 ms/op 1.09
vc - 250000 eb 0 eth1 1 we 0 wn 0 - smpl 16384 1.8222 ms/op 1.7151 ms/op 1.06
vc - 250000 eb 0 eth1 1 we 0 wn 0 nocache - smpl 16384 4.2680 ms/op 3.8801 ms/op 1.10
Tree 40 250000 create 352.20 ms/op 294.73 ms/op 1.19
Tree 40 250000 get(125000) 209.29 ns/op 169.86 ns/op 1.23
Tree 40 250000 set(125000) 1.4177 us/op 892.91 ns/op 1.59
Tree 40 250000 toArray() 22.055 ms/op 16.725 ms/op 1.32
Tree 40 250000 iterate all - toArray() + loop 22.239 ms/op 16.499 ms/op 1.35
Tree 40 250000 iterate all - get(i) 76.828 ms/op 64.422 ms/op 1.19
MutableVector 250000 create 11.640 ms/op 9.3590 ms/op 1.24
MutableVector 250000 get(125000) 6.9110 ns/op 5.9970 ns/op 1.15
MutableVector 250000 set(125000) 270.51 ns/op 233.81 ns/op 1.16
MutableVector 250000 toArray() 3.6981 ms/op 2.5739 ms/op 1.44
MutableVector 250000 iterate all - toArray() + loop 3.9153 ms/op 2.6702 ms/op 1.47
MutableVector 250000 iterate all - get(i) 1.5730 ms/op 1.4444 ms/op 1.09
Array 250000 create 3.5080 ms/op 2.3960 ms/op 1.46
Array 250000 clone - spread 1.2896 ms/op 1.1177 ms/op 1.15
Array 250000 get(125000) 0.63500 ns/op 0.52900 ns/op 1.20
Array 250000 set(125000) 0.73400 ns/op 0.61800 ns/op 1.19
Array 250000 iterate all - loop 114.38 us/op 101.69 us/op 1.12
effectiveBalanceIncrements clone Uint8Array 300000 40.285 us/op 27.626 us/op 1.46
effectiveBalanceIncrements clone MutableVector 300000 407.00 ns/op 365.00 ns/op 1.12
effectiveBalanceIncrements rw all Uint8Array 300000 172.42 us/op 167.47 us/op 1.03
effectiveBalanceIncrements rw all MutableVector 300000 97.874 ms/op 80.513 ms/op 1.22
phase0 afterProcessEpoch - 250000 vs - 7PWei 120.94 ms/op 111.48 ms/op 1.08
phase0 beforeProcessEpoch - 250000 vs - 7PWei 44.711 ms/op 32.354 ms/op 1.38
altair processEpoch - mainnet_e81889 363.32 ms/op 326.78 ms/op 1.11
mainnet_e81889 - altair beforeProcessEpoch 77.202 ms/op 64.953 ms/op 1.19
mainnet_e81889 - altair processJustificationAndFinalization 23.141 us/op 18.120 us/op 1.28
mainnet_e81889 - altair processInactivityUpdates 6.6389 ms/op 5.2571 ms/op 1.26
mainnet_e81889 - altair processRewardsAndPenalties 55.888 ms/op 49.621 ms/op 1.13
mainnet_e81889 - altair processRegistryUpdates 4.1120 us/op 2.7820 us/op 1.48
mainnet_e81889 - altair processSlashings 696.00 ns/op 593.00 ns/op 1.17
mainnet_e81889 - altair processEth1DataReset 958.00 ns/op 560.00 ns/op 1.71
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.3679 ms/op 1.2161 ms/op 1.12
mainnet_e81889 - altair processSlashingsReset 12.047 us/op 4.3980 us/op 2.74
mainnet_e81889 - altair processRandaoMixesReset 8.3590 us/op 4.2860 us/op 1.95
mainnet_e81889 - altair processHistoricalRootsUpdate 1.1540 us/op 688.00 ns/op 1.68
mainnet_e81889 - altair processParticipationFlagUpdates 3.4530 us/op 2.7800 us/op 1.24
mainnet_e81889 - altair processSyncCommitteeUpdates 929.00 ns/op 451.00 ns/op 2.06
mainnet_e81889 - altair afterProcessEpoch 140.43 ms/op 126.64 ms/op 1.11
phase0 processEpoch - mainnet_e58758 385.26 ms/op 361.87 ms/op 1.06
mainnet_e58758 - phase0 beforeProcessEpoch 160.68 ms/op 139.18 ms/op 1.15
mainnet_e58758 - phase0 processJustificationAndFinalization 24.592 us/op 17.011 us/op 1.45
mainnet_e58758 - phase0 processRewardsAndPenalties 74.543 ms/op 63.863 ms/op 1.17
mainnet_e58758 - phase0 processRegistryUpdates 12.395 us/op 7.5350 us/op 1.64
mainnet_e58758 - phase0 processSlashings 846.00 ns/op 473.00 ns/op 1.79
mainnet_e58758 - phase0 processEth1DataReset 1.0790 us/op 488.00 ns/op 2.21
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 3.9534 ms/op 1.0069 ms/op 3.93
mainnet_e58758 - phase0 processSlashingsReset 4.1010 us/op 3.7790 us/op 1.09
mainnet_e58758 - phase0 processRandaoMixesReset 8.6890 us/op 4.2450 us/op 2.05
mainnet_e58758 - phase0 processHistoricalRootsUpdate 941.00 ns/op 571.00 ns/op 1.65
mainnet_e58758 - phase0 processParticipationRecordUpdates 4.3630 us/op 4.1960 us/op 1.04
mainnet_e58758 - phase0 afterProcessEpoch 102.56 ms/op 97.739 ms/op 1.05
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.2855 ms/op 1.2365 ms/op 1.04
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.5697 ms/op 1.4971 ms/op 1.05
altair processInactivityUpdates - 250000 normalcase 20.685 ms/op 25.513 ms/op 0.81
altair processInactivityUpdates - 250000 worstcase 20.535 ms/op 26.777 ms/op 0.77
phase0 processRegistryUpdates - 250000 normalcase 6.8510 us/op 6.8840 us/op 1.00
phase0 processRegistryUpdates - 250000 badcase_full_deposits 275.21 us/op 242.63 us/op 1.13
phase0 processRegistryUpdates - 250000 worstcase 0.5 131.50 ms/op 131.97 ms/op 1.00
altair processRewardsAndPenalties - 250000 normalcase 65.476 ms/op 69.117 ms/op 0.95
altair processRewardsAndPenalties - 250000 worstcase 71.173 ms/op 65.456 ms/op 1.09
phase0 getAttestationDeltas - 250000 normalcase 7.0551 ms/op 6.5601 ms/op 1.08
phase0 getAttestationDeltas - 250000 worstcase 6.9745 ms/op 6.6288 ms/op 1.05
phase0 processSlashings - 250000 worstcase 3.7041 ms/op 3.4712 ms/op 1.07
altair processSyncCommitteeUpdates - 250000 190.03 ms/op 175.71 ms/op 1.08
BeaconState.hashTreeRoot - No change 372.00 ns/op 367.00 ns/op 1.01
BeaconState.hashTreeRoot - 1 full validator 53.905 us/op 50.756 us/op 1.06
BeaconState.hashTreeRoot - 32 full validator 507.27 us/op 504.76 us/op 1.00
BeaconState.hashTreeRoot - 512 full validator 6.0460 ms/op 5.7373 ms/op 1.05
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 65.710 us/op 66.999 us/op 0.98
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 952.05 us/op 892.52 us/op 1.07
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 12.022 ms/op 11.572 ms/op 1.04
BeaconState.hashTreeRoot - 1 balances 52.720 us/op 51.709 us/op 1.02
BeaconState.hashTreeRoot - 32 balances 472.65 us/op 445.28 us/op 1.06
BeaconState.hashTreeRoot - 512 balances 4.6074 ms/op 4.5024 ms/op 1.02
BeaconState.hashTreeRoot - 250000 balances 74.576 ms/op 80.072 ms/op 0.93
aggregationBits - 2048 els - zipIndexesInBitList 16.712 us/op 15.050 us/op 1.11
regular array get 100000 times 35.967 us/op 32.956 us/op 1.09
wrappedArray get 100000 times 33.842 us/op 32.902 us/op 1.03
arrayWithProxy get 100000 times 16.293 ms/op 15.166 ms/op 1.07
ssz.Root.equals 565.00 ns/op 547.00 ns/op 1.03
byteArrayEquals 554.00 ns/op 540.00 ns/op 1.03
shuffle list - 16384 els 6.8859 ms/op 6.8054 ms/op 1.01
shuffle list - 250000 els 103.12 ms/op 100.32 ms/op 1.03
processSlot - 1 slots 9.3880 us/op 9.2240 us/op 1.02
processSlot - 32 slots 1.3809 ms/op 1.2822 ms/op 1.08
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 194.84 us/op 192.38 us/op 1.01
getCommitteeAssignments - req 1 vs - 250000 vc 2.9531 ms/op 2.9152 ms/op 1.01
getCommitteeAssignments - req 100 vs - 250000 vc 4.1590 ms/op 4.1676 ms/op 1.00
getCommitteeAssignments - req 1000 vs - 250000 vc 4.4820 ms/op 4.4766 ms/op 1.00
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 5.1100 ns/op 4.7800 ns/op 1.07
state getBlockRootAtSlot - 250000 vs - 7PWei 848.28 ns/op 702.36 ns/op 1.21
computeProposers - vc 250000 10.800 ms/op 10.264 ms/op 1.05
computeEpochShuffling - vc 250000 103.70 ms/op 100.95 ms/op 1.03
getNextSyncCommittee - vc 250000 187.09 ms/op 166.68 ms/op 1.12

by benchmarkbot/action

@dapplion
Contributor

dapplion commented Mar 1, 2023

@tuyennhv can you rebase on latest unstable?

@philknows philknows added this to the v1.6.0 milestone Mar 1, 2023
@twoeths twoeths force-pushed the tuyen/prune_to_target_peers branch from 1a97712 to 1486489 Compare March 2, 2023 02:13
@twoeths
Contributor Author

twoeths commented Mar 3, 2023

Metrics on feat1 are as expected: peers are frequently cut down to 50

Screen Shot 2023-03-03 at 09 01 18

Sometimes we disconnect peers with the new reason "find better peers"

Screen Shot 2023-03-03 at 09 05 12

All other metrics are unchanged

@twoeths twoeths marked this pull request as ready for review March 3, 2023 02:07
@twoeths twoeths requested a review from a team as a code owner March 3, 2023 02:07
Contributor

@dapplion dapplion left a comment

Looks good! Adding a distinct disconnect reason to track in metrics is a good idea. Let's keep an eye on how often nodes now get stuck with bad peers.

@dapplion dapplion merged commit a4434f7 into unstable Mar 3, 2023
@dapplion dapplion deleted the tuyen/prune_to_target_peers branch March 3, 2023 02:18
@wemeetagain
Member

🎉 This PR is included in v1.6.0 🎉
