v1.2.0: missed attestations due to insufficient peers #4660

Closed
twoeths opened this issue Oct 16, 2022 · 10 comments
Labels
prio-high Resolve issues as soon as possible. scope-networking All issues related to networking, gossip, and libp2p.

@twoeths
Contributor

twoeths commented Oct 16, 2022

Describe the bug

After we deployed v1.2.0-rc.1, an attestation was missed due to insufficient peers:

validator-2022-10-15.log:7224:Oct-15 03:57:26.860[]                error: Error publishing attestations slot=4915185 Internal Server Error: PublishError.InsufficientPeers

The number of peers we send to when publishing attestations is actually lower than with v1.1.0.

Screen Shot 2022-10-16 at 17 01 22

Expected behavior

  • Investigate why the sent peer count is so low
  • It should be the same as in v1.1.0
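
For context, the error in the log above is raised at publish time, and its name suggests the gossip layer found no peers to send the message to. The following is a minimal sketch of that condition under that assumption, not the actual libp2p-gossipsub or Lodestar code. The names `publishToTopic`, `PublishResult` and `allowZeroPeers` are hypothetical and exist only for illustration.

```typescript
// Hypothetical model of a publish call that fails when no peer can receive
// the message. None of these names come from Lodestar or libp2p-gossipsub.
type PeerId = string;

interface PublishResult {
  // Number of peers the message was sent to ("sent peers" in the dashboards).
  sentPeers: number;
}

function publishToTopic(
  topicPeers: ReadonlySet<PeerId>,
  allowZeroPeers = false
): PublishResult {
  if (topicPeers.size === 0 && !allowZeroPeers) {
    // Same message text as the error in the validator log above.
    throw new Error("PublishError.InsufficientPeers");
  }
  return {sentPeers: topicPeers.size};
}
```

Under this model, a drop in topic peers makes the error more likely, which is why the sent peer count is the first thing to investigate.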
@twoeths
Contributor Author

twoeths commented Oct 16, 2022

The number of topic peers has actually dropped.

  • on a mainnet node running v1.1.0, then v1.2.0-rc.1 (8 validators)

Screen Shot 2022-10-16 at 17 07 42

  • on a node with v1.1.0 only (8 validators)

Screen Shot 2022-10-16 at 17 08 17

@twoeths twoeths changed the title from "v1.2.0: Low sent peers count" to "v1.2.0: missed attestations due to inefficient peers" Oct 17, 2022
@twoeths
Contributor Author

twoeths commented Oct 17, 2022

The "Outgoing request error rate" is very high with v1.2.0

Screen Shot 2022-10-17 at 14 24 32

Two of them are legacy errors: "stream ended before 1 bytes became available" and "REQUEST_ERROR_EMPTY_RESPONSE"

~/beacon$ grep -e "stream ended before 1 bytes became available\|REQUEST_ERROR_EMPTY_RESPONSE" -rn beacon-2022-10-15.log | wc -l
26168
~/beacon$ grep -e "stream ended before 1 bytes became available\|REQUEST_ERROR_EMPTY_RESPONSE" -rn beacon-2022-10-16.log | wc -l
~/beacon$ grep -e "stream ended before 1 bytes became available\|REQUEST_ERROR_EMPTY_RESPONSE" -rn beacon-2022-10-14.log | wc -l
29500
~/beacon$ grep -e "stream ended before 1 bytes became available\|REQUEST_ERROR_EMPTY_RESPONSE" -rn beacon-2022-10-12.log | wc -l
288
~/beacon$ grep -e "stream ended before 1 bytes became available\|REQUEST_ERROR_EMPTY_RESPONSE" -rn beacon-2022-10-11.log | wc -l
746
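
The per-day counts above can also be reproduced programmatically. The following is a rough sketch that mirrors `grep ... | wc -l` by counting log lines containing either error string. The names `ERROR_PATTERNS` and `countMatchingLines` are hypothetical and not part of Lodestar.

```typescript
// The two error strings searched for in the grep commands above.
const ERROR_PATTERNS: readonly string[] = [
  "stream ended before 1 bytes became available",
  "REQUEST_ERROR_EMPTY_RESPONSE",
];

// Count lines that contain at least one pattern. A line containing both
// patterns is counted once, matching the behavior of `grep | wc -l`.
function countMatchingLines(logText: string, patterns: readonly string[]): number {
  let count = 0;
  for (const line of logText.split("\n")) {
    if (patterns.some((pattern) => line.includes(pattern))) {
      count++;
    }
  }
  return count;
}
```

Passing the contents of each daily log file to `countMatchingLines` would give one number per day, which makes it easy to spot the jump after the upgrade.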

@twoeths
Contributor Author

twoeths commented Oct 17, 2022

Since Lodestar peer scores are low, we send Goodbye messages to a lot of peers

Screen Shot 2022-10-17 at 14 40 50

@philknows philknows added the prio-high and scope-networking labels Oct 17, 2022
@twoeths twoeths changed the title from "v1.2.0: missed attestations due to inefficient peers" to "v1.2.0: missed attestations due to insufficient peers" Oct 18, 2022
@twoeths
Contributor Author

twoeths commented Oct 18, 2022

We also received a lot of Goodbye requests, for different reasons

Screen Shot 2022-10-18 at 09 21 10

@twoeths
Contributor Author

twoeths commented Oct 18, 2022

Our I/O was very busy, so other nodes disconnected us due to timeouts and we got penalized

Screen Shot 2022-10-18 at 20 33 17

@twoeths
Contributor Author

twoeths commented Nov 9, 2022

it's still an issue in v1.2.0:

Screen Shot 2022-11-09 at 15 18 36

@philknows philknows added this to the v1.3.0 milestone Nov 9, 2022
@twoeths
Contributor Author

twoeths commented Nov 11, 2022

I haven't seen this issue since we deployed v1.2.0 (at least on the 39-validator node). Also, the number of topic peers (sent peers) grew after the first 2 days

Screen Shot 2022-11-11 at 08 51 04

@twoeths twoeths closed this as completed Nov 11, 2022
@nazarhussain
Contributor

I am seeing a lot of these error messages on the unstable branch during the sim tests.

Eph 4/4 1.635[node-2-cl-lodestar/api] error: Error on submitPoolAttestations [0] slot=36, index=2 PublishError.InsufficientPeers
Error: PublishError.InsufficientPeers
    at EventTarget.publish (file:///home/runner/work/lodestar/lodestar/node_modules/@chainsafe/libp2p-gossipsub/src/index.ts:1957:13)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at EventTarget.publishObject (file:///home/runner/work/lodestar/lodestar/packages/beacon-node/src/network/gossip/gossipsub.ts:171:20)
    at EventTarget.publishBeaconAttestation (file:///home/runner/work/lodestar/lodestar/packages/beacon-node/src/network/gossip/gossipsub.ts:213:12)
    at file:///home/runner/work/lodestar/lodestar/packages/beacon-node/src/api/impl/beacon/pool/index.ts:52:31
    at async Promise.all (index 0)
    at Object.submitPoolAttestations (file:///home/runner/work/lodestar/lodestar/packages/beacon-node/src/api/impl/beacon/pool/index.ts:46:7)
    at Object.handler (file:///home/runner/work/lodestar/lodestar/packages/api/src/utils/server/genericJsonServer.ts:41:23)

@dapplion
Contributor

@tuyennhv It would be good to take a look at the sim test runs to understand why network health is bad

@twoeths
Contributor Author

twoeths commented Nov 14, 2022

@nazarhussain @dapplion I opened #4764

4 participants