Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use nim-websock instead of news #4061

Merged
merged 9 commits into from
Sep 6, 2022
Merged

use nim-websock instead of news #4061

merged 9 commits into from
Sep 6, 2022

Conversation

etan-status
Copy link
Contributor

news has a few open issues that are not present in nim-websock:

  1. There is a 1 second delay between each MB of sent data.
  2. Cancelling an ongoing send makes the entire WebSocket unusable.
  3. Control packets do not have priority over ongoing message frames.

Using news, there are quite a few of these messages in Geth:

Previously seen beacon client is offline. Please ensure it is
operational to follow the chain!

It may take quite some time to reconnect when this happens.

Using nim-websock, this message still occurs because eth1_monitor
reconnects the EL connection when no new blocks occurred for 5 minutes,
but reconnecting is quick and the message is rarer.

`news` has a few open issues that are not present in `nim-websock`:
1. There is a 1 second delay between each MB of sent data.
2. Cancelling an ongoing `send` makes the entire WebSocket unusable.
3. Control packets do not have priority over ongoing message frames.

Using `news`, there are quite a few of these messages in Geth:
```
Previously seen beacon client is offline. Please ensure it is
operational to follow the chain!
```
It may take quite some time to reconnect when this happens.

Using `nim-websock`, this message still occurs because `eth1_monitor`
reconnects the EL connection when no new blocks occurred for 5 minutes,
but reconnecting is quick and the message is rarer.
@etan-status
Copy link
Contributor Author

etan-status commented Sep 2, 2022

Manual sepolia genesis sync ✅

Notably, there was not a single logged Transport is not initialised (missing a call to connect?) on the Nimbus side.

@github-actions
Copy link

github-actions bot commented Sep 2, 2022

Unit Test Results

       12 files  ±0       860 suites  ±0   1h 10m 38s ⏱️ + 12m 4s
  1 982 tests ±0    1 835 ✔️ ±0  147 💤 ±0  0 ±0 
10 662 runs  ±0  10 472 ✔️ ±0  190 💤 ±0  0 ±0 

Results for commit 4c6eb2d. ± Comparison against base commit f33a241.

♻️ This comment has been updated with latest results.

@etan-status
Copy link
Contributor Author

Manual Goerli tests around head:

  • No Transport is not initialised (missing a call to connect?) on Nimbus side
  • A few Previously seen beacon client is offline, delays to next log message (assuming reconnected):
    • 3 seconds
    • 17 seconds
    • 4 seconds
    • 2 seconds
    • 4 seconds
    • 16 seconds
    • 3 seconds
    • 2 seconds
    • 2 seconds
    • 1 second
    • 2 seconds
    • 7 seconds
    • 3 seconds
    • 5 seconds
    • 2 seconds
    • 3 seconds
    • 7 seconds
    • 3 seconds
    • 4 seconds
    • 4 seconds
    • 2 seconds
    • 2 seconds
    • 7 seconds
    • 2 seconds
    • 4 seconds
    • 3 seconds

Will run it for longer to better understand the nature of those reconnects.

@etan-status
Copy link
Contributor Author

Generally, Geth's Previously seen beacon client is offline does not seem to indicate connection failure. The message appears even though no new WebSocket connection is established, so I suspect that it can be spurious.

Screenshot 2022-09-03 at 12 30 02

@etan-status
Copy link
Contributor Author

Example how it looks on Geth side — notably, no block was skipped

Screenshot 2022-09-03 at 13 06 21

@etan-status
Copy link
Contributor Author

etan-status commented Sep 4, 2022

There was a single disconnect now, due to one block being received too early, and then failing to obtain it for 5 minutes. During this, Geth did not make progress for 5 minutes, so eth1_monitor noticed that there were no new blocks and reconnected the WebSocket. This is unrelated to this PR, but still worth noting. nim-websock still seems perfectly stable.

Block 3814140 was received on 2022-09-04 09:47:59.460+02:00 but the expected time is Sep 04 2022 09:48:00 GMT+0200, so it was ignored as being received more than 0.5 seconds (MAXIMUM_GOSSIP_CLOCK_DISPARITY) early.

DBG 2022-09-04 09:47:48.905+02:00 newPayload: succeeded                      parentHash=310015e7 blockHash=dde18504 blockNumber=7527436 payloadStatus=VALID

DBG 2022-09-04 09:47:59.460+02:00 Dropping block                             topics="beacnde" error="(Ignore, \"BeaconBlock: slot too high\")" blockRoot=c6afbd53 wallSlot=3814139 signature=924b5d1c blck="(slot: 3814140, proposer_index: 345415, parent_root: \"a89eb79f\", state_root: \"29afc597\", eth1data: (deposit_root: b136cac77fba715ed1cff147d8d7b8835d1f2dab1700c875845c38e4987422a4, deposit_count: 179385, block_hash: d9ce6b9dc0ffb3d621cd6af1a1eead76762d5861fb44a3a04a2a3907ccb76186), graffiti: \"grand rising\", proposer_slashings_len: 0, attester_slashings_len: 0, attestations_len: 128, deposits_len: 0, voluntary_exits_len: 0, sync_committee_participants: 0)"

DBG 2022-09-04 09:48:03.659+02:00 Dropping aggregate                         topics="gossip_eth2" error="(Ignore, \"Attestation block unknown\")" aggregate="(aggregation_bits: 0b00101100010110011001000001011101101011011110010100011100100000111111110010001110101011011000001110001111011000000010101110100100100000000100001110101100010011110000100101100110001001001, data: (slot: 3814140, index: 1, beacon_block_root: \"c6afbd53\", source: \"119190:0b7f69a5\", target: \"119191:8772c8ee\"), signature: \"b26a0994\")" wallSlot=3814140 aggregator_index=343870 signature=824741e9 selection_proof=93bd8a6e

DBG 2022-09-04 09:48:04.494+02:00 Requesting detected missing blocks         topics="beacnde" blocks=[c6afbd53]
DBG 2022-09-04 09:48:04.494+02:00 Requesting blocks by root                  topics="requman" peer=16U*umXRbm blocks=[c6afbd53] peer_score=950
DBG 2022-09-04 09:48:04.509+02:00 Requesting blocks by root                  topics="requman" peer=16U*EiXVaj blocks=[c6afbd53] peer_score=1000

DBG 2022-09-04 09:48:11.570+02:00 Requesting detected missing blocks         topics="beacnde" blocks=[c6afbd53]

DBG 2022-09-04 09:48:13.010+02:00 Dropping block                             topics="beacnde" error="(Ignore, \"BeaconBlock: Parent not found\")" blockRoot=d85f4cce wallSlot=3814141 signature=90e497e9 blck="(slot: 3814141, proposer_index: 314363, parent_root: \"c6afbd53\", state_root: \"e2cdd885\", eth1data: (deposit_root: b136cac77fba715ed1cff147d8d7b8835d1f2dab1700c875845c38e4987422a4, deposit_count: 179385, block_hash: d9ce6b9dc0ffb3d621cd6af1a1eead76762d5861fb44a3a04a2a3907ccb76186), graffiti: \"Lighthouse/v3.1.0-aa022f4\", proposer_slashings_len: 0, attester_slashings_len: 0, attestations_len: 128, deposits_len: 0, voluntary_exits_len: 0, sync_committee_participants: 443)"

DBG 2022-09-04 09:48:14.512+02:00 Request manager tick                       topics="requman" blocks_count=1 succeed=2 failed=0 queue_size=3 sync_speed=0.09983028850953379
DBG 2022-09-04 09:48:14.512+02:00 Requesting blocks by root                  topics="requman" peer=16U*umXRbm blocks="[c6afbd53, c6afbd53, c6afbd53]" peer_score=850
DBG 2022-09-04 09:48:14.512+02:00 Requesting blocks by root                  topics="requman" peer=16U*EiXVaj blocks="[c6afbd53, c6afbd53, c6afbd53]" peer_score=1000

DBG 2022-09-04 09:48:24.210+02:00 Dropping block                             topics="beacnde" error="(Ignore, \"BeaconBlock: Parent not found\")" blockRoot=3392e160 wallSlot=3814142 signature=ab8b42d6 blck="(slot: 3814142, proposer_index: 176693, parent_root: \"d85f4cce\", state_root: \"c684f1e2\", eth1data: (deposit_root: b136cac77fba715ed1cff147d8d7b8835d1f2dab1700c875845c38e4987422a4, deposit_count: 179385, block_hash: d9ce6b9dc0ffb3d621cd6af1a1eead76762d5861fb44a3a04a2a3907ccb76186), graffiti: \"Nimbus/v22.8.2-b7e4d1-stateofus\", proposer_slashings_len: 0, attester_slashings_len: 0, attestations_len: 128, deposits_len: 0, voluntary_exits_len: 0, sync_committee_participants: 442)"

DBG 2022-09-04 09:48:27.854+02:00 LC optimistic update received              topics="lightcl" optimistic_update="(attested: (slot: 3814141, proposer_index: 314363, parent_root: \"c6afbd53\", state_root: \"e2cdd885\"), num_active_participants: 442, signature_slot: 3814142)"

WARN [09-04|09:48:22.862] Previously seen beacon client is offline. Please ensure it is operational to follow the chain! 

WRN 2022-09-04 09:52:49.375+02:00 Eth1 chain monitoring failure, restarting  topics="eth1" err="No eth1 chain progress for too long"

INF 2022-09-04 09:52:54.384+02:00 Starting Eth1 deposit contract monitoring  topics="eth1" contract=0xff50ed3d0ec03ac01d4c79aad74928bff48a7b2b

WARN [09-04|09:53:22.938] Beacon client online, but no consensus updates received in a while. Please fix your beacon client to follow the chain! 

DBG 2022-09-04 09:54:21.411+02:00 newPayload: inserting block into execution engine parentHash=5a0e84bb blockHash=e036230b stateRoot=dca0582e receiptsRoot=50ea236e prevRandao=35b575cf blockNumber=7527464 gasLimit=30000000 gasUsed=3840195 timestamp=1662278052 extraDataLen=0 baseFeePerGas=8 numTransactions=44

DBG 2022-09-04 09:54:26.582+02:00 Block resolved                             blockRoot=c6afbd53 blck="(slot: 3814140, proposer_index: 345415, parent_root: \"a89eb79f\", state_root: \"29afc597\", eth1data: (deposit_root: b136cac77fba715ed1cff147d8d7b8835d1f2dab1700c875845c38e4987422a4, deposit_count: 179385, block_hash: d9ce6b9dc0ffb3d621cd6af1a1eead76762d5861fb44a3a04a2a3907ccb76186), graffiti: \"grand rising\", proposer_slashings_len: 0, attester_slashings_len: 0, attestations_len: 128, deposits_len: 0, voluntary_exits_len: 0, sync_committee_participants: 0)" blockVerified=true heads=2 stateDataDur=336us259ns sigVerifyDur=26ms468us887ns stateVerifyDur=176ms813us771ns putBlockDur=2ms375us909ns epochRefDur=6us343ns

DBG 2022-09-04 09:54:26.664+02:00 Updated head block                         topics="chaindag" stateRoot=29afc597 justified=119190:0b7f69a5 finalized=119189:448b945c isOptHead=false newHead=c6afbd53:3814140 lastHead=a89eb79f:3814139

DBG 2022-09-04 09:54:26.952+02:00 Block resolved                             blockRoot=d85f4cce blck="(slot: 3814141, proposer_index: 314363, parent_root: \"c6afbd53\", state_root: \"e2cdd885\", eth1data: (deposit_root: b136cac77fba715ed1cff147d8d7b8835d1f2dab1700c875845c38e4987422a4, deposit_count: 179385, block_hash: d9ce6b9dc0ffb3d621cd6af1a1eead76762d5861fb44a3a04a2a3907ccb76186), graffiti: \"Lighthouse/v3.1.0-aa022f4\", proposer_slashings_len: 0, attester_slashings_len: 0, attestations_len: 128, deposits_len: 0, voluntary_exits_len: 0, sync_committee_participants: 443)" blockVerified=true heads=2 stateDataDur=9ms532us171ns sigVerifyDur=46ms861us998ns stateVerifyDur=178ms249us343ns putBlockDur=297us882ns epochRefDur=6us169ns

@etan-status
Copy link
Contributor Author

sepolia genesis sync done once more on the latest commit. no disconnects after EL had state downloaded. Before that, disconnects were induced by eth1_monitor, so not a websocket library issue.

@etan-status
Copy link
Contributor Author

Prolonged goerli test once more with no disconnects to the EL.

@etan-status etan-status enabled auto-merge (squash) September 6, 2022 09:10
@etan-status etan-status merged commit 634408f into unstable Sep 6, 2022
@etan-status etan-status deleted the dev/etan/rc-websock branch September 6, 2022 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants