"State heal in progress" after sync forever #1198

DaveWK · 2022-11-22T15:59:37Z

System information

Geth version: 1.17
OS & Version: Linux, Fedora 37

Expected behaviour

The node finishes syncing and RPC is usable.

Actual behaviour

The node never seems to finish syncing; the logs keep repeating:

t=2022-11-22T15:57:45+0000 lvl=info msg="State heal in progress"                 accounts=10,522,[email protected] slots=22,809,[email protected]      [email protected]      nodes=135,122,[email protected] pending=61470

Steps to reproduce the behaviour

The node is an AWS c6a.8xlarge with plenty of disk IO and space.

forcodedancing · 2022-11-23T03:25:07Z

@DaveWK Are you syncing for the first time? Are you using a snapshot from https://github.com/bnb-chain/bsc-snapshots ?

DaveWK · 2022-11-23T03:42:06Z

Synced from genesis using --syncmode=snap; it has been syncing for 4 days. In the logs I see:

t=2022-11-23T03:38:10+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="500.329µs" number=23,293,067 hash=0x621e9128bad96d6db5bc8e5682abb77f3d5156193a1ffdb327bdfe65ddc65c23
t=2022-11-23T03:38:13+0000 lvl=info msg="State heal in progress"                 accounts=736,[email protected] slots=1,730,[email protected] [email protected] nodes=151,785,[email protected] pending=51447
t=2022-11-23T03:38:13+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="617.301µs" number=23,293,068 hash=0x3f71527b518f67ab546b5299a6b5ce3d88c7889fc822cce01e123b7f914473dc
t=2022-11-23T03:38:17+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="493.629µs" number=23,293,069 hash=0x17e9497ba49b0bef872fda370f1c086a4c14a4ed4bc91c4e7bef2ed27bc29d15
t=2022-11-23T03:38:20+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="502.819µs" number=23,293,070 hash=0x083121f3f33df93604b401d3c8309737c99d0336492f0f9a38dc7c1e1303fe22
t=2022-11-23T03:38:21+0000 lvl=info msg="State heal in progress"                 accounts=736,[email protected] slots=1,731,[email protected] [email protected] nodes=151,790,[email protected] pending=49316
t=2022-11-23T03:38:23+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="621.872µs" number=23,293,071 hash=0x373afe1bc7af546b20c223e6c2e08c172dc7d3558628e8c518d683494e7a14f1
t=2022-11-23T03:38:26+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="650.561µs" number=23,293,072 hash=0x9fd4b29e214c6eb26d2cc1d787047b6660829218f504fa46eb572b58d049d1e0
t=2022-11-23T03:38:27+0000 lvl=info msg="Downloader queue stats"                 receiptTasks=0 blockTasks=0 itemSize="148.44 KiB" throttle=1766
t=2022-11-23T03:38:29+0000 lvl=info msg="State heal in progress"                 accounts=737,[email protected] slots=1,733,[email protected] [email protected] nodes=151,795,[email protected] pending=47242
t=2022-11-23T03:38:29+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="738.533µs" number=23,293,073 hash=0x184d87f3c62f9a19167ccb6105b61f658e8c6cf287d5a8fbbdae39a2d15d4324
t=2022-11-23T03:38:30+0000 lvl=warn msg="Pivot became stale, moving"             old=23,292,947 new=23,293,011
t=2022-11-23T03:38:30+0000 lvl=info msg="Imported new block receipts"            count=64  elapsed=87.428ms    number=23,293,010 hash=0x7d43a4f453525c0ea0a32d25aa923538443e0c228991a8594ab095b3c317e677 age=3m17s    size="5.22 MiB"
t=2022-11-23T03:38:30+0000 lvl=info msg="State heal in progress"                 accounts=737,[email protected] slots=1,733,[email protected] [email protected] nodes=151,795,[email protected] pending=47156
t=2022-11-23T03:38:32+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="662.783µs" number=23,293,074 hash=0x4b9d967ce1a722bce0fc9e3972fa334b2b5148a958b2bf569d1a51ad8ea590b5
t=2022-11-23T03:38:36+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="528.669µs" number=23,293,075 hash=0xb45ce20377912c88a2efe944129667d4dfdda0741f32f9edfddbfd82e43db5e2
t=2022-11-23T03:38:39+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="525.979µs" number=23,293,076 hash=0xe3d9fb0330ecd59a9a1dd0c3112636f68c73f4e0e0c9b776acdd09db6bf8e232
t=2022-11-23T03:38:39+0000 lvl=info msg="State heal in progress"                 accounts=737,[email protected] slots=1,733,[email protected] [email protected] nodes=151,800,[email protected] pending=19681
t=2022-11-23T03:38:42+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="789.245µs" number=23,293,077 hash=0xd7509d939399f650e0a4ecf360bc9fac143b8fcbbfa97859d3aca279079e9446
t=2022-11-23T03:38:45+0000 lvl=info msg="Imported new block headers"             count=1   elapsed="417.178µs" number=23,293,078 hash=0x07cf42df33a19d63032dc11392167a8c17d0824147c4cc500eec4cb21b9e2535
t=2022-11-23T03:38:47+0000 lvl=info msg="State heal in progress"                 accounts=737,[email protected] slots=1,733,[email protected] [email protected] nodes=151,802,[email protected] pending=23696

It is using 1.7 TB of disk space. I changed the --syncmode param to full after the initial sync and restarted, but once the initialization logs pass, it returns to the endless state heal loop.
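One way to tell a genuine loop from slow progress is to track the `pending` counter across the "State heal in progress" lines. The following is a minimal Python sketch, not part of geth or BSC: the helper names `parse_heal_lines` and `pending_trend` are hypothetical, and the sample lines are shortened copies of the log above that keep only the `t=` and `pending=` fields. It assumes `pending` counts outstanding heal tasks; a falling value alone does not guarantee completion, since the pivot can move (see the "Pivot became stale" warning).

```python
import re
from datetime import datetime

# Captures the timestamp and the pending counter of a "State heal in progress" line.
HEAL_RE = re.compile(
    r't=(?P<ts>\S+)\s+lvl=info\s+msg="State heal in progress".*?\bpending=(?P<pending>\d+)'
)


def parse_heal_lines(lines):
    """Return (timestamp, pending) pairs for every state-heal log line."""
    samples = []
    for line in lines:
        match = HEAL_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S%z")
            samples.append((ts, int(match.group("pending"))))
    return samples


def pending_trend(samples):
    """Average change of `pending` per second between the first and last sample."""
    if len(samples) < 2:
        return None
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    seconds = (t1 - t0).total_seconds()
    if seconds <= 0:
        return None
    return (p1 - p0) / seconds


# Shortened sample lines (other fields omitted); values taken from the log above.
sample_log = [
    't=2022-11-23T03:38:10+0000 lvl=info msg="Imported new block headers" count=1',
    't=2022-11-23T03:38:13+0000 lvl=info msg="State heal in progress" pending=51447',
    't=2022-11-23T03:38:21+0000 lvl=info msg="State heal in progress" pending=49316',
    't=2022-11-23T03:38:47+0000 lvl=info msg="State heal in progress" pending=23696',
]

samples = parse_heal_lines(sample_log)
trend = pending_trend(samples)
print(f"{len(samples)} samples, pending changes by {trend:.1f} per second")
```

A trend that stays negative over hours suggests the heal is slowly converging; a value that keeps oscillating around the same level, as in the longer log above, is what this issue describes.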

0xChupaCabra · 2022-11-24T15:21:25Z

Same here on a 48-core server, with load average below 10%.

free -g
               total        used        free      shared  buff/cache   available
Mem:             754         316           6           0         431         432

Other discussions attribute this to the server not having enough resources to catch up.
I need to sync from scratch because the snapshot has its ancient data pruned.

mosinb · 2022-11-24T16:33:13Z

state heal

Hi, the state heal seems like a loop but usually it is not. This process can take time to finish, sometimes days.

Hi, you certainly have enough memory; try increasing --cache to a higher value. What IOPS does your storage deliver? And why do you need all the ancient data? Do you need the full historical data of the blockchain? If so, you may need to run an archive node.

Also, for this type of question I would recommend reaching out on Discord for a faster response: https://discord.gg/bnbchain

0xChupaCabra · 2022-11-24T16:38:18Z

Thanks for the suggestion. I just deleted the whole chain and restarted the node with the --syncmode full option.
The server has 7 TB NVMe drives; another one has 14 TB SSDs.

DaveWK · 2022-11-25T01:43:46Z

This happened to me recently with Ethereum geth (v1.10.25), and the suggestion was to use the rolling v1.11 version:
ethereum/go-ethereum#25865

After failing and getting stuck in state heal on v1.10.25, I can confirm that 1.11.0-unstable-1daea030 worked. You may be able to find the bug fix or improvement by bisecting the go-ethereum code.

DaveWK · 2022-11-25T02:01:31Z

I did a little bit of looking, and think I have narrowed it down to these 3 commits:
ethereum/go-ethereum#25651
ethereum/go-ethereum#25666
ethereum/go-ethereum#25694

forcodedancing · 2022-11-28T02:36:44Z

@DaveWK Thanks, we will look into it.

forcodedancing · 2022-12-03T12:43:58Z

@DaveWK A PR has been created: #1226. Thanks for your report and analysis.

jacobpake · 2022-12-12T09:10:35Z

@DaveWK, have you tried the PR? I'm running it now and am on the 3rd day of state heal.

forcodedancing self-assigned this Nov 23, 2022

forcodedancing added the question Further information is requested label Nov 23, 2022

This was referenced Nov 28, 2022

Fully synced from genesis - restarted node, now stuck in state heal #1205

Closed

Suggestion for running BSC nodes #875

Closed

ZiyuanStar mentioned this issue Jan 18, 2023

Endless "State heal in progress" #1277

Closed

brilliant-lx mentioned this issue Jan 31, 2023

SnapSync failed at: State Heal #1284

Closed

forcodedancing closed this as completed Mar 20, 2023

weiihann added the X-nodesync task filter for node sync issue: full, snap, light... label Dec 5, 2023
