Syncing issue after v0.7.0 #3837

Closed
jennijuju opened this issue Sep 14, 2020 · 9 comments
Labels
area/chain Area: Chain kind/discussion Kind: Discussion

Comments

@jennijuju
Member

jennijuju commented Sep 14, 2020

This is a ticket to collect the various syncing issues reported by miners.

Information to provide:

  • What's your machine spec?
  • What issue did you run into? Did you see any error messages? If so, which ones?
  • Did it happen before 0.7.0?

Feel free to add anything that you think can be helpful!

@jennijuju jennijuju added the kind/discussion Kind: Discussion label Sep 14, 2020
@jimpick
Contributor

jimpick commented Sep 14, 2020

I just experienced a node sync issue in the past hour. It's a 16MB machine for making deals - I'm not running a miner. After upgrading to master at 86607452d7bbe139520aebcb1b9abb81f1a8aac6, it appeared to be stuck. Two other identical nodes I upgraded at the same time did manage to sync. After restarting the node, it synced again.

@mishmosh
Contributor

mishmosh commented Sep 14, 2020

The Protofire team noticed their 10 client nodes falling out of sync when low on disk space. Via @magik6k:

oof, we really need to make sure running out of disk space isn't breaking the chainstore.
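
For anyone who wants to rule out low disk space before reporting a sync problem, a minimal sketch of a free-space check follows. The threshold and the helper names are hypothetical, and the path to check is an assumption: point it at whichever filesystem holds your lotus repo (commonly the directory set in `LOTUS_PATH`, if you use that variable).

```python
import shutil

# Hypothetical threshold (50 GiB); pick a value that suits your own node.
MIN_FREE_BYTES = 50 * 1024**3


def free_bytes(path: str) -> int:
    """Return the free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


def is_low_on_space(path: str, min_free: int = MIN_FREE_BYTES) -> bool:
    """Return True when the filesystem holding `path` has less than `min_free` bytes free."""
    return free_bytes(path) < min_free
```

Calling `is_low_on_space("/path/to/lotus/repo")` from a cron job or monitoring script would flag the condition before the node itself runs into it. This only detects the condition; it does not address the chainstore behaviour discussed above.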

@kenshyx
Contributor

kenshyx commented Sep 14, 2020

Every time I submit a message, there is a high chance that my chain gets corrupted. This also happened before v0.7, but it is more frequent now.
I'm running the lotus node and the miner on a dual Intel CPU machine with 32 cores and 512 GB RAM; for storage I have a couple of Seagate Exos X16 drives.

{"level":"error","ts":"2020-09-14T22:55:30.543+0200","logger":"chain","caller":"chain/sync_manager.go:389","msg":"sync error: collectChain failed:\n github.com/filecoin-project/lotus/chain.(*Syncer).Sync\n /github/lotus/chain/sync.go:570\n - chain linked to block marked previously as bad ([bafy2bzacebszbfiaccvaxdymfwu4kv4ldr4xv5g37mkqt7tkf5udbu2tv2twk], bafy2bzacebylzuuus54eq3gboopk7s34vik6ny7tvffhmp5at3pmmyfwgmvke) (reason: linked to bafy2bzacecs5n76cqwppdinuwoaufxe7cjhsw6h47lsj3b33wzwxqcw63vwv4 caused by: [bafy2bzaceaz4kd4w7wkfeqiavarvpk3ul2u72sxt4a2k3urvtracxmugpvtlc] parent state root did not match computed state (bafy2bzacecfl7hog2au3rjl4tz5myaivzwbiyeh2rx2dygphq7gntzcqiakte != bafy2bzacebat4ah2jxnelsoapr77vh45eyicjcryrph66jhb3ivrtratb7plo)):\n github.com/filecoin-project/lotus/chain.(*Syncer).collectHeaders\n /github/lotus/chain/sync.go:1240"}
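
Log lines like the one above are JSON objects with the stack trace packed into `msg`, which makes them hard to compare across reports. A minimal sketch for pulling out the first line of the message and the two mismatched state roots follows. The function name and the regular expression are illustrative assumptions based only on the format of this one log line, not on any lotus tooling.

```python
import json
import re

# Matches "(<cid> != <cid>)" as it appears in the "parent state root did not
# match computed state" part of the message. Based on this one log line only.
STATE_MISMATCH = re.compile(r"\((bafy\w+) != (bafy\w+)\)")


def summarize_sync_error(line: str) -> dict:
    """Extract the level, caller, first message line, and mismatched state roots."""
    entry = json.loads(line)
    msg = entry.get("msg", "")
    match = STATE_MISMATCH.search(msg)
    return {
        "level": entry.get("level"),
        "caller": entry.get("caller"),
        "summary": msg.split("\n", 1)[0],
        # (left, right) in the order they are printed in the message.
        "state_roots": match.groups() if match else None,
    }


# Shortened sample built from fields of the log line above.
sample = json.dumps({
    "level": "error",
    "caller": "chain/sync_manager.go:389",
    "msg": "sync error: collectChain failed:\n parent state root did not match "
           "computed state (bafy2bzacecfl7hog2au3rjl4tz5myaivzwbiyeh2rx2dygphq7gntzcqiakte"
           " != bafy2bzacebat4ah2jxnelsoapr77vh45eyicjcryrph66jhb3ivrtratb7plo)",
})
summary = summarize_sync_error(sample)
```

Feeding each error line from the daemon log through `summarize_sync_error` gives a compact record that is easier to paste into a report or to compare between nodes.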

@jennijuju
Member Author

Related: #3840

@jennijuju jennijuju added the area/chain Area: Chain label Sep 15, 2020
@jennijuju
Member Author

From 可祤: AMD 7452, 566 GB memory, 2 TB SSD. The error is "chain linked to block marked previously as bad".

@creaz79

creaz79 commented Sep 15, 2020

I have noticed that when I run lotus as a systemd service (using the default service file), there is a chance of hitting "failed to validate blocks random beacon values", followed by "block linked to block marked as bad".
If I then stop the service and run "lotus daemon" manually, there is no error and the sync proceeds, but when running as a service it never gets past this block.


6 participants