-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: remove legacy orphans background #863
Conversation
WalkthroughThe updates involve refactoring the pruning logic of a Go application's V1. The changes focus on improving the handling of orphan nodes by introducing a new method to delete them and altering the existing pruning function to work asynchronously. This approach aims to enhance performance by preventing the main thread I/O from being blocked, and it also adds the flexibility to manually trigger the pruning process via an exposed API. Changes
Assessment against linked issues
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on X ? TipsChat with CodeRabbit Bot (
|
go func() { | ||
if err := ndb.deleteOrphans(); err != nil { | ||
ndb.logger.Error("failed to clean legacy orphans", "err", err) | ||
} | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of a goroutine to call deleteOrphans
is a good approach to avoid blocking the main thread. However, error handling within the goroutine should be improved. Currently, if an error occurs, it is only logged. Consider adding a mechanism to handle these errors more robustly, such as a channel to communicate errors back to the main thread or a callback.
count := 0 | ||
for ; itr.Valid(); itr.Next() { | ||
if err := ndb.batch.Delete(itr.Key()); err != nil { | ||
return err | ||
} | ||
|
||
// Sleep for a while to avoid blocking the main thread i/o. | ||
count++ | ||
if count > 1000 { | ||
count = 0 | ||
time.Sleep(100 * time.Millisecond) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sleep mechanism in deleteOrphans
is implemented to avoid blocking the main thread I/O by sleeping after every 1000 deletions. This is a simple rate-limiting strategy, but it may not be optimal. Consider using a more dynamic approach, such as monitoring the system's load or I/O wait time, to adjust the sleep duration or the number of operations between sleeps.
Is this concurrency safe with main thread commits? (This can add to deleting while the main thread is committing right?) Do we need a mutex to ensure the two don't try to add to batch at the same time? |
yeah, the test is failed with data racing |
// the batch is flushed to disk, cleared, and a new one is created with buffer pre-allocated to threshold. | ||
// The addition entry is then added to the batch. | ||
func (b *BatchWithFlusher) Set(key, value []byte) error { | ||
b.mtx.Lock() | ||
defer b.mtx.Unlock() | ||
|
||
batchSizeAfter, err := b.estimateSizeAfterSetting(key, value) | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [49-81]
Consider protecting Write
, WriteSync
, Close
, and GetByteSize
methods with the mutex to prevent potential race conditions.
This is the wrong spot to put it, we hacked something that worked in this branch https://github.com/cosmos/iavl/tree/dev/v1.0.0-orphan (the first part of the loop is what was expensive) I think the way you were doing the code is cleaner, just that was something that I got to work. Plz feel free to takeover/move anything that helps 🙏 |
close in favor of #877 |
Closes: #862
Context
BatchWithFlusher
to avoid the race condition