fix(store/v2): Fix PebbleDB Iteration Edge Cases #18948

alexanderbez · 2024-01-04T22:00:09Z

Description

Thanks Kartik for finding this edge case.

Changelog

Ensure we create a copy of the PebbleDB mvccKey when decoding. The key buffer provided by the iterator is reused, which can cause nasty footguns when the decoded key is assigned to a variable.
Fix an edge case in PebbleDB iteration that could trigger an infinite loop.
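
To illustrate the first changelog item, here is a minimal, self-contained sketch of the aliasing hazard. It does not use PebbleDB: `buf` merely stands in for an iterator's reused key buffer, and `splitNoCopy` / `splitClone` are hypothetical helpers, not functions from the PR. `bytes.Clone` requires Go 1.20 or later.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitNoCopy returns a sub-slice that still aliases buf (hypothetical helper).
func splitNoCopy(buf []byte, n int) []byte { return buf[:n] }

// splitClone returns an independent copy of the same prefix (hypothetical helper).
func splitClone(buf []byte, n int) []byte { return bytes.Clone(buf[:n]) }

func main() {
	// buf stands in for a key buffer that an iterator reuses between positions.
	buf := []byte("keyC|v1")

	aliased := splitNoCopy(buf, 4)
	cloned := splitClone(buf, 4)

	// Simulate the iterator advancing and overwriting its buffer in place.
	copy(buf, "keyD|v2")

	// The aliased slice silently changed; the clone did not.
	fmt.Printf("aliased=%s cloned=%s\n", aliased, cloned) // aliased=keyD cloned=keyC
}
```

A comparison such as `bytes.Equal(tmpKey, currKey)` is only meaningful if the earlier key was copied before the underlying buffer changed, which is presumably why the clone matters for the iteration fix as well.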

Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

included the correct type prefix in the PR title
confirmed ! in the type prefix if API or client breaking change
targeted the correct branch (see PR Targeting)
provided a link to the relevant issue or specification
reviewed "Files changed" and left comments if necessary
included the necessary unit and integration tests
added a changelog entry to CHANGELOG.md
updated the relevant documentation or specification, including comments for documenting Go code
confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

confirmed the correct type prefix in the PR title
confirmed all author checklist items have been addressed
reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

coderabbitai · 2024-01-04T22:12:44Z

Walkthrough

The recent changes involve enhancements to the iteration mechanics in a storage system, specifically targeting the way iterators handle the progression to subsequent keys and versions. A notable update includes the safeguarding of key integrity in the SplitMVCCKey function, where keys are now cloned to prevent unsafe manipulations. Additionally, a new test has been introduced to verify the improved iteration behavior, particularly the ability to skip over versions when iterating through the database.

Changes

| File(s) | Summary |
| --- | --- |
| `store/storage/pebbledb/comparator.go`, `store/storage/pebbledb/iterator.go` | Modified the `Next` method for better handling of key/version iteration and enforced key safety by cloning keys in `SplitMVCCKey`. |
| `store/storage/storage_test_suite.go` | Added `TestDatabaseIterator_SkipVersion` to validate the skipping of versions during iteration. |


github-actions · 2024-01-04T22:12:51Z

@alexanderbez your pull request is missing a changelog!


tac0turtle · 2024-01-05T16:23:31Z

could you touch on the edge case being fixed here in the description. Helps with future readers

alexanderbez · 2024-01-05T17:28:13Z

> could you touch on the edge case being fixed here in the description. Helps with future readers

Yeah, so essentially @Kbhat1 and the team have been doing extensive battle testing of the SS PebbleDB backend and came across an edge case where iteration would go into an infinite loop. The edge case arises from a specific set of keys and the versions they're set at.

@Kbhat1 could you expand more on it?

coderabbitai · 2024-01-08T19:33:57Z

store/storage/storage_test_suite.go

+func (s *StorageTestSuite) TestDatabaseIterator_SkipVersion() {
+	db, err := s.NewDB(s.T().TempDir())
+	s.Require().NoError(err)
+
+	defer db.Close()
+
+	cs := store.NewChangeset(map[string]store.KVPairs{storeKey1: {
+		{Key: []byte("keyC"), Value: []byte("value003")},
+	}})
+	s.Require().NoError(db.ApplyChangeset(58827506, cs))
+
+	cs = store.NewChangeset(map[string]store.KVPairs{storeKey1: {
+		{Key: []byte("keyE"), Value: []byte("value000")},
+	}})
+	s.Require().NoError(db.ApplyChangeset(58827506, cs))
+
+	cs = store.NewChangeset(map[string]store.KVPairs{storeKey1: {
+		{Key: []byte("keyF"), Value: []byte("value000")},
+	}})
+	s.Require().NoError(db.ApplyChangeset(58827506, cs))
+
+	cs = store.NewChangeset(map[string]store.KVPairs{storeKey1: {
+		{Key: []byte("keyC"), Value: []byte("value004")},
+	}})
+	s.Require().NoError(db.ApplyChangeset(58833605, cs))
+
+	cs = store.NewChangeset(map[string]store.KVPairs{storeKey1: {
+		{Key: []byte("keyD"), Value: []byte("value006")},
+	}})
+	s.Require().NoError(db.ApplyChangeset(58833606, cs))
+
+	itr, err := db.Iterator(storeKey1, 58831525, []byte("key"), nil)
+	s.Require().NoError(err)
+	defer itr.Close()
+
+	count := make(map[string]struct{})
+	for ; itr.Valid(); itr.Next() {
+		count[string(itr.Key())] = struct{}{}
+	}
+
+	s.Require().Equal(3, len(count))
+}


The ast-grep command has provided the loop's body within the TestDatabaseIterator_SkipVersion function. The test increments a count based on the keys returned by the iterator, but there are no assertions on the values or explicit checks for skipped versions. This confirms the points raised in the review comment about the need for additional assertions to verify the correct key-value pairs and to ensure versions are being skipped as expected.

Analysis chain

The new test TestDatabaseIterator_SkipVersion has been added to ensure the iterator correctly skips versions. However, there are a few points to consider:

1. The test applies several changesets to the same version (58827506). This seems unusual as typically each changeset would correspond to a unique version. Verify if this is intentional and correct.

2. The test does not assert the expected values of the keys, only the count of unique keys. It would be more robust to also check the values to ensure the iterator is not only returning the correct number of keys but also the correct key-value pairs.

3. The test name and the keys used (keyC, keyE, keyF) suggest that it is testing the skipping of versions, but there is no explicit check that versions are being skipped. It would be beneficial to add assertions that verify the iterator is indeed skipping over certain versions as expected.

4. The iterator is created with a start key of []byte("key"), which does not match the pattern of the keys inserted (keyC, keyE, keyF). This could be intentional to test prefix iteration, but it should be clarified or corrected if it's a mistake.
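
To make the expected result of the test concrete, the sketch below models the same five writes in memory and computes what a read at version 58831525 should see. It is only a toy model under simple assumptions (no tombstones, no store keys, no PebbleDB); the `write` type and `visibleAt` function are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// write is one versioned key/value pair (hypothetical type for this sketch).
type write struct {
	key     string
	version uint64
	value   string
}

// testWrites mirrors the changesets applied in TestDatabaseIterator_SkipVersion.
func testWrites() []write {
	return []write{
		{"keyC", 58827506, "value003"},
		{"keyE", 58827506, "value000"},
		{"keyF", 58827506, "value000"},
		{"keyC", 58833605, "value004"},
		{"keyD", 58833606, "value006"},
	}
}

// visibleAt returns, per key, the value of the newest write whose version
// is <= readVersion. Writes at later versions are skipped entirely.
func visibleAt(writes []write, readVersion uint64) map[string]string {
	latest := make(map[string]uint64)
	out := make(map[string]string)
	for _, w := range writes {
		if w.version > readVersion {
			continue
		}
		if v, seen := latest[w.key]; !seen || w.version >= v {
			latest[w.key] = w.version
			out[w.key] = w.value
		}
	}
	return out
}

func main() {
	got := visibleAt(testWrites(), 58831525)

	keys := make([]string, 0, len(got))
	for k := range got {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, got[k])
	}
	// Prints:
	// keyC=value003
	// keyE=value000
	// keyF=value000
}
```

Under this model, keyD is absent and keyC resolves to value003 rather than value004, which is the kind of key-value assertion the review asks for in addition to the count of 3.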

The ast-grep command to verify the application of multiple changesets to the same version number 58827506 returned matches, confirming that this is indeed the case in the test. However, the second ast-grep command failed due to a syntax error. To proceed, we need to correct the pattern and rerun the command to verify the iterator's behavior in the test.

The ast-grep command has confirmed the presence of itr.Valid() within the TestDatabaseIterator_SkipVersion function. To understand the assertions made within the loop, we need to capture the body of the loop. Let's adjust the ast-grep pattern to do so.

The ast-grep command did not produce the expected output, which suggests a potential issue with the complexity of the pattern. To capture the body of the loop within the TestDatabaseIterator_SkipVersion function, we need to simplify the pattern.

The attempt to use the --within flag with ast-grep was not successful. We need to revert to a simpler pattern that directly targets the loop's body within the TestDatabaseIterator_SkipVersion function.

Scripts Executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that applying multiple changesets to the same version is correct.
ast-grep --lang go --pattern $'db.ApplyChangeset($_, $_)' | grep '58827506'
# Verify that the iterator is correctly skipping versions by checking the presence of specific key-value pairs.
ast-grep --lang go --pattern $'itr.Valid(); itr.Next() { $$$ }' | grep 'TestDatabaseIterator_SkipVersion'

Length of output: 921

Script:

#!/bin/bash
# Correct the ast-grep pattern to verify the iterator's behavior in the test.
ast-grep --lang go --pattern $'func ($_ *StorageTestSuite) TestDatabaseIterator_SkipVersion() { $$$ }' | grep 'itr.Valid'

Length of output: 195

Script:

#!/bin/bash
# Adjust the ast-grep pattern to capture the body of the loop within the TestDatabaseIterator_SkipVersion function.
ast-grep --lang go --pattern $'func ($_ *StorageTestSuite) TestDatabaseIterator_SkipVersion() { $$$ for ; itr.Valid(); itr.Next() { $$$ } $$$ }'

Length of output: 145

Script:

#!/bin/bash
# Simplify the ast-grep pattern to focus on the loop's body within the TestDatabaseIterator_SkipVersion function.
ast-grep --lang go --pattern $'for ; itr.Valid(); itr.Next() { $$$ }' --within $'func ($_ *StorageTestSuite) TestDatabaseIterator_SkipVersion() { $$$ }'

Length of output: 352

Script:

#!/bin/bash
# Attempt a simpler ast-grep pattern to directly target the loop's body within the TestDatabaseIterator_SkipVersion function.
ast-grep --lang go --pattern $'for ; itr.Valid(); itr.Next() { $$$ }'

Length of output: 16366

coderabbitai · 2024-01-09T18:38:26Z

store/storage/pebbledb/iterator.go

+	currKey, _, ok := SplitMVCCKey(itr.source.Key())
+	if !ok {
+		// XXX: This should not happen as that would indicate we have a malformed
+		// MVCC key.
+		panic(fmt.Sprintf("invalid PebbleDB MVCC key: %s", itr.source.Key()))
+	}


The use of panic should be replaced with proper error handling to prevent application crashes due to malformed MVCC keys.

- panic(fmt.Sprintf("invalid PebbleDB MVCC key: %s", itr.source.Key()))
+ // Handle error properly instead of panic
+ return fmt.Errorf("invalid PebbleDB MVCC key: %s", itr.source.Key())

Committable suggestion

❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change

 currKey, _, ok := SplitMVCCKey(itr.source.Key())
 if !ok {
 	// XXX: This should not happen as that would indicate we have a malformed
 	// MVCC key.
-	panic(fmt.Sprintf("invalid PebbleDB MVCC key: %s", itr.source.Key()))
+	// Handle error properly instead of panic
+	return fmt.Errorf("invalid PebbleDB MVCC key: %s", itr.source.Key())
 }

The panic in the Next method when encountering a malformed MVCC key is a drastic measure that could cause the entire application to crash. Consider replacing this with proper error handling that allows the application to recover gracefully from such an error.

coderabbitai · 2024-01-09T18:38:26Z

store/storage/pebbledb/iterator.go

 			return
 		}

-		// Move the iterator to the closest version to the desired version, so we
+		// Move the iterator to the closest version of the desired version, so we
 		// append the current iterator key to the prefix and seek to that key.
 		itr.valid = itr.source.SeekLT(MVCCEncode(nextKey, itr.version+1))

+		tmpKey, _, ok := SplitMVCCKey(itr.source.Key())
+		if !ok {
+			// XXX: This should not happen as that would indicate we have a malformed
+			// MVCC key.
+			itr.valid = false
+			return
+		}
+
+		// There exists cases where the SeekLT() call moved us back to the same key
+		// we started at, so we must move to next key, i.e. two keys forward.
+		if bytes.Equal(tmpKey, currKey) {
+			if itr.source.NextPrefix() {
+				itr.Next()
+			} else {
+				itr.valid = false
+				return
+			}
+		}
+
 		// The cursor might now be pointing at a key/value pair that is tombstoned.
 		// If so, we must move the cursor.
 		if itr.valid && itr.cursorTombstoned() {


Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [135-171]

The logic for handling reverse iteration and seeking to the next prefix seems to be missing the actual implementation for reverse iteration, as the SeekLT method is called regardless of the itr.reverse value. This could be an oversight or incomplete implementation. The code should either implement the reverse iteration logic or remove the itr.reverse check if reverse iteration is not supported.
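
The "moved us back to the same key" case described in the hunk's comment can be reproduced with a small in-memory model using the key/version pairs from the new test. This is a sketch under stated assumptions: entries are assumed to sort by user key and then by ascending version, and `seekLT` / `nextPrefix` are simplified stand-ins written for this example, not Pebble's API.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a decoded MVCC key: user key plus version (hypothetical type).
type entry struct {
	key     string
	version uint64
}

// less orders by user key, then by ascending version (assumed ordering).
func less(a, b entry) bool {
	if a.key != b.key {
		return a.key < b.key
	}
	return a.version < b.version
}

// seekLT returns the index of the largest entry strictly less than target,
// or -1 if there is none.
func seekLT(entries []entry, target entry) int {
	i := sort.Search(len(entries), func(i int) bool { return !less(entries[i], target) })
	return i - 1
}

// nextPrefix returns the index of the first entry with a different user key
// after position i, or -1 if there is none.
func nextPrefix(entries []entry, i int) int {
	for j := i + 1; j < len(entries); j++ {
		if entries[j].key != entries[i].key {
			return j
		}
	}
	return -1
}

// testEntries holds the writes from the test, already in sorted order.
func testEntries() []entry {
	return []entry{
		{"keyC", 58827506},
		{"keyC", 58833605},
		{"keyD", 58833606},
		{"keyE", 58827506},
		{"keyF", 58827506},
	}
}

func main() {
	const readVersion = 58831525
	entries := testEntries()

	cur := 0                      // cursor at keyC@58827506
	np := nextPrefix(entries, cur) // lands on keyD@58833606, newer than readVersion
	back := seekLT(entries, entry{key: entries[np].key, version: readVersion + 1})

	// keyD has no version <= readVersion, so the seek falls back into keyC.
	fmt.Println(entries[np].key, entries[back].key, entries[back].key == entries[cur].key)
	// Prints: keyD keyC true
}
```

In this model the seek returns to the key the iterator started from, which matches the condition the added `bytes.Equal(tmpKey, currKey)` check guards against by advancing to the next prefix again.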

coderabbitai · 2024-01-09T18:38:26Z

store/storage/pebbledb/iterator.go

 	// First move the iterator to the next prefix, which may not correspond to the
 	// desired version for that key, e.g. if the key was written at a later version,
-	// so we seek back to the latest desired version, s.t. the version is <= itr.version.
+	// so we seek back to the latest desired version, s.t. the version <= itr.version.
 	if next {
 		nextKey, _, ok := SplitMVCCKey(itr.source.Key())
 		if !ok {


The logic for handling reverse iteration and seeking to the next prefix seems to be missing the actual implementation for reverse iteration, as the SeekLT method is called regardless of the itr.reverse value. This could be an oversight or incomplete implementation. The code should either implement the reverse iteration logic or remove the itr.reverse check if reverse iteration is not supported.

facundomedica

utACK, for other reviewers: bytes.Clone == Sei's utils.Clone (it was introduced in Go 1.20, they might be a version behind or something)

updates

c00509e

github-actions bot added the C:Store label Jan 4, 2024

alexanderbez added 3 commits January 4, 2024 14:01

updates

1603129

updates

c1f8dfe

updates

ffa58c6

alexanderbez marked this pull request as ready for review January 4, 2024 22:12

alexanderbez requested a review from a team as a code owner January 4, 2024 22:12

alexanderbez requested review from facundomedica, tac0turtle and cool-develope January 4, 2024 22:12

coderabbitai bot reviewed Jan 4, 2024

alexanderbez enabled auto-merge January 5, 2024 03:27

Merge branch 'main' into bez/pebble-fixes

252b87e

cool-develope approved these changes Jan 8, 2024

Merge branch 'main' into bez/pebble-fixes

786827a

alexanderbez disabled auto-merge January 8, 2024 19:26

alexanderbez added 2 commits January 8, 2024 11:30

updates

525382b

Merge branch 'bez/pebble-fixes' of github.com:cosmos/cosmos-sdk into bez/pebble-fixes

a2c09d9

alexanderbez marked this pull request as draft January 8, 2024 19:31

coderabbitai bot reviewed Jan 8, 2024

alexanderbez and others added 3 commits January 9, 2024 10:33

updates

c730f29

updates

ced49f3

Merge branch 'main' into bez/pebble-fixes

d7ac611

alexanderbez marked this pull request as ready for review January 9, 2024 18:35

alexanderbez enabled auto-merge January 9, 2024 18:35

coderabbitai bot reviewed Jan 9, 2024

facundomedica approved these changes Jan 10, 2024

alexanderbez added this pull request to the merge queue Jan 10, 2024

Merged via the queue into main with commit 0b12995 Jan 10, 2024
57 of 58 checks passed

alexanderbez deleted the bez/pebble-fixes branch January 10, 2024 08:56

relyt29 pushed a commit to relyt29/cosmos-sdk that referenced this pull request Jan 22, 2024

fix(store/v2): Fix PebbleDB Iteration Edge Cases (cosmos#18948)

f858f8c
