Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: bson should handle nulls #2485

Merged
merged 9 commits into from
Jan 30, 2024
Merged

fix: bson should handle nulls #2485

merged 9 commits into from
Jan 30, 2024

Conversation

tychoish
Copy link
Contributor

@tychoish tychoish commented Jan 24, 2024

Closes #2484

f.write(bson.encode({"a": None}))

with glaredb_connection.cursor() as curr:
curr.execute(f"select count(*) from '{data_path}'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the count(*) query was previously working, We should explicitly check the values

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran this test without the patch and it failed.

Because this uses the streaming bson table/file code, which uses the same bson-to-recordbatch as mongodb, we would expect the count to break here too.

@universalmind303 universalmind303 linked an issue Jan 24, 2024 that may be closed by this pull request
@tychoish tychoish requested a review from scsmithr January 29, 2024 20:40
Copy link
Contributor

@universalmind303 universalmind303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add the example in the original issue as a slt.

#2484

@tychoish
Copy link
Contributor Author

I think adding this as an SLT is pretty fragile and doesn't provide a lot of value. The code gets exercised. Given the quality of the mongo docker container we use and our mechanisms for getting data into it for SLTs fixtures, it seems precarious. I'm also sort of dubious about keeping the SLT fixtures in a shell script that's pretty far away from the actual tests in question. (local testing is hard, knowing where to fix it is hard, shell scripts like this are kind of grenades, etc.)

I think there's a project to be had about adding better support for fixtures in the SLTs, though I'm not sure where it would make sense to prioritize. And in light of that, I'm not convinced that we need to bodge in tests just because...

@universalmind303
Copy link
Contributor

I think adding this as an SLT is pretty fragile and doesn't provide a lot of value.

I disagree. As a general rule, If an issue is opened with a specific reproducible example, I include that example in the tests to ensure that that specific case is tested against. This indirectly. Implicitly, i know that this bson stuff should fix the mongo issue, but it doesn't explicitly test the example I provided.

I agree that we shouldn't be creating a bunch of test data in those shell scripts, but we can create a bson file with that data & include it in our testdata dir, and do a mongoimport. In our testdata, we have a script to generate iceberg data, then that data is checked into source control. We should do something similar with bson.

@tychoish tychoish enabled auto-merge (squash) January 30, 2024 02:05
@tychoish
Copy link
Contributor Author

I've written an additional test.

@tychoish tychoish merged commit 7dac252 into main Jan 30, 2024
21 checks passed
@tychoish tychoish deleted the tycho/fix-bson-null-handling branch January 30, 2024 15:59
tychoish added a commit that referenced this pull request Feb 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Invalid null handling in mongo collections
3 participants