Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port component storage to arrow-rs #8725

Merged
merged 41 commits into from
Jan 21, 2025
Merged

Port component storage to arrow-rs #8725

merged 41 commits into from
Jan 21, 2025

Conversation

emilk
Copy link
Member

@emilk emilk commented Jan 17, 2025

This makes our store run 100% on arrow-rs.

arrow2 is now relegated to the margins:

  • IPC
  • re_types_builder
  • Some legacy methods on Loggable, Archetype etc

@emilk emilk added 🏹 arrow concerning arrow 🚜 refactor Change the code, not the functionality exclude from changelog PRs with this won't show up in CHANGELOG.md labels Jan 17, 2025
Copy link

github-actions bot commented Jan 17, 2025

Web viewer built successfully. If applicable, you should also test it:

  • I have tested the web viewer
Result Commit Link Manifest
005ecc0 https://rerun.io/viewer/pr/8725 +nightly +main

Note: This comment is updated whenever you push a commit.

@emilk emilk force-pushed the emilk/arrow-re_chunk branch from 58f628c to e94a3b1 Compare January 17, 2025 14:45
---
ChunkStore {
id: test_id
config: ChunkStoreConfig { enable_changelog: true, chunk_max_bytes: 393216, chunk_max_rows: 4096, chunk_max_rows_if_unsorted: 1024 }
stats: {
num_chunks: 1
total_size_bytes: 1.3 KiB
total_size_bytes: 1.8 KiB
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed when we update arrow and can run .shrink_to_fit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, adding .shrink_to_fit to concat didn't help

Copy link

@emilk
Copy link
Member Author

emilk commented Jan 19, 2025

@rerun-bot full-check

Copy link

github-actions bot commented Jan 19, 2025

@emilk
Copy link
Member Author

emilk commented Jan 19, 2025

@rerun-bot full-check

Copy link

Started a full build: https://github.com/rerun-io/rerun/actions/runs/12855807735

@teh-cmc teh-cmc self-requested a review January 20, 2025 08:15
crates/store/re_dataframe/src/query.rs Show resolved Hide resolved
crates/store/re_log_types/src/arrow_msg.rs Outdated Show resolved Hide resolved
crates/utils/re_arrow_util/src/arrow_util.rs Outdated Show resolved Hide resolved
crates/utils/re_arrow_util/src/arrow_util.rs Show resolved Hide resolved
Comment on lines +267 to +269
show_labels: show_labels
.map(|b| !b.is_empty() && b.value(0))
.map(Into::into),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the way this was expressed before much more readable... is there no fallible get() method in arrow1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly not. All getters in arrow panics rather than return an Option.

@emilk
Copy link
Member Author

emilk commented Jan 20, 2025

@rerun-bot full-check

Copy link

github-actions bot commented Jan 20, 2025

@emilk
Copy link
Member Author

emilk commented Jan 20, 2025

I'm running all the benchmarks locally, comparing with 0.21.0, to see that we don't have any big regressions

@emilk emilk added the do-not-merge Do not merge this PR label Jan 20, 2025
@emilk
Copy link
Member Author

emilk commented Jan 21, 2025

Results of benchmarks

Comparing this PR with the 0.21.0 tag (dd025f1), notable changes (red is regression, green improvement):

  • Encoding log messages 🔻 20-30%. Likely unrelated to arrow, since we still use arrow2 for this
  • Decoding log message 🔻 15%
  • Decode message bundles 🌲 60%
  • Generate messages 🔻 25%
  • arrow_batch_strings2/query 🔻 10%
  • single_chunks/split_all 🌲 20%
  • vec_deque/remove_range/prefilled/middle 🌲 55% (maybe because of new std/compiler?)

So, some wins, some losses, but the losses are not huge.

@emilk emilk removed the do-not-merge Do not merge this PR label Jan 21, 2025
@emilk emilk merged commit 67f93c1 into main Jan 21, 2025
70 of 71 checks passed
@emilk emilk deleted the emilk/arrow-re_chunk branch January 21, 2025 07:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏹 arrow concerning arrow exclude from changelog PRs with this won't show up in CHANGELOG.md 🚜 refactor Change the code, not the functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants