Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large performance regression (compared to 0.21) for missing query cache #8668

Closed
Wumpf opened this issue Jan 13, 2025 · 4 comments · Fixed by #8684 or rerun-io/re_arrow2#19
Closed
Assignees
Labels
🪳 bug Something isn't working 🚀 performance Optimization, memory use, etc 🦟 regression A thing that used to work in an earlier release

Comments

@Wumpf
Copy link
Member

Wumpf commented Jan 13, 2025

This test scene alien_cake_addict.rrd.zip plays (!) now a lot slower.

On my Mac Rerun reports 90ms on 0.21 and 270ms cpu time on 92949cf.
When paused it it went from 50ms to 65ms (also bad, but not as crazy).

From a first glimpse it seems that the cost of not hitting the query cache has gone up considerably, but more investigation is needed.
This may be temporary due to the ongoing arrow/arrow2 conversion. (shipblocking in any case!).

@Wumpf Wumpf added 🚀 performance Optimization, memory use, etc 🪳 bug Something isn't working labels Jan 13, 2025
@Wumpf Wumpf added this to the 0.22 - ? milestone Jan 13, 2025
@Wumpf Wumpf changed the title Large performance regression for missing query cache Large performance regression (compared to 0.21) for missing query cache Jan 13, 2025
@Wumpf Wumpf added the 🦟 regression A thing that used to work in an earlier release label Jan 13, 2025
@emilk
Copy link
Member

emilk commented Jan 13, 2025

Since this is a scene with a huge number of objects, I suspect the added overhead of arrow1<->arrow2 conversions is to blame. The conversions are always zero-copy, but usually still involve a few small allocations and pointer-chasing, so they are by no means free.

@emilk emilk self-assigned this Jan 14, 2025
@emilk
Copy link
Member

emilk commented Jan 14, 2025

fn row_sliced is being called almost 9k times per frame, and is around 10µs slower than it was before (per call). I'm investigating why.

row_ids and time columns have both been switched to arrow1, but initial investigation indicates this is not what is slower. It could maybe be related to tagging (ComponentDescriptor)?

@emilk
Copy link
Member

emilk commented Jan 14, 2025

This is super-weird. It is this piece of code that has become slower:

https://github.com/rerun-io/rerun/blob/0.21.0/crates/store/re_chunk/src/slice.rs#L110-L118

The things is - the code hasn’t changed:

https://github.com/rerun-io/rerun/blob/main/crates/store/re_chunk/src/slice.rs#L109-L117

Nor has the types surrounding it. Maybe the new Rustc optimizes worse? I think I need to git bisect 😭

@emilk
Copy link
Member

emilk commented Jan 14, 2025

This is the PR that regressed performance:

But why?

Specifically this commit:

I wonder if the arrow2 -> arrow -> arrow2 conversions adds a null-array or something… 🤔

emilk added a commit that referenced this issue Jan 14, 2025
I'm not 100% sure why this happenes.
It looks like roundtripping `ListArray` marks the inner datatype field
as nullable, even when it wasn't originally.
I'm taking a look at fixing this in `re_arrow2`, but I wanted
to open this quick-fix in the meantime.

* Closes #8668
* Introduced in PR #8617
* Introduced in commit 78f676f
emilk added a commit to rerun-io/re_arrow2 that referenced this issue Jan 14, 2025
* Hoping to fix rerun-io/rerun#8668, but no
luck so far
emilk added a commit that referenced this issue Jan 16, 2025
* Part of #3741 
* [x] Tested that it does not regress
#8668

This makes `TransportChunk` a wrapper around an arrow `RecordBatch`.

### Future work
* Remove `TransportChunk` and replace it with an extension trait on
`RecordBatch`
* Simplify the dataframe API to always return a full `RecordBatch`
(adding a schema to the rows is basically free now)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🪳 bug Something isn't working 🚀 performance Optimization, memory use, etc 🦟 regression A thing that used to work in an earlier release
Projects
None yet
2 participants