Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Z-order sorting option in Optimize operation #1127

Closed
wjones127 opened this issue Feb 5, 2023 · 5 comments · Fixed by #1429
Closed

Implement Z-order sorting option in Optimize operation #1127

wjones127 opened this issue Feb 5, 2023 · 5 comments · Fixed by #1429
Assignees
Labels
binding/rust Issues for the Rust crate enhancement New feature or request

Comments

@wjones127
Copy link
Collaborator

Description

Would like to be able to z-order tables. We probably need to implement a z-order sorting function in apache/arrow-rs first.

Use Case

Related Issue(s)

Probably should prioritize #1125 first.

@wjones127 wjones127 added enhancement New feature or request binding/rust Issues for the Rust crate labels Feb 5, 2023
@wjones127 wjones127 mentioned this issue Feb 5, 2023
21 tasks
@MrPowers
Copy link
Contributor

It'd be great if someone could grab this. Z ORDERing provides a "killer feature" functionality to end users. I think this functionality will get us a lot of users.

@wjones127
Copy link
Collaborator Author

Design doc for Z-order is here: https://docs.google.com/document/d/1TYFxAUvhtYqQ6IHAZXjliVuitA5D1u793PMnzsH_3vs/edit#heading=h.q7ah6pca24ul

Note that DataFusion has a spilling sort node, which can sort arbitrarily large inputs: apache/datafusion#1568. We should try to reuse that if possible.

However, we should still get optimize working in basic cases first.

@MrPowers
Copy link
Contributor

Here's the suggested interface for Z ORDER: delta_table.optimize.z_order(["col1", "col2"]).

Z Ordering allows users to fully enjoy the power of data skipping.

Even a naive implementation that simply sorts the data would be really useful for the data community. We could release a naive algorithm to start and then swap in a more fancy implementation later.

Even an implementation that simply sorts on one column would be a huge win right now.

@wjones127 wjones127 self-assigned this May 21, 2023
@wjones127
Copy link
Collaborator Author

TODO:

  • Implement z-order sort key function
  • Register function as a scalar UDF
  • Implement sorting query that sorts the data by the z-order sort key, then drops the key

z-order sort key function

  1. Map each column to a one or more 4-byte (32-bit) value. Variable width columns can map to a configurable number of int32s; for example strings we'll use the first 4 x N bytes of the string.
  2. Interleave the bits of the mapped values into a single binary array. That array is the z-order sort key.

@roeap
Copy link
Collaborator

roeap commented May 30, 2023

I may be completely off here, as I have not yet gone too deep into how the transformation in z-order curves actually work. at least on a high level though I do believe that the arrow-row package - at least conceptually - does something very similar, in that it reduces the multi-columnar representation to one that can be operated on (and sorted) as a single scalar-like value.

Main difference would be that it aims to sort lexicographically, rather then by some other means. Thus (and agian, haven't gone deep on that one) I wonder if we can take inspiration from the arrow row crate and "just" change the kernel that handles the reduction do one dimensional space?

https://arrow.apache.org/blog/2022/11/07/multi-column-sorts-in-arrow-rust-part-2/

wjones127 added a commit that referenced this issue Jun 6, 2023
# Description

Implements Z-order in Rust. This is a very basic version that requires
loading the whole partition into memory for sorting. In the future, we
can implement a DataFusion-based code path that allows sorting with
spilling to disk for even larger optimize jobs.

The Z-order function here is based on Arrow's row format. We truncate to
take the first 16 bytes of data (padding with zeros at end as
necessary). So for variable-width columns like strings, this means that
we are only using the first 15 bytes of the string (the first byte is
used to differentiate null and empty strings, [see row format
docs](https://docs.rs/arrow-row/40.0.0/arrow_row/struct.RowConverter.html#variable-length-bytes-including-strings-encoding)).
If a user has a string column where they all share the same prefix, this
z-order function won't work well for them. But in many common cases it
will work.

We'll also expose this in Python as a follow up.

# Related Issue(s)

- closes #1127


# Documentation

<!---
Share links to useful documentation
--->

---------

Co-authored-by: Robert Pack <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants