-
Notifications
You must be signed in to change notification settings - Fork 867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify BitReader (~5-10% faster) #2381
Conversation
parquet/src/util/bit_util.rs
Outdated
} | ||
|
||
true | ||
self.skip(1, num_bits) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't actually called by anything I can find, so lets just keep it simple
} | ||
|
||
// TODO: better to avoid copying here | ||
Some(from_ne_slice(v.as_bytes())) | ||
} | ||
|
||
/// Skip one value of size `num_bits`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed as no longer needed, was only ever called by the implementation of BitReader::skip
which now does something simpler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parquet/src/util/bit_util.rs
Outdated
let bytes_to_read = cmp::min(self.total_bytes - self.byte_offset, 8); | ||
/// Populates `self.buffered_values` | ||
#[inline] | ||
fn load_buffer(&mut self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given there is a field called buffer
and this function is not loading it, but rather reading from it and loading into buffered_values
I recommend calling this function something slightly verbose:
fn load_buffer(&mut self) { | |
/// Loads up to the the next 8 bytes from `self.buffer` at `self.byte_offfset` | |
/// into `self.buffered_values`. Reads fewer than 8 bytes if there are fewer than 8 bytes left | |
fn load_buffered_values(&mut self) { |
@@ -429,7 +406,7 @@ impl BitReader { | |||
|
|||
let mut values_to_read = batch.len(); | |||
let needed_bits = num_bits * values_to_read; | |||
let remaining_bits = (self.total_bytes - self.byte_offset) * 8 - self.bit_offset; | |||
let remaining_bits = (self.buffer.len() - self.byte_offset) * 8 - self.bit_offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this PR is any better or worse, but what ensures that self.bit_offset (which can be up to 63) is always less than self.buffer.len() - self.byte_offset
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic on the next line, and similar variants of it. We only read a number of bits based on what remains
parquet/src/util/bit_util.rs
Outdated
return None; | ||
} | ||
|
||
// Only need to populate buffer if byte aligned |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment -- maybe some more context would help about why the buffer needs to be loaded if there are no more bits left in the currrent byte
self.reload_buffer_values(); | ||
v |= trailing_bits(self.buffered_values, self.bit_offset) | ||
.wrapping_shl((num_bits - self.bit_offset) as u32); | ||
if self.bit_offset != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic that decides when to reload self.buffered_values
is confusing to me as I would normally expect calling self.load_buffer()
would also reset bit_offset
to 0 to match having just reloaded more bits into self.buffered_values
Benchmark runs are scheduled for baseline = 76e79d9 and contender = 47a2c21. 47a2c21 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #.
Rationale for this change
This makes a couple of changes:
What changes are included in this PR?
total_bytes
making it easier for LLVM to elide bounds checksThe second of these is particularly impactful for DeltaBitPackedDecoder, where the miniblocks are in chunks of 32 and therefore never actually fall out of byte alignment.
Are there any user-facing changes?
No