-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added support for the legacy compressions (shrink/reduce/implode). #120
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea behind this PR, but the implementation isn't idiomatic Rust and wouldn't be very inefficient. I may have more changes to request once these are done.
src/legacy/shrink.rs
Outdated
struct Codetab { | ||
prefix_code: u16, // INVALID_CODE means the entry is invalid. | ||
ext_byte: u8, | ||
len: u16, | ||
last_dst_pos: usize, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
struct Codetab { | |
prefix_code: u16, // INVALID_CODE means the entry is invalid. | |
ext_byte: u8, | |
len: u16, | |
last_dst_pos: usize, | |
} | |
#[repr(packed)] | |
struct Codetab { | |
last_dst_pos: usize, | |
prefix_code: Option<u16>, | |
ext_byte: u8, | |
len: u16 | |
} |
This should fit into 128 bits on 64-bit platforms and 96 bits on 32-bit platforms, all with no penalty for an unaligned access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust doesn't like that - I get errors. Why repr(packed) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It prevents any alignment padding. Usually scalar values are aligned on their size, and arrays on their element size.
Wow - thanks for the code review :). Will look into it. So I tried to go through the comments. Maybe I've forgotton some - there were many. For short: I've looked into bitstream-io a bit. I'm not sure if it's worth switching to it. It just pulls in another dependency for just ~60 LOC or so. A different thing is the huffman implementation IMO. I'll try to search for a huffman crate that can be used. bitstream-io contains some huffman stuff as well so it may be worth using - but haven't looked really into it. |
e35d857
to
a2f1236
Compare
src/legacy/shrink.rs
Outdated
struct Codetab { | ||
prefix_code: u16, // INVALID_CODE means the entry is invalid. | ||
ext_byte: u8, | ||
len: u16, | ||
last_dst_pos: usize, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It prevents any alignment padding. Usually scalar values are aligned on their size, and arrays on their element size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, hwzip author here :)
This is great to see! I'm only a Rust newbie, so probably can't help much with the review, but am happy to answer any questions about the original code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I managed to get through over half of the PR this time!
src/legacy/huffman.rs
Outdated
self.sentinel_bits[len] = s; | ||
debug_assert!(self.sentinel_bits[len] >= code[len] as u32, "No overflow!"); | ||
sym_idx[len] = sym_idx[len - 1] + count[len - 1]; | ||
self.offset_first_sym_idx[len] = sym_idx[len].wrapping_sub(code[len]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this wrapped around, would it indicate a bug? If so, use -
instead of wrapping_sub
(behaves the same in release builds but is checked for overflow in debug builds).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know for sure at least unit tests fail.
ImplodeDecoder { | ||
compressed_reader: inner, | ||
uncompressed_size, | ||
stream_read: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's usually not the custom Read impl that decides whether or not to chunk -- those who want chunking will wrap this in a BufReader, and it's best to be prepared for that eventuality even if it's unlikely.
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
This reverts commit fafdaf1.
Signed-off-by: Chris Hennick <[email protected]>
Signed-off-by: Chris Hennick <[email protected]>
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
This reverts commit 5d3654e.
Co-authored-by: Chris Hennick <[email protected]> Signed-off-by: Mike Krüger <[email protected]>
|
Signed-off-by: Chris Hennick <[email protected]>
Signed-off-by: Chris Hennick <[email protected]>
Signed-off-by: Chris Hennick <[email protected]>
This PR probably needs to have its rebase manually adjusted due to #210. |
For a project that will encounter old .zip files I need a rust library that supports shrink/reduce and implode.
I didn't find any so I ported the hw-zip 2.2 implementations to rust: https://www.hanshq.net/zip.html