Add Buffered<T> to add buffering to byte streams. #3405

asynts · 2020-09-05T17:39:38Z

This is more of a collection of loosely related stuff.

I added a Buffered<T> template that adds buffering to any class that inherits from InputStream or OutputStream.
I've tried to get the Gzip implementation to a state where it can correctly decompress ordinary files (as opposed to files constructed to test one specific behavior). I already found and fixed a few bugs, but there still seems to be a bug buried in DeflateDecompressor::decode_distance.
I also accumulated a few unit tests that I wrote to track down errors.

It started to become annoying to rebase so I thought I'd merge some of it.
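To illustrate the buffering idea mentioned above, here is a minimal sketch of a wrapper template that sits in front of a byte source and serves small reads from an internal buffer. It is not the AK implementation: `ByteSource`, `MemorySource`, `BufferedSketch` and the 4096-byte default capacity are hypothetical names and values chosen for this example, and only the input side is shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stand-in for a byte source; NOT the AK::InputStream interface.
class ByteSource {
public:
    virtual ~ByteSource() = default;
    // Copies up to `size` bytes into `out`; returns the number copied (0 at end).
    virtual std::size_t read(std::uint8_t* out, std::size_t size) = 0;
};

// In-memory source that counts how often it is actually asked for data.
class MemorySource : public ByteSource {
public:
    explicit MemorySource(std::vector<std::uint8_t> data)
        : m_data(std::move(data))
    {
    }

    std::size_t read(std::uint8_t* out, std::size_t size) override
    {
        ++read_calls;
        std::size_t n = std::min(size, m_data.size() - m_offset);
        if (n > 0)
            std::memcpy(out, m_data.data() + m_offset, n);
        m_offset += n;
        return n;
    }

    std::size_t read_calls = 0;

private:
    std::vector<std::uint8_t> m_data;
    std::size_t m_offset = 0;
};

// Wraps any Source and refills an internal buffer with one large read at a time.
template<typename Source, std::size_t Capacity = 4096>
class BufferedSketch : public ByteSource {
public:
    template<typename... Args>
    explicit BufferedSketch(Args&&... args)
        : m_source(std::forward<Args>(args)...)
    {
    }

    std::size_t read(std::uint8_t* out, std::size_t size) override
    {
        assert(out != nullptr || size == 0);
        std::size_t copied = 0;
        while (copied < size) {
            if (m_begin == m_end) {
                // Buffer is empty: refill it from the wrapped source.
                m_begin = 0;
                m_end = m_source.read(m_buffer, Capacity);
                if (m_end == 0)
                    break; // wrapped source is exhausted
            }
            std::size_t n = std::min(size - copied, m_end - m_begin);
            std::memcpy(out + copied, m_buffer + m_begin, n);
            m_begin += n;
            copied += n;
        }
        return copied;
    }

    Source& underlying() { return m_source; }

private:
    Source m_source;
    std::uint8_t m_buffer[Capacity];
    std::size_t m_begin = 0;
    std::size_t m_end = 0;
};
```

With this shape, reading a 10000-byte source one byte at a time through `BufferedSketch<MemorySource>` touches the wrapped source only once per refill instead of once per byte, which is the kind of saving a tool like gunzip would be after.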

I suspected an error in CircularDuplexStream::read(Bytes, size_t). This does not appear to be the case, but the test case is useful regardless.

The following script was used to generate the test:

    import gzip

    uncompressed = []

    for _ in range(0x100):
        uncompressed.append(1)
    for _ in range(0x7e00):
        uncompressed.append(0)
    for _ in range(0x100):
        uncompressed.append(1)

    compressed = gzip.compress(bytes(uncompressed))
    compressed = ", ".join(f"0x{byte:02x}" for byte in compressed)

    print(f"""\
    TEST_CASE(gzip_decompress_repeat_around_buffer)
    {{
        const u8 compressed[] = {{
            {compressed}
        }};

        u8 uncompressed[0x8011];
        Bytes{{ uncompressed, sizeof(uncompressed) }}.fill(0);
        uncompressed[0x8000] = 1;

        const auto decompressed = Compress::GzipDecompressor::decompress_all({{ compressed, sizeof(compressed) }});

        EXPECT(compare({{ uncompressed, sizeof(uncompressed) }}, decompressed.bytes()));
    }}
    """, end="")

The streaming operator doesn't short-circuit. Consider the following snippet:

    void foo(InputStream& stream) {
        int a, b;
        stream >> a >> b;
    }

If the first read fails, the second is called regardless. It should be well defined what happens in this case: nothing.
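One way to get the "nothing happens" behavior is a sticky error flag that every read checks first. The sketch below shows that idea in isolation; `SketchInputStream`, `read_or_error`, `has_error` and `offset` are hypothetical names for this example and are not claimed to match the AK stream classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical minimal stream with a sticky error flag; NOT the AK API.
class SketchInputStream {
public:
    explicit SketchInputStream(std::vector<std::uint8_t> data)
        : m_data(std::move(data))
    {
    }

    // Fills exactly `size` bytes or fails. After a failure, every later call
    // returns immediately without consuming input or touching `out`.
    bool read_or_error(void* out, std::size_t size)
    {
        assert(out != nullptr);
        if (m_error)
            return false;
        if (m_data.size() - m_offset < size) {
            m_error = true;
            return false;
        }
        std::memcpy(out, m_data.data() + m_offset, size);
        m_offset += size;
        return true;
    }

    bool has_error() const { return m_error; }
    std::size_t offset() const { return m_offset; }

private:
    std::vector<std::uint8_t> m_data;
    std::size_t m_offset = 0;
    bool m_error = false;
};

// Chained operators are always all invoked; the flag makes the later ones no-ops.
inline SketchInputStream& operator>>(SketchInputStream& stream, std::int32_t& value)
{
    stream.read_or_error(&value, sizeof(value));
    return stream;
}

inline SketchInputStream& operator>>(SketchInputStream& stream, std::uint8_t& value)
{
    stream.read_or_error(&value, sizeof(value));
    return stream;
}
```

With only two bytes of input, `stream >> a >> c` for a 32-bit `a` and an 8-bit `c` fails on the first read. The second read would succeed on its own, but because the error is sticky it leaves `c` and the read position untouched.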

Symbols that need <= 8 bits hit a fast path as of #18075, but the slow path has done a full binary search over all symbols ever since this code was added in #2963. (#3405 even added a FIXME for doing this, but #18075 removed it.)

Instead of doing a binary search over all codes for every single bit read, this implements the Moffat-Turpin approach described at https://www.hanshq.net/zip.html#huffdec, which only requires a table read per bit.

    hyperfine 'Build/lagom/bin/unzip ~/Downloads/enwik8.zip'
    1.008 s ± 0.016 s => 957.7 ms ± 3.9 ms, 5% faster

Due to issue #25005, we can't peek the full 15 bits at once but have to read them one-by-one. This makes the code look a bit different than in the linked article.

I also tried not changing CanonicalCode::from_bytes() too much. It does 15 passes over all symbols. I think it could do it in a single pass instead. But that's for a future change.

No behavior change (other than slightly faster perf).
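For readers who want to see what "one table read per bit" can look like, here is a small sketch of bit-by-bit canonical Huffman decoding. It uses the per-length counting formulation known from zlib's contrib/puff, which is a close relative of the approach in the linked article; it is not the code from this change, and `CanonicalTable`, `build_table`, `decode_symbol` and `decode_from_string` are hypothetical names for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

constexpr int max_code_length = 15; // DEFLATE limits code lengths to 15 bits

// Hypothetical table layout; NOT LibCompress's CanonicalCode.
struct CanonicalTable {
    std::array<std::uint16_t, max_code_length + 1> count {}; // codes per length
    std::vector<std::uint16_t> symbols;                      // sorted by (length, symbol)
};

// Builds the table from per-symbol code lengths (0 means "symbol unused").
CanonicalTable build_table(const std::vector<std::uint8_t>& lengths)
{
    CanonicalTable table;
    for (auto length : lengths) {
        assert(length <= max_code_length);
        table.count[length]++;
    }
    table.count[0] = 0;

    // offset[len] is the index in `symbols` of the first symbol of that length.
    std::array<std::uint16_t, max_code_length + 2> offset {};
    for (int len = 1; len <= max_code_length; ++len)
        offset[len + 1] = offset[len] + table.count[len];

    table.symbols.resize(offset[max_code_length + 1]);
    for (std::size_t symbol = 0; symbol < lengths.size(); ++symbol) {
        if (lengths[symbol] != 0)
            table.symbols[offset[lengths[symbol]]++] = static_cast<std::uint16_t>(symbol);
    }
    return table;
}

// Reads one bit at a time and does a single count[] lookup per bit.
template<typename NextBit>
std::optional<std::uint16_t> decode_symbol(const CanonicalTable& table, NextBit&& next_bit)
{
    int code = 0;  // bits read so far, most significant bit first
    int first = 0; // first canonical code of the current length
    int index = 0; // index in `symbols` of the first symbol of the current length
    for (int len = 1; len <= max_code_length; ++len) {
        code |= next_bit() ? 1 : 0;
        int count = table.count[len];
        if (code - count < first)
            return table.symbols[index + (code - first)];
        index += count;
        first += count;
        first <<= 1;
        code <<= 1;
    }
    return std::nullopt; // no code of any length matched
}

// Convenience helper for experiments: bits are given as a string of '0'/'1'.
std::optional<std::uint16_t> decode_from_string(const CanonicalTable& table, const std::string& bits)
{
    std::size_t position = 0;
    return decode_symbol(table, [&] {
        return position < bits.size() && bits[position++] == '1';
    });
}
```

Using the code lengths (3, 3, 3, 3, 3, 2, 4, 4) from the example in RFC 1951, the bit string "00" decodes to symbol 5 and "1110" decodes to symbol 6.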

asynts added 9 commits September 5, 2020 18:04

LibCompress: Replace ASSERT_NOT_REACHED with set_fatal_error.

88c3dff

We shouldn't assert that the input file is valid.

AK: Add log stream operator overload for Span.

0c5394e

AK: Add Buffered<T> which wraps a stream, adding input buffering.

ad4d5f7

Userland: Use Buffered<T> in gunzip.

54db1c2

LibCore: FileStream.h: Fix infinite loop when trying to read past end-of-file.

8f46940

Deflate: Fix deadly typo.

a9ab75d

LibCompress: Simplify logic in deflate implementation.

5269df4

awesomekling merged commit 4c317a9 into SerenityOS:master Sep 6, 2020

asynts deleted the buffered branch September 6, 2020 11:12

nico mentioned this pull request Sep 9, 2024

LibCompress: Speed up CanonicalCode::read_symbol() slow path #25008

Merged
