Deframer implementation resizes buffer when not necessary #211

bugreport1234 · 2019-01-08T18:05:21Z

Hello,
I'm using this library in a project. First of all, thanks, this library has made working with TLS very straightforward for me.

I'm using this library in the context of non-blocking I/O. Due to the specifics of my project I'm unable to use an OS polling library like mio, and instead manually check sockets for readiness in a loop, sleeping for a short period between iterations. This means that the vast majority of read calls return WouldBlock. This has worked well with unencrypted TcpStreams, however once I implemented an encrypted version with this library I started seeing unreasonable CPU usage even though the connection was idle.

I've been able to modify the simpleclient.rs example to serve as a minimal testcase: https://pastebin.com/S0SPraDN. Executing this code results in consistent CPU usage of ~2% for me.

Upon investigation, it appears the code in deframer.rs#L45 first resizes the deframer buffer to maximum length, attempts a read, and truncates the buffer back to its original size if an error occurred. It seems these operations were the cause of the CPU usage I was seeing, possibly because of the allocation size of >18KB. Unfortunately these operations execute in almost every iteration of my loop, due to the returned WouldBlock from the underlying socket.

I've been able to confirm that these operations are indeed the cause of the CPU usage through a workaround. If you insert the following at the start of the read function the CPU usage drops back to 0%:

let mut empty_buf = [0; 0];
rd.read(&mut empty_buf)?;

The code attempts to read to an empty buffer which means it shouldn't affect the functionality of the success case. In the error case, the function immediately returns without getting to the resizing part, thus avoiding the costly operations.

However I feel like it should be possible to solve this without a workaround like this one. From PR #9 and PR #11 I gather that the current behavior and the large buffer size were introduced to avoid too many read calls to make reading more efficient. Perhaps std::io::BufReader could be used here, which seems to have been made to solve this problem, and appears not to be using resize internally.
However, I have no experience with either TLS nor BufReader internals, so I apologize if this would not be a workable solution.

The text was updated successfully, but these errors were encountered:

ctz · 2019-04-21T17:00:33Z

Thanks for the report and repro program. I think BufReader would work here, but only once rust-lang/rust/pull/49139 is landed. In the mean time, I'll keep the vec max length and track the valid prefix here.

ctz · 2019-04-22T20:35:16Z

before:

read_tls with EWOULDBLOCK                                                                             
                        time:   [1.0132 us 1.0180 us 1.0228 us]

after:

read_tls with EWOULDBLOCK                                                                             
                        time:   [10.609 ns 10.668 ns 10.729 ns]
                        change: [-98.966% -98.955% -98.943%] (p = 0.00 < 0.05)
                        Performance has improved.

ctz closed this as completed Apr 22, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deframer implementation resizes buffer when not necessary #211

Deframer implementation resizes buffer when not necessary #211

bugreport1234 commented Jan 8, 2019

ctz commented Apr 21, 2019

ctz commented Apr 22, 2019

Deframer implementation resizes buffer when not necessary #211

Deframer implementation resizes buffer when not necessary #211

Comments

bugreport1234 commented Jan 8, 2019

ctz commented Apr 21, 2019

ctz commented Apr 22, 2019