Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation changes to BufWriter #78551

Closed
wants to merge 15 commits into from

Conversation

Lucretiel
Copy link
Contributor

@Lucretiel Lucretiel commented Oct 30, 2020

This PR contains some proposed implementation updates to BufWriter, with a focus on reducing total device write calls in the average case. These updates are based on lessons learned from the design of LineWriterShim, as well as some discussions with @workingjubilee on this topic.

There are three main changes to how BufWriter works:

  • write_all now fully fills the buffer before flushing. Previously, it would fill the buffer as much as possible without splitting any incoming writes. However, we assume that a caller of write_all is foremost interested in minimizing write calls to the underlying device, which we can best achieve by maximizing the size of buffers we flush. Of course, incoming data that exceeds the total size of the buffer is still forwarded directly to the underlying device.
    • Conversely, we assume that a caller of write would prefer to have unbroken writes if possible (that is, would prefer Ok(n) where n == buf.len()). The implementation of write is therefore left unchanged with regard to buffer filling.
  • write_vectored is unchanged when inner.is_write_vectored(). However, in the case where the underlying device doesn't offer any specialization, we now take care to buffer together all the incoming sub-bufs, even if the total size exceeds our buffer, and only fall back to directly forwarded writes in the case that an individual slice exceeds our buffer size. As a result, BufWriter now also unconditionally returns true for is_write_vectored.
  • Added flush_buf_vectored(), a new internal method that is similar to flush_buf, but additionally attempts to use vectored operations to send the new incoming data along with the existing buffered data in a single operation. I took care to ensure that this never results in more system calls than it would have already; in particular, it immediately stops forwarding as soon as the existing buffer is fully forwarded, even if 0 new bytes have been sent. write and write_all were refactored to take advantage of it, and write_vectored was slightly modified to forward to write if exactly 1 buffer is being written.
  • Additionally, as a much more minor change, removed several unnecessary uses of the self.panicked guard fences. BufWriter::panicked is about preventing duplicate writes of buffered data, so it isn't necessary to fence writes to the underlying device when the buffer is known to be empty.

Follow up items

Tasks

  • Test the new behavior

Open questions

Stuff I want to call out and ensure is addressed in review:

  • Do the assumptions underpinning the BufWriter changes make sense.
  • Would it make sense for BufWriter to unconditionally return true for is_write_buffered, since it specializes vectored writes by buffering them together, even when the underlying device offers no such specialization?

This was split off to a separate PR from #78515

Changes to some BufWriter Write methods, with a focus on reducing total
device write calls by fully filling the buffer in some cases
@rust-highfive
Copy link
Collaborator

r? @joshtriplett

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 30, 2020
@Lucretiel Lucretiel mentioned this pull request Oct 30, 2020
9 tasks
@camelid camelid added T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Oct 30, 2020
library/std/src/io/buffered/bufwriter.rs Outdated Show resolved Hide resolved
library/std/src/io/buffered/bufwriter.rs Outdated Show resolved Hide resolved
library/std/src/io/buffered/bufwriter.rs Outdated Show resolved Hide resolved
library/std/src/io/buffered/bufwriter.rs Outdated Show resolved Hide resolved
library/std/src/io/buffered/bufwriter.rs Outdated Show resolved Hide resolved
@m-ou-se
Copy link
Member

m-ou-se commented Oct 30, 2020

Do the assumptions underpinning the BufWriter changes make sense.

I'm slightly worried about splitting things from write_all when the data could fit in the buffer if it was flushed first.

The documentaion here says:

This method will continuously call write until there is no more data to be written or an error of non-ErrorKind::Interrupted kind is returned.

Strictly speaking it doesn't specify that it will call write on the whole buffer, but I think that's what it implies. It's not a weird assumption to make that a write_all() to a BufWriter with less than .capacity() bytes will be not be split. It might be fine to break this assumption, but right now there's no way to avoid it with write_all, as BufWriter does not expose anything like a write_out_the_internal_buffer() function that makes room for the next write_all() without also calling flush() on the underlying buffer.


Would it make sense for BufWriter to unconditionally return true for is_write_buffered, since it specializes vectored writes by buffering them together, even when the underlying device offers no such specialization?

The documentation says:

Determines if this Writeer has an efficient write_vectored implementation.

If a Writeer does not override the default write_vectored implementation, code using it may want to avoid the method all together and coalesce writes into a single buffer for higher performance.

So I'd say yes, this write_vectored implementation qualifies. You'd not want a user of BufWriter to manually buffer things to call write instead of write_vectored.

- BufWriter now makes use of vectored writes when flushing; it attempt to
write both buffered data and incoming data in a single operation when it
makes sense to do so.
- LineWriterShim takes advantage of BufWriter's new "vectored flush"
operation in a few places
- Fixed a failing test. No new tests yet.
- Added BufWriter::available; used it to simplify various checks
- Refactored write to make it more simple; extensively recommented it
- Fixed bugs in write implementation; decompressed it to make the flow
more clear.
- Replaced several uses of .capacity() with .available(); in these cases
they're identical (because they always occur after a completed flush)
but available makes more sense conceptually.
@Lucretiel
Copy link
Contributor Author

I'm slightly worried about splitting things from write_all when the data could fit in the buffer if it was flushed first.

Yeah, I went back on forth on this. The rationale I settled on is that, while write tries as much as is reasonable to not split the incoming buffer, a caller of write_all is probably more interested in all the data being processed as efficiently as possible, since it's known that write_all will loop as much as it needs to to ingest all the given data.

So I'd say yes, this write_vectored implementation qualifies. You'd not want a user of BufWriter to manually buffer things to call write instead of write_vectored.

Done!

@Lucretiel
Copy link
Contributor Author

I added a section to the PR description, but I wanted to call it out in a new comment: I revised the PR so that write and write_all now take advantage of vectored writes when possible. Basically they'll pair the buffer and the incoming bytes into a vectored write when flushing.

Comment on lines +455 to +459
if let Some(buf) = only_one(bufs, |b| !b.is_empty()) {
// If there's exactly 1 incoming buffer, `Self::write` can make
// use of self.inner.write_vectored to attempt to combine flushing
// the existing buffer with writing the new one.
self.write(buf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this special case is worth optimizing for. write_vectored is used when there are multiple slices to write, and all the other paths handle occasionally empty slices correctly. The added branching here adds overhead to the most typical case of multiple non-empty slices.

Copy link
Contributor Author

@Lucretiel Lucretiel Nov 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I'll see if I can whip up a benchmark; my instinct is that branch prediction means that this overhead will be completely negligible (since, as you said, most of the time there'll be several inputs.)

The major point of differentiation between write and write_vectored here doesn't have anything to do with empty slices; it's that write gets to use flush_buf_vectored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth a lot on this, but I'm coming around to getting rid of it since, yeah, callers will likely end up calling the correct method. I've reviewed my own use of write_vectored in this PR (flush_buf_vectored etc) and confirmed that this usually ends up being the case. I'd love to get another opinion though (@m-ou-se?)

library/std/src/io/buffered/bufwriter.rs Show resolved Hide resolved
@Dylan-DPC-zz
Copy link

@Lucretiel any updates on this (i know it is waiting on review but if you can address the last review before we get this reviewed by @joshtriplett it would be better)

@Lucretiel
Copy link
Contributor Author

Yeah, my apologies, I've been busy with other stuff and honestly the election kind of took over all of my free cognitive space. I can follow up with this at least by this weekend.

@Lucretiel
Copy link
Contributor Author

Strictly speaking it doesn't specify that it will call write on the whole buffer, but I think that's what it implies. It's not a weird assumption to make that a write_all() to a BufWriter with less than .capacity() bytes will be not be split. It might be fine to break this assumption, but right now there's no way to avoid it with write_all, as BufWriter does not expose anything like a write_out_the_internal_buffer() function that makes room for the next write_all() without also calling flush() on the underlying buffer.

The main reason I'm willing to break them up in this case is that write_all has always expressed to me "do whatever it takes to write the entire buffer". In particular, because write_all freely retries writes, it makes sense to me that a caller would be fine with the writes being split if the underlying system does so.

Additionally, by far the most common caller of write_all is going to be write! and friends, and in that case I'd assume the caller is especially uninterested in non-split writes, since it's going to end up doing several tiny writes anyway.

@Lucretiel
Copy link
Contributor Author

Found a corner-case bug in this implementation, please don't merge until I update & write a regression test

- Found and fixed a bug where write_vectored could, in some
circumstances, forward a write directly to the inner writer (skipping
the buffer) without first flushing the buffer.
- Added a regression test for this bug.
@Lucretiel
Copy link
Contributor Author

Fixed

@bors
Copy link
Contributor

bors commented Dec 9, 2020

☔ The latest upstream changes (presumably #78768) made this pull request unmergeable. Please resolve the merge conflicts.

Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

@tgnottingham
Copy link
Contributor

I'm working on some changes that may significantly improve BufWriter performance, incidentally as part of using BufWriter more heavily in the compiler. The changes are simple, but may be going in a different direction from this PR. They may be reconcilable though. I'll try to review and comment when I have a chance.

@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 11, 2021
@Dylan-DPC-zz
Copy link

@Lucretiel any updates?

@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 8, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 1, 2021
@Dylan-DPC-zz
Copy link

thanks for taking the time to contribute. I have to close this due to inactivity. If you wish and you have the time you can open a new PR with these changes and we'll take it from there. Thanks

@Dylan-DPC-zz Dylan-DPC-zz added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.