BufReader should provide a way to seek without dumping the buffer #31100

Closed
taralx opened this issue Jan 22, 2016 · 37 comments · Fixed by #82992
Labels
C-feature-accepted Category: A feature request that has been accepted pending implementation. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@taralx
Contributor

taralx commented Jan 22, 2016

BufReader's seek() implementation always dumps the buffer. This hurts performance when users want to skip a variable amount of buffered data: only BufReader really knows whether it's reasonable to just move the pointer.

Additionally, only BufReader can know whether the pointer can be moved backwards, so there's absolutely no cheap way to "unget" data from BufReader, even if you know it's seekable.

I'd recommend removing this behavior and moving it to unwrap() or another method (sync?), but it's now baked into a stable API.

What are the options now?

@taralx
Contributor Author

taralx commented Jan 22, 2016

(I'm in the process of writing my own BufReader implementation that doesn't do this.)

@neokril

neokril commented Jan 27, 2016

Currently BufReader saves only the offset within the buffered data.
As I understand it, to keep the buffer during a seek it would need to save a "global offset" within the inner reader and update it on every read/seek operation.
It needs this "global offset" because a seek operation should return the new position in the underlying reader, which is not currently saved anywhere.

Another variant is to add a separate function (e.g. buffered_seek) that uses only buffered data, moves only relative to the current position, and returns true/false: true if it was able to move the current position inside the buffer; false if nothing happened and the user needs to make a separate call to "seek" to actually move the pointer.
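For illustration, a forward-only version of that idea can be sketched on top of the public `BufReader::buffer()` and `BufRead::consume()` methods. The name `buffered_skip` is hypothetical and not part of std; moving backwards inside the buffer is not possible through this public API, so this sketch only covers skipping ahead.

```rust
use std::io::{BufRead, BufReader, Read};

/// Hypothetical helper (not part of std): skip `n` bytes only if they are
/// already buffered. Returns true on success; on false nothing has changed
/// and the caller has to fall back to a real seek or read.
fn buffered_skip<R: Read>(reader: &mut BufReader<R>, n: usize) -> bool {
    if n <= reader.buffer().len() {
        // Advance the read pointer inside the buffer; no I/O happens here.
        reader.consume(n);
        true
    } else {
        false
    }
}

fn main() -> std::io::Result<()> {
    let data: Vec<u8> = (0u8..32).collect();
    let mut reader = BufReader::with_capacity(16, &data[..]);

    let mut byte = [0u8; 1];
    reader.read_exact(&mut byte)?; // fills the buffer and consumes byte 0

    // Bytes 1..=4 are buffered, so the skip succeeds and the next byte is 5.
    assert!(buffered_skip(&mut reader, 4));
    reader.read_exact(&mut byte)?;
    assert_eq!(byte[0], 5);

    // 100 bytes are not buffered: the helper refuses and changes nothing.
    assert!(!buffered_skip(&mut reader, 100));
    Ok(())
}
```

Note that this helper only needs `R: Read`, since it never touches the underlying reader.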

@taralx
Contributor Author

taralx commented Jan 27, 2016

I'm not trying to avoid seeking the underlying stream, I'm trying to avoid dumping the buffer. In my implementation, small seeks that fit in the existing buffer cause a call to self.inner.seek(SeekFrom::Current(0)).

@taralx taralx changed the title BufReader's seek implementation is somewhat irritating for performance BufReader should provide a way to seek without dumping the buffer Jan 27, 2016
@neokril

neokril commented Jan 28, 2016

@taralx Could you please share your implementation?

You use self.inner.seek(SeekFrom::Current(0)) to get current file offset. Am I correct?

@taralx
Contributor Author

taralx commented Jan 28, 2016

Not without getting source approval from my employer, sorry.

Yes, I implement pub fn pos(&mut self) -> io::Result<u64> and then in seek I have snippets like:

let p = self.pos()?;
self.pos -= back;
return Ok(p - back);

@gcarq

gcarq commented Oct 22, 2016

Some time ago I had a similar requirement and extracted the implementation into its own library: seek_bufread. Maybe it's useful for you.

@steveklabnik steveklabnik added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed A-libs labels Mar 24, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-feature-request Category: A feature request, i.e: not implemented / a PR. label Jul 24, 2017
@lolgesten

I stumbled upon this today. When decoding mp4 movies, the decoder will do small seeks forward to skip in the region of 100 bytes at a time. I used a BufReader with 65k capacity to avoid my underlying reader doing http requests for every seek, only to find that BufReader isn't helping at all.

I think BufReader could be more useful if it also served seeks from the buffer when the target fits within it.

@dtolnay dtolnay added C-feature-accepted Category: A feature request that has been accepted pending implementation. and removed C-feature-request Category: A feature request, i.e: not implemented / a PR. labels Nov 18, 2017
@dtolnay
Member

dtolnay commented Nov 18, 2017

I would like to see a PR that implements seek(SeekFrom::Current(n)) without dumping the buffer if the offset n is within the buffered data. I may be missing something but I don't think BufReader would need to store a global offset in order to make this work.
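To make the difference concrete, here is a small sketch comparing `seek(SeekFrom::Current(n))` with `BufReader::seek_relative(n)`, the method that eventually came out of this issue. The `Counting` wrapper and the `inner_reads` function are hypothetical names used only for this illustration; they count how often the underlying reader is actually asked for data.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek, SeekFrom};

/// Hypothetical wrapper that counts read calls reaching the underlying reader.
struct Counting<R> {
    inner: R,
    reads: usize,
}

impl<R: Read> Read for Counting<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.reads += 1;
        self.inner.read(buf)
    }
}

impl<R: Seek> Seek for Counting<R> {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        self.inner.seek(pos)
    }
}

/// Reads one byte, skips `skip` bytes, reads another byte, and returns that
/// byte together with the number of reads that hit the underlying reader.
fn inner_reads(skip: i64, keep_buffer: bool) -> io::Result<(u8, usize)> {
    let data: Vec<u8> = (0u8..=255).collect();
    let source = Counting { inner: Cursor::new(data), reads: 0 };
    let mut reader = BufReader::with_capacity(64, source);

    let mut byte = [0u8; 1];
    reader.read_exact(&mut byte)?; // fills the 64-byte buffer
    if keep_buffer {
        reader.seek_relative(skip)?; // stays inside the buffer when possible
    } else {
        reader.seek(SeekFrom::Current(skip))?; // always discards the buffer
    }
    reader.read_exact(&mut byte)?;
    Ok((byte[0], reader.get_ref().reads))
}

fn main() -> io::Result<()> {
    println!("seek_relative: {:?}", inner_reads(10, true)?);
    println!("seek:          {:?}", inner_reads(10, false)?);
    Ok(())
}
```

Both variants land on the same byte; the difference is that the plain `seek` forces a second read from the underlying reader, which is exactly the cost discussed in this thread.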

bors added a commit that referenced this issue Jan 14, 2018
BufRead: Only flush the internal buffer if seeking outside of it.

Fixes #31100

r? @dtolnay
@taralx
Contributor Author

taralx commented Jan 14, 2018

FWIW, this solution requires R:Seek. A try_seek_relative function would not have such a requirement.

@SimonSapin
Contributor

SimonSapin commented Mar 16, 2018

This should not be closed, the new method is still unstable. It’s been a couple months though:

@rfcbot fcp merge

@SimonSapin SimonSapin reopened this Mar 16, 2018
@alexcrichton
Member

@SimonSapin to confirm, this is for the seek_relative method, right?

@SimonSapin
Contributor

Yes, that is the one item with issue = "31100".

@SimonSapin
Contributor

Hmm, maybe I needed to reopen the issue before making the command:

@rfcbot fcp merge

@rfcbot rfcbot added the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Mar 17, 2018
@rfcbot

rfcbot commented Mar 17, 2018

Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Mar 18, 2018
@rfcbot

rfcbot commented Mar 18, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Mar 18, 2018
@rfcbot

rfcbot commented Mar 28, 2018

The final comment period is now complete.

@taralx
Contributor Author

taralx commented Apr 20, 2019

No, the major issue is one of compatibility guarantees. Right now, if you want to sync the BufReader's position with the underlying Reader and extract it, you seek and then into_inner. So that pattern needs to continue to work.
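The pattern in question can be sketched as follows. The function name `sync_and_extract` and the buffer sizes are assumptions chosen for the example; the calls themselves (`seek`, `get_ref`, `into_inner`, `Cursor::position`) are regular std APIs.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek, SeekFrom};

/// Reads `n` bytes through a BufReader, then syncs and extracts the inner
/// reader. Returns the inner position before syncing and the logical
/// position reported by seek().
fn sync_and_extract(n: usize) -> io::Result<(u64, u64)> {
    let data: Vec<u8> = (0u8..100).collect();
    let mut reader = BufReader::with_capacity(32, Cursor::new(data));

    let mut scratch = vec![0u8; n];
    reader.read_exact(&mut scratch)?;

    // The inner cursor has run ahead of the caller, because BufReader
    // filled its whole buffer in one go.
    let unsynced = reader.get_ref().position();

    // seek() discards the buffer and moves the inner reader back to the
    // position the caller has logically reached.
    let logical = reader.seek(SeekFrom::Current(0))?;

    // After the seek, the extracted reader is at that logical position.
    let inner = reader.into_inner();
    assert_eq!(inner.position(), logical);
    Ok((unsynced, logical))
}

fn main() -> io::Result<()> {
    let (unsynced, logical) = sync_and_extract(4)?;
    println!("inner before sync: {unsynced}, after sync: {logical}");
    Ok(())
}
```

This is why `seek` itself cannot simply stop discarding the buffer: code like the above relies on the inner reader being repositioned.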

@funchal

funchal commented Apr 20, 2019

Can't we just leave seek alone then and add a separate api seek2 or whatever which has the little enhancement of not losing the buffer when you seek within it?

@sfackler
Member

That already happened: https://doc.rust-lang.org/std/io/struct.BufReader.html#method.seek_relative

@taralx
Contributor Author

taralx commented Apr 20, 2019

I've been advocating for try_seek_relative because the existing method requires R: Seek where that's not necessary for a BufReader.

@czipperz
Contributor

@taralx do you want to write up a PR? I doubt it would be very difficult. Just refactor seek_relative's code that already does this out into the method?

@czipperz
Contributor

I do find it mildly ironic that the original implementation explicitly documented that it would always flush the buffer and why.

@taralx
Contributor Author

taralx commented May 26, 2019

Life is a little busy right now, so if someone else is excited to do it, go for it. Otherwise I'll see if I can't find some time to put together a PR.

@fintelia
Contributor

fintelia commented Nov 18, 2019

@SimonSapin it seems that there was a final comment period for this in Mar 2018 that passed with no objections. Does that now mean there can be a stabilization PR?

@t-rapp
Contributor

t-rapp commented Nov 18, 2019

At least for my use-case of parsing media files I'm still relying on the seek_bufread crate instead of using seek_relative because there seems to be no way in stdlib BufReader to get the current position without dumping the buffer.

@maxburke

maxburke commented Feb 5, 2020

One of the challenges I have been finding with BufReader is that the stream_position convenience method is implemented with seek(SeekFrom::Current(0)) which dumps the buffer.

Is it possible to have this case specialized so that getting the current position doesn't cause the buffer to be thrown away?

@JohnTitor JohnTitor added the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label Feb 28, 2020
@JohnTitor JohnTitor removed the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label Mar 18, 2020
@alecmocatta
Contributor

I'm not entirely clear on the status of BufReader::seek_relative. It was implemented, this issue was denoted as the tracking issue, and there was a successful FCP, but it's still unstable? What was the FCP for?

bors added a commit to rust-lang-ci/rust that referenced this issue Sep 7, 2020
…rtodt

Implement Seek::stream_position() for BufReader

Optimization over `BufReader::seek()` for getting the current position without flushing the internal buffer.

Related to rust-lang#31100. Based on the code in rust-lang#70577.
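A minimal sketch of what this change means for callers, assuming a toolchain where `Seek::stream_position` is stable: the position can be queried while the buffered data stays available. The function name `position_without_flush` is hypothetical.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek};

/// Reads four bytes, asks for the current position, and returns that
/// position together with the number of bytes still held in the buffer.
fn position_without_flush() -> io::Result<(u64, usize)> {
    let data: Vec<u8> = (0u8..100).collect();
    let mut reader = BufReader::with_capacity(32, Cursor::new(data));

    let mut scratch = [0u8; 4];
    reader.read_exact(&mut scratch)?; // buffers 32 bytes, consumes 4

    // Reports the logical position without discarding the buffer.
    let pos = reader.stream_position()?;
    Ok((pos, reader.buffer().len()))
}

fn main() -> io::Result<()> {
    let (pos, buffered) = position_without_flush()?;
    println!("position {pos}, {buffered} bytes still buffered");
    Ok(())
}
```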
@Diggsey
Contributor

Diggsey commented Oct 25, 2020

@alecmocatta someone still needs to open a PR to mark the method as stable.

@elomatreb
Contributor

As someone unfamiliar with the processes of Rust development, is there a chance of getting this into one of the upcoming releases?

@workingjubilee
Member

The relevant PR passed FCP, has been approved to merge, and this should become stable in 1.53.0 if all goes well.

@bors bors closed this as completed in d4d7ebf Apr 13, 2021
TaKO8Ki added a commit to TaKO8Ki/rust that referenced this issue Nov 18, 2023
…k-Simulacrum

Add Seek::seek_relative

The `BufReader` struct has a `seek_relative` method because its `Seek::seek` implementation involved dumping the internal buffer (rust-lang#31100).

Unfortunately, there isn't really a good way to take advantage of that method in generic code. This PR adds the same method to the main `Seek` trait with the straightforward default method, and an override for `BufReader` that calls its implementation.

_Also discussed in [this](https://internals.rust-lang.org/t/add-seek-seek-relative/19546) internals.rust-lang.org thread._
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Nov 18, 2023
Rollup merge of rust-lang#116750 - fintelia:seek_seek_relative, r=Mark-Simulacrum

Add Seek::seek_relative

The `BufReader` struct has a `seek_relative` method because its `Seek::seek` implementation involved dumping the internal buffer (rust-lang#31100).

Unfortunately, there isn't really a good way to take advantage of that method in generic code. This PR adds the same method to the main `Seek` trait with the straightforward default method, and an override for `BufReader` that calls its implementation.

_Also discussed in [this](https://internals.rust-lang.org/t/add-seek-seek-relative/19546) internals.rust-lang.org thread._
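
As a rough illustration of the generic use case that PR describes, the sketch below calls `seek_relative` through the `Seek` trait, so the same helper works for a plain `Cursor` and for a `BufReader`. It assumes a toolchain where `Seek::seek_relative` is stable (Rust 1.80 or later, as far as I know); `skip_header` and `byte_after_skip` are hypothetical names.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek};

/// Hypothetical generic helper: skip `header_len` bytes on any seekable
/// stream. For BufReader this resolves to its buffer-preserving override.
fn skip_header<S: Seek>(stream: &mut S, header_len: i64) -> io::Result<()> {
    stream.seek_relative(header_len)
}

/// Skips `skip` bytes and returns the next byte from the stream.
fn byte_after_skip<S: Read + Seek>(mut stream: S, skip: i64) -> io::Result<u8> {
    skip_header(&mut stream, skip)?;
    let mut byte = [0u8; 1];
    stream.read_exact(&mut byte)?;
    Ok(byte[0])
}

fn main() -> io::Result<()> {
    let data: Vec<u8> = (0u8..100).collect();

    // The same generic code path serves both an unbuffered and a buffered reader.
    let plain = byte_after_skip(Cursor::new(data.clone()), 3)?;
    let buffered = byte_after_skip(BufReader::new(Cursor::new(data)), 3)?;
    assert_eq!(plain, buffered);
    Ok(())
}
```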