Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File::read_to_end's unexpected performance on Windows #110650

Closed
ducktherapy opened this issue Apr 21, 2023 · 2 comments · Fixed by #110655
Closed

File::read_to_end's unexpected performance on Windows #110650

ducktherapy opened this issue Apr 21, 2023 · 2 comments · Fixed by #110655
Labels
C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. O-windows Operating system: Windows T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@ducktherapy
Copy link

I came across an unexpected performance degradation on Windows when using File::read_to_end(&mut self, buf: &mut Vec<u8>): the larger the passed in buf Vec's capacity, the longer the call takes—regardless of actually read bytes.

let mut buffer = Vec::with_capacity(BUFFER_SIZE);
let mut file = File::open("some_1kb_file.txt").expect("opening file");

let metadata = file.metadata().expect("reading metadata");
let len = metadata.len();
assert!(len == 1024);

file.read_to_end(&mut buffer).expect("reading file");

With the above code, increasing BUFFER_SIZE will linearly increase the runtime of read_to_end, even if we're always reading 1024 bytes.

This doesn't seem to happen on other OSes. At first I assumed I was doing something wrong, but with help from folks at StackOverflow we realized most of the time is spent in NtReadFile. One can get around this by using file.read_exact(&mut buffer[0..len]) or file.take(len).read_to_end() instead.

I don't know what the implication of querying for the file size and using it in read_to_end would have, or if it'd be possible to get around this another way, but I assume at the very worst we could have a warning in the documentation about this.

Meta

rustc 1.68.2 (9eb3afe9e 2023-03-27)
binary: rustc
commit-hash: 9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0
commit-date: 2023-03-27
host: x86_64-pc-windows-msvc
release: 1.68.2
LLVM version: 15.0.6
@ducktherapy ducktherapy added the C-bug Category: This is a bug. label Apr 21, 2023
@bstrie bstrie added O-windows Operating system: Windows T-libs Relevant to the library team, which will review and decide on the PR/issue. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Apr 21, 2023
@joboet
Copy link
Member

joboet commented Apr 21, 2023

As Reddit user John T. Erickson pointed out, this is also reproducible with C, so the issue probably lies with Windows. Still, maybe a clever workaround on Rust's side would be preferable?

@ChrisDenton
Copy link
Member

ChrisDenton commented Apr 21, 2023

Hm, it probably doesn't have to be that clever. The internal io::default_read_to_end could take an (optional) size hint that limits the maximum size of reads. The File methods already attempt to get a size in order to reserve buffer space (e.g. here).

@bors bors closed this as completed in 9de7d91 Apr 23, 2023
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Sep 22, 2024
…ChrisDenton

delay uncapping the max_read_size in File::read_to_end

In rust-lang#130600 (comment) I realized that we're likely still passing too-large buffers to the OS, at least once at the end.

Previous issues and PRs:
* rust-lang#110650
* rust-lang#110655
* rust-lang#118222

r? ChrisDenton
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Sep 22, 2024
…ChrisDenton

delay uncapping the max_read_size in File::read_to_end

In rust-lang#130600 (comment) I realized that we're likely still passing too-large buffers to the OS, at least once at the end.

Previous issues and PRs:
* rust-lang#110650
* rust-lang#110655
* rust-lang#118222

r? ChrisDenton
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Sep 22, 2024
…ChrisDenton

delay uncapping the max_read_size in File::read_to_end

In rust-lang#130600 (comment) I realized that we're likely still passing too-large buffers to the OS, at least once at the end.

Previous issues and PRs:
* rust-lang#110650
* rust-lang#110655
* rust-lang#118222

r? ChrisDenton
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Sep 22, 2024
Rollup merge of rust-lang#130670 - the8472:read-to-end-heuristics, r=ChrisDenton

delay uncapping the max_read_size in File::read_to_end

In rust-lang#130600 (comment) I realized that we're likely still passing too-large buffers to the OS, at least once at the end.

Previous issues and PRs:
* rust-lang#110650
* rust-lang#110655
* rust-lang#118222

r? ChrisDenton
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. O-windows Operating system: Windows T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants