Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend the BorTag GC to AllocIds #3103

Closed
wants to merge 14 commits into from
Closed

Conversation

saethlin
Copy link
Member

@saethlin saethlin commented Oct 4, 2023

As suggested in #3080 (comment)

We previously solved memory growth issues associated with the Stacked Borrows and Tree Borrows runtimes with a GC. But of course we also have state accumulation associated with whole allocations elsewhere in the interpreter, and this PR starts tackling those.

To do this, we expand the visitor for the GC so that it can visit a BorTag or an AllocId. Then when the GC runs, we use the reachable set of AllocIds to remove elements from the intptrcast state.

src/tag_gc.rs Outdated Show resolved Hide resolved
@saethlin saethlin marked this pull request as ready for review October 8, 2023 20:41
@saethlin saethlin added the S-waiting-on-review Status: Waiting for a review to complete label Oct 9, 2023
src/tag_gc.rs Outdated Show resolved Hide resolved
src/tag_gc.rs Outdated Show resolved Hide resolved
src/intptrcast.rs Outdated Show resolved Hide resolved
src/intptrcast.rs Outdated Show resolved Hide resolved
src/tag_gc.rs Outdated Show resolved Hide resolved
src/tag_gc.rs Outdated Show resolved Hide resolved
tests/pass/ptr_int_casts.rs Outdated Show resolved Hide resolved
@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2023

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Oct 9, 2023
bors added a commit that referenced this pull request Oct 19, 2023
intptrcast: remove information about dead allocations

The discussion in #3103 convinced me we don't have to keep `int_to_ptr_map` around for dead allocations. But we should not make that depend on the GC, we can just tie it to when the allocation gets freed. That means everything still behaves deterministically, if anything weird happens (but it shouldn't).

r? `@saethlin`
Only the first and last commit contain logic changes, the 2nd commit just moves code around a bit.
@bors
Copy link
Contributor

bors commented Oct 19, 2023

☔ The latest upstream changes (presumably #3122) made this pull request unmergeable. Please resolve the merge conflicts.

@saethlin
Copy link
Member Author

This isn't ready, I'm still thinking about your two comments I didn't mark as resolved.

RalfJung pushed a commit to RalfJung/rust that referenced this pull request Oct 21, 2023
intptrcast: remove information about dead allocations

The discussion in rust-lang/miri#3103 convinced me we don't have to keep `int_to_ptr_map` around for dead allocations. But we should not make that depend on the GC, we can just tie it to when the allocation gets freed. That means everything still behaves deterministically, if anything weird happens (but it shouldn't).

r? `@saethlin`
Only the first and last commit contain logic changes, the 2nd commit just moves code around a bit.
@saethlin saethlin added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Nov 11, 2023
@RalfJung
Copy link
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Nov 12, 2023
@saethlin saethlin added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Nov 12, 2023
src/provenance_gc.rs Outdated Show resolved Hide resolved
src/provenance_gc.rs Outdated Show resolved Hide resolved
tests/utils/miri_extern.rs Outdated Show resolved Hide resolved
tests/utils/miri_extern.rs Outdated Show resolved Hide resolved
src/intptrcast.rs Outdated Show resolved Hide resolved
src/intptrcast.rs Outdated Show resolved Hide resolved
// live allocation IDs and all provenance in the allocation bytes, even if they are leaked.
// Here we exploit that `adjust_allocation` always returns `Owned`, to all
// tcx-managed allocations ever read or written will be copied in `alloc_map`.
self.memory.alloc_map().iter(|it| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have the secondary memory map in tcx. I guess we have to walk that one as well, judging from the test failures. The comment I suggested about not having to worry about tcx-managed allocation seems to be wrong.

In particular it is wrong for function and vtable allocations, which never get copied into the local alloc_map.

@saethlin saethlin added S-waiting-on-review Status: Waiting for a review to complete S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-author Status: Waiting for the PR author to address review comments S-waiting-on-review Status: Waiting for a review to complete labels Nov 16, 2023
@saethlin
Copy link
Member Author

CI has been running for an hour. That's not good. I'll look into it.

@saethlin
Copy link
Member Author

saethlin commented Nov 17, 2023

The slow part of the CI is inserting all the live AllocIds into the reachable set. Aside from asking the interpreter to track a FxHashSet<AllocId> for us then hand it to the GC via clone I'm not sure what else to do other than run the GC less often, but the CI is slow because we're running the GC super often. Maybe trim down the size of that test?

The cde that's causing the GC to lose track of a pointer is here, and it has a wild diagnostic:

error: Undefined Behavior: using alloc850+0xfffffffffffffc91 as function pointer but it does not point to a function
  --> /home/ben/.cargo/registry/src/index.crates.io-6f17d22bba15001f/getrandom-0.1.16/src/macos.rs:21:32
   |
LL |             let ret = unsafe { func(chunk.as_mut_ptr(), chunk.len()) };
   |                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ using alloc850+0xfffffffffffffc91 as function pointer but it does not point to a function

https://github.com/rust-random/getrandom/blob/bd14b7a243c7cda78f698742a84d1a0bbfcba972/src/macos.rs#L14-L21

type GetEntropyFn = unsafe extern "C" fn(*mut u8, libc::size_t) -> libc::c_int;

pub fn getrandom_inner(dest: &mut [u8]) -> Result<(), Error> {
    static GETENTROPY: Weak = unsafe { Weak::new("getentropy\0") };
    if let Some(fptr) = GETENTROPY.ptr() {
        let func: GetEntropyFn = unsafe { mem::transmute(fptr) };
        for chunk in dest.chunks_mut(256) {
            let ret = unsafe { func(chunk.as_mut_ptr(), chunk.len()) };

@RalfJung
Copy link
Member

RalfJung commented Nov 17, 2023

The slow part of the CI is inserting all the live AllocIds into the reachable set. Aside from asking the interpreter to track a FxHashSet for us then hand it to the GC via clone I'm not sure what else to do other than run the GC less often, but the CI is slow because we're running the GC super often. Maybe trim down the size of that test?

That test? We run all tests on Linux with "do GC after every step".

- name: Set the tag GC interval to 1 on linux
if: runner.os == 'Linux'
run: echo "MIRIFLAGS=-Zmiri-tag-gc=1" >> $GITHUB_ENV

And that seems to be good, we caught a bug?

For performance, here's one idea: we don't actually need the set. We need a function fn(AllocId) -> bool that tells us whether the AllocId is used anywhere. So instead of adding all live allocations to the set, we can make that function a 3-way disjunction, where it checks the global alloc_map, the local alloc_map, and the collected set. And then we don't need to put the live allocations into the set. That might help?

@RalfJung
Copy link
Member

The cde that's causing the GC to lose track of a pointer is here, and it has a wild diagnostic:

Ah, that's the old macOS getrandom doing strange integer pointer shenanigans. If this triggers a GC bug then the first thing to do is extract it into a dedicated test so that we don't have to rely on a pass-dep test to catch this. It should then also reproduce on all targets; the reason it's macOS-only is that getrandom uses different code for different targets.

@saethlin
Copy link
Member Author

Here's a minimization:

#![feature(rustc_private)]
extern crate libc;

extern "Rust" {
    pub fn miri_run_provenance_gc();
}

type GetEntropyFn = unsafe extern "C" fn(*mut u8, libc::size_t) -> libc::c_int;

fn main() {
    let name = "getentropy\0";
    let addr = unsafe { libc::dlsym(libc::RTLD_DEFAULT, name.as_ptr() as *const _) as usize };
    unsafe {
        miri_run_provenance_gc();
    }

    let ptr = addr as *mut libc::c_void;
    let func: GetEntropyFn = unsafe { std::mem::transmute(ptr) };
    let dest = &mut [0u8];
    unsafe { func(dest.as_mut_ptr(), dest.len()) };
}

Run with ./miri run -Zmiri-tag-gc=0 to get the error we get in CI.

@saethlin
Copy link
Member Author

saethlin commented Nov 17, 2023

Note that this code runs fine without explicitly invoking the provenance GC, because in the minimization there is no basic block end in the important region of the program. I always add code to run the GC on every machine step while debugging to avoid getting tripped up by this.

Lol that's not even right I forgot that the default is to run sparingly 🤦

@saethlin
Copy link
Member Author

@saethlin
Copy link
Member Author

I think this is going to be easier to do as a rustc PR, so that I can iterate on the interface between Miri and the interpreter core code. I'll set up that later today.

@RalfJung
Copy link
Member

RalfJung commented Nov 17, 2023

I think this is going to be easier to do as a rustc PR, so that I can iterate on the interface between Miri and the interpreter core code. I'll set up that later today.

However, rustc doesn't run all tests with -Zmiri-tag-gc=1.

(We should probably rename that flag in this PR as well...)

@RalfJung
Copy link
Member

Ah! I think the GC isn't picking up the contents of extra_fn_ptr_map: https://github.com/rust-lang/rust/blob/4d7f952a02d0bca67c98a6b74895b7e3fbe38341/compiler/rustc_const_eval/src/interpret/memory.rs#L106

Aha!

If we follow my proposal, the API we need is an efficient "is this AllocId live". So that's a single fn(&self, AllocId) -> bool, I think. Basically a more efficient version of get_alloc_info that doesn't compute the size. Maybe we just want that function to be split into two, one for size+align and one for AllocKind?

@saethlin
Copy link
Member Author

Subsumed by rust-lang/rust#118029

@saethlin saethlin closed this Nov 18, 2023
@saethlin saethlin deleted the allocid-gc branch May 27, 2024 22:10
spikespaz pushed a commit to spikespaz/dotwalk-rs that referenced this pull request Aug 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: Waiting for the PR author to address review comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants