Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add dynamic borrow checking for dereferencing NumPy arrays. #274

Merged
merged 1 commit into from
Mar 21, 2022

Conversation

adamreichold
Copy link
Member

@adamreichold adamreichold commented Feb 12, 2022

This PR tries to follow up on the position taken by the cxx crate: C++ or Python code is unsafe and therefore needs to ensure that relevant invariants are upheld. Safe Rust code can then rely on this for e.g. aliasing discipline as long as it does not introduce any memory safety violations itself.

Hence, this PR adds dynamic borrow checking which ensures that safe Rust code will uphold aliasing discipline as long as the Python does so as well. This does come with the cost of two accesses to a global hash table per dereference which is surely not negligible, but also constant, i.e. it does not increase with the number of array elements. It can also be avoided by calling the unsafe/unchecked variants of the accessors when it really is a performance issue.

While I think this PR solves #258 in a manner that is as safely as possible when interacting with unchecked language like Python, I would open another issue immediately after merging this into main to track refining the over-approximation of considering all views into the same base object as conflict to detect non-overlap and interleaving. But I prefer like to do this as separate PR to keep this reviewable.

Closes #258

src/borrow.rs Outdated
D: Dimension,
{
pub(crate) fn try_new(array: &'a PyArray<T, D>) -> Option<Self> {
let address = array as *const PyArray<T, D> as usize;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written in #258, it might actually make sense to follow the "base object chain" here and use the address of the final base object as the key into the hash table which would be able to safely detect quite a few aliasing issues on the Python side of the fence as well.

But it would also be an over-approximation that could sometimes fail unexpectedly if it does not consider e.g. splitting a large array into multiple non-overlapping slices which can then be borrowed separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like to find out if there are projects out there which break if the over-approximation, i.e. distinct borrows must have distinct root base objects, is taken. Of course, I would like to find out without releasing that code into the wild and waiting for the angry complaints. :-(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(There is of course another issue where this might not work properly, considering that base objects are not necessarily NumPy arrays themselves (e.g. our PySliceContainer), there is probably no real guarantee that distinct base objects mean non-overlapping backing memory, e.g. memory map base objects could be distinct while pointing into the same address range.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's practical for us to calculate the start and end addresses of the memory pointed at by the view? If we used a binary-search based structure to store all the these (start, end) tuples we may be able to detect overlaps reasonably efficiently.

(I would hope that the cost of doing these aliasing checks will work out as much cheaper than large array operations!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this sufficient though? Isn't possible that those ranges overlap even for non-overlapping array views due to non-unit strides? For example, have a large arrays with three RGB layers which gets sliced into three separate views where all have basically the same start-to-end range but never overlap?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I would hope that non-unit strides are the exception (so might be able to optimise for the common case). They may turn out to be very common though, e.g. 1-D slices of multidimensional arrays.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fear that the general case may be an instance of satisfiability and therefore NP: For two arrays A_1/2 with the same base object and data pointers p_1/2 and dimensionalities D_1,2 and dimensions d_{1/2,i} and strides s_{1/2,i}, we need to determine whether the equation

p_1 + s_1,1 * k_1 + ... + s_{1,D_1} k_{D_1} = p_2 +  s_2,1 * l_1 + ... + s_{2,D_2} * l_{D_2}

has integer solutions in the domain

k_i in {0, ..., d_{1,i} - 1} for i in {1, ..., D_1}
l_i in {0, ..., d_{2,i} - 1} for i in {1, ..., D_2}

Not sure whether there is any structure in this problem which makes this substantially easier...

@adamreichold
Copy link
Member Author

adamreichold commented Feb 12, 2022

This PR tries to follow up on the position taken by the cxx crate: C++ and Python code is unsafe and therefore needs to ensure that relevant invariants are upheld. Safe Rust code can then rely on this for e.g. aliasing discipline as long as it does not introduce any memory safety violations itself.

One thing that might make this more palatable is that I think that we basically are already in this position, at least w.r.t. thread safety: Python code can send references to arrays to other threads and there is nothing we can do to prevent it, i.e. we already assume that the Python code is written correctly. (Actually, touching the NPY_ARRAY_WRITEABLE without atomics on possibly multiple threads introduces a data race where there was none if those arrays are accessed for reading only.)

@davidhewitt
Copy link
Member

Thanks for this. To help get my head around this and confirm that we agree: what's the expectation on well-behaved Python code? I think it's that Python code should never write to an array while Rust holds a borrow to it? Are there other constraints too?

@adamreichold
Copy link
Member Author

Thanks for this. To help get my head around this and confirm that we agree: what's the expectation on well-behaved Python code? I think it's that Python code should never write to an array while Rust holds a borrow to it? Are there other constraints too?

I think this is basically it, but it is complicated by NumPy views being arrays themselves, e.g. a function implemented in Rust taking x for reading and y for writing should not be given two views which overlap in their backing array as this would mean that the ArrayView and ArrayViewMut we derive from those NumPy views would overlap. (Note that is is already a problem from which NPY_ARRAY_WRITEABLE does not protect us as as_array_mut does (and cannot if called first) care.)

Also note, that just having aliasing views lying around is not an issue AFAIU due the Rust compiler never seeing those and hence being unable to make any incorrect assumptions about aliasing for them.

The above problem could theoretically even be solved by us by following NumPy slices through to their base objects and including their offsets/dimensions in the dynamic borrow checking - i.e. #274 (comment) - and if we are able to implement this with reasonable efficiency we would really be down to "do not write/read arrays which Rust code has currently borrowed/mutably borrowed".

@adamreichold
Copy link
Member Author

Also note that this would imply that we need to remove all accessors providing &T into &mut T into the array, or at least putting them behind reference wrappers. (For example, as_slice is probably better served by the same method on the ArrayView yielded by as_array.)

@davidhewitt
Copy link
Member

If the accessors are unsafe, it may be fine to leave them? It would be desirable to avoid too many breaking changes for users. I'm wondering if there's some way we can leave the existing interface (mostly) untouched and add this dynamic borrow checking as an opt-in safer interface.

@adamreichold
Copy link
Member Author

If the accessors are unsafe, it may be fine to leave them? It would be desirable to avoid too many breaking changes for users. I'm wondering if there's some way we can leave the existing interface (mostly) untouched and add this dynamic borrow checking as an opt-in safer interface.

Yes, that's true. I just made a quick survey and besides as_cell_slice which we already dropped, I think all methods that create references into the array's interior are already unsafe. (Their documentation however does not always call out the aliasing issue though, but if we do this then a significant documentation overhaul is required in any case.)

One change that would still be required is that safe cloning methods like get_owned, to_owned_array, to_vec would need to take a short-lived borrow internally. This could be considered semantic breakage as they did not have this exclusion property before, but it would not stop code from compiling.

@adamreichold
Copy link
Member Author

adamreichold commented Feb 16, 2022

(Looking at the scheduling, I think that it is not sensible to target this to version 0.16 even if we decide to go for it. I rather think having that a quick rust-numpy release following PyO3 0.16 containing the accumulated maintenance work is preferable. Personally, I would also like to wait for @kngwyu's perspective before putting more work into this branch.)

@adamreichold
Copy link
Member Author

Thinking about this some more, I think that going for Deref-based transparent wrappers is probably not the best idea after all. We could probably have this as a separate object which exposes a safe as_array method just like PyReadonlyArray. I am actually leaning towards naming the two types PyReadonlyArray and PyReadwriteArray to stay consistent with the existing terminology and keep the breakage to a minimum. The only change for users would then be to use PyReadwriteArray in their method definitions instead of the unsafe as_array_mut. (Of course, the documentation around these types would still need to change significantly...)

@adamreichold adamreichold force-pushed the array-ref branch 2 times, most recently from 542b587 to 67f8600 Compare February 18, 2022 19:09
src/borrow.rs Outdated
Entry::Occupied(entry) => {
let readers = entry.into_mut();

let new_readers = readers.wrapping_add(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can overflow (in other words, I can't imagine any bug that can cause overflow here because it requires at least u32::MAX times operation), but if it happens, I think it's OK to panic! here

Copy link
Member Author

@adamreichold adamreichold Feb 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a checked_add and panic but that would only complicate the code path as the wrapping_add and the check for <= 0, i.e. either it was -1 and there is already a writer or it overflowed and there are isize::MAX readers already handles both cases without additional control flow or panics with the same result that we can have only up to isize::MAX readers. (The code is also not really my idea, this is basically std's BorrowFlag implementation copied over here.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kngwyu Are you satisfied with the explanation?

@kngwyu
Copy link
Member

kngwyu commented Feb 20, 2022

Sorry for the delay (I was really overwhelmed by a deadline), and thank you for this patch a lot.
Overall, I think this approach is convincing for preventing us from having multiple mutable references of the same array object, but wouldn't it mean we don't need to change the NPY_ARRAY_WRITABLE flag?

@kngwyu
Copy link
Member

kngwyu commented Feb 20, 2022

I agree that changing some NumPy array flags like WRITEABLE or OWNDATA really is not the way to ensure safety, but I feel like it still helps users. Happy to hear some other opinions @aldanor
I mean, the point is whether to do some flags operation even if it does not offer the strict safety guarantee.
Also, about implementation, could we use a redundant bit of NumPy flag?

@adamreichold
Copy link
Member Author

I also thought about still touching the NPY_ARRAY_WRITEABLE flag if purely as a debugging aid, especially as the dynamic borrow checking does not catch all possible mistakes but still relies on a contract with the Python that needs to be spelled out in the documentation. (We could eventually catch a lot of aliasing mistakes by following the base object chain but run into fundamental issues of opaque base objects there. But for example, we should be able to detect all overlapping NumPy views.)

For now, I would prefer to not touch the flags as this abuses their semantics somewhat (for example, NPY_ARRAY_WRITEABLE seems to interact with NPY_ARRAY_WRITEBACKIFCOPY and without the flag, an array loses NPY_ARRAY_BEHAVED) and the flag is not enforceable outside of NumPy's own access routines (e.g. we do not consider it at all) so I fear that involving the flags would only further complicate an already complicated situation without providing a lot of benefit (e.g. it does not even catch all Python-side interference because there can be writeable views of the same data we just marked as read-only).

Also, about implementation, could we use a redundant bit of NumPy flag?

I think we need substantially more than a single bit to store a potentially large number of readers. This together with the brittleness of using another library's still evolving data structures seem to argue against this IMHO.

@adamreichold
Copy link
Member Author

(Actually, touching the NPY_ARRAY_WRITEABLE without atomics on possibly multiple threads introduces a data race where there was none if those arrays are accessed for reading only.)

I just noticed that this is not correct: The flag itself would always be modified while holding the GIL and only afterwards would allow_threads be used.

This has the nice side effect of making the dynamic borrow checking here better: Having a &'py PyArray<T,D> is proof of holding the GIL and hence the thread_local can be avoided and we can have global dynamic borrow checking.

@kngwyu
Copy link
Member

kngwyu commented Feb 21, 2022

so I fear that involving the flags would only further complicate an already complicated situation without providing a lot of benefit

I understand and agree to keep things minimal here. Having a different API for manipulating flags might be better, just in case users want to explicitly do it.

@adamreichold
Copy link
Member Author

Considering how to proceed: I suggest that after releasing 0.16, we try to get this into shape w.r.t. documentation and tests, but keeping the over-approximation (i.e. borrows work on the level of the base address) and refine that in follow-up PR, maybe first checking start and end of the memory ranges for overlap as @davidhewitt suggested and then fully considering the strides to allow interleaving but non-overlapping views, hopefully in time for 0.17.

@adamreichold
Copy link
Member Author

adamreichold commented Mar 16, 2022

@davidhewitt @kngwyu I think this is ready for review now after adding tests and documentation. I also updated the cover letter of the PR including the plan to close #258 by this and track refining the conflict computations in a separate issue.

@adamreichold
Copy link
Member Author

Because I felt bad for deprecating the npyiter module without providing data to back up my claims, I added at least an example benchmark for the difference between using NumPy's iterators via our wrapper and ndarray's iterators to zip together three arrays for setting one to the element-wise sum of the others.

The results seem to support the decision to deprecate the module:

test ndarray_iter_large  ... bench:       8,676 ns/iter (+/- 204)
test ndarray_iter_medium ... bench:         146 ns/iter (+/- 3)
test ndarray_iter_small  ... bench:          40 ns/iter (+/- 0)
test numpy_iter_large    ... bench:     121,361 ns/iter (+/- 5,506)
test numpy_iter_medium   ... bench:       3,647 ns/iter (+/- 182)
test numpy_iter_small    ... bench:         267 ns/iter (+/- 10)

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think overall I'm satisfied that this is probably the best approach for now. The unsafe primitives continue to exist for users who have more esoteric arrays with non-unit strides etc.

One thing to think about which I don't think I've seen yet in this discussion, what's the stated interaction of this with the GIL?

E.g. if we borrow one of these arrays and then release the Python GIL, are other Python threads allowed to mutate the array before our Rust thread then continues? This may have potential further issues around aliasing.

src/borrow.rs Outdated Show resolved Hide resolved
src/borrow.rs Outdated Show resolved Hide resolved
src/error.rs Outdated Show resolved Hide resolved
@adamreichold
Copy link
Member Author

adamreichold commented Mar 17, 2022

The unsafe primitives continue to exist for users who have more esoteric arrays with non-unit strides etc.

Existing code using PyReadonlyArray and as_array_mut should not be affected at all as shared borrowed do not conflict. I will also start working on refining the checking for the overlapping case as soon this is in shape to be merged.

One thing to think about which I don't think I've seen yet in this discussion, what's the stated interaction of this with the GIL?

While the global hash table is protected by the GIL, the borrows are not bound to, i.e. they will stay active even if the GIL is released, c.f. the test case

#[test]
fn borrows_span_threads() {
    Python::with_gil(|py| {
        let array = PyArray::<f64, _>::zeros(py, (1, 2, 3), false);

        let _exclusive = array.readwrite();

        let array = array.to_owned();

        py.allow_threads(move || {
            let thread = spawn(move || {
                Python::with_gil(|py| {
                    let array = array.as_ref(py);

                    let _exclusive = array.readwrite();
                });
            });

            assert!(thread.join().is_err());
        });
    });
}

I will add a comment on this to the module-level documentation.

E.g. if we borrow one of these arrays and then release the Python GIL, are other Python threads allowed to mutate the array before our Rust thread then continues? This may have potential further issues around aliasing.

Unchecked code - Python, unsafe Rust, C, etc. - running on other threads is able to mutate those arrays. But there is nothing we can do to stop unchecked doing this and it would be a data race whether our code is written in Rust or something else entirely, i.e. this is incorrect even without the additional requirements Rust places on references.

I tried to explain this by

//! We can also not prevent unchecked code from concurrently modify an array via callbacks or using multiple threads,
//! but that would lead to incorrect results even if the code that is interfered with is implemented in another language
//! which does not require aliasing discipline.

in the module-level documentation. Do you think this should be extended and/or improved?

@adamreichold adamreichold force-pushed the array-ref branch 2 times, most recently from b28f255 to fca2979 Compare March 17, 2022 21:33
Copy link
Member

@kngwyu kngwyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making this push forward!

I had just one comment on the document. I'll double-check examples if I have some time.

I added at least an example benchmark for the difference between using NumPy's iterators via our wrapper and ndarray's iterators to zip together three arrays for setting one to the element-wise sum of the others.

Yeah, I think the decision is reasonable but isn't it the scope of this PR?

src/borrow.rs Outdated
@@ -0,0 +1,579 @@
//! Types to safely create references into the interior of the NumPy arrays
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document is well written and I enjoyed reading it. However, I fear this can be a bit overwhelming for beginners. So, I propose to re-organize the document in a way like:

## Summary
The position we take, and the resulting design.
## Some background/reasoning about the design
The reason why this design is useful/reasonable is considering the Rust/Python boundary.
## Current limitation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed an updated version where I tried to restructure the module-level documentation along the proposed lines.

@adamreichold
Copy link
Member Author

adamreichold commented Mar 18, 2022

Yeah, I think the decision is reasonable but isn't it the scope of this PR?

Indeed. Since it really just adds the two benchmarks, I will take the liberty of cherry-picking the commit onto main and remove it here. (I did it here because I touched the npyiter module anyway.)

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lovely - I think we're pretty much there now. Thank you for working on this (and the follow on), I think this is an exciting step for rust-numpy!

src/borrow.rs Outdated
//! which does not require aliasing discipline.
//!
//! Concerning multi-threading in particular: While the GIL needs to be acquired to create borrows, they are not bound to the GIL
//! and will stay active after the GIL is released, for example by calling [`allow_threads`][pyo3::Python::allow_threads].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that many Python API calls will eventually lead to GIL release too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly, I mainly added that example to have something to link to where the reader can follow up on GIL-related questions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly, I mainly added that example to have something to link to where the reader can follow up on GIL-related questions.

@@ -0,0 +1,586 @@
//! Types to safely create references into NumPy arrays
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document is filled with excellent detail 👍

If I could suggest one final improvement, I think that we can rearrange the order of content just so that users who want to understand what the module does get that information presented to them at the top.

I think it's as simple as adding a short paragraph after this one liner just listing the uses (e.g. obtaining ndarray views, iteration etc.) and moving the Examples section above Rationale.

Copy link
Member Author

@adamreichold adamreichold Mar 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to implement your suggestions, please have another look.

This PR is getting a bit overwhelming w.r.t. the amount of comments accumulated, but I think there is also the still-open question at the end of #274 (comment) to resolve.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, I think we should proceed to merge! I think the doc is great now and I'm sure it will evolve further as users ask questions and the overlap resolution is implemented.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is surely only the start of handling dynamic borrow checking of arrays and external users will surely bring an more expansive perspective on this. I think that at least, there should be no regressions for existing usage patterns involving PyReadonlyArray and the unsafe variant of as_array_mut.

src/borrow.rs Outdated Show resolved Hide resolved
@kngwyu
Copy link
Member

kngwyu commented Mar 22, 2022

Sorry I was on a short trip. The refined and reordered document looks really nice to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Breaking PyReadonlyArray
3 participants