Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some more docs #2256

Merged
merged 3 commits into from
Mar 30, 2022
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 7 additions & 8 deletions src/instance.rs
Original file line number Diff line number Diff line change
Expand Up @@ -168,7 +168,6 @@ pub unsafe trait PyNativeType: Sized {
/// [`Py::clone_ref`] will be faster if you happen to be already holding the GIL.
///
/// ```rust
/// use pyo3::conversion::AsPyPointer;
/// use pyo3::prelude::*;
/// use pyo3::types::PyDict;
///
Expand All @@ -186,9 +185,9 @@ pub unsafe trait PyNativeType: Sized {
/// drop(first);
///
/// // They all point to the same object
/// assert_eq!(second.as_ptr(), third.as_ptr());
/// assert_eq!(fourth.as_ptr(), fifth.as_ptr());
/// assert_eq!(second.as_ptr(), fourth.as_ptr());
/// assert!(second.is(&third));
/// assert!(fourth.is(&fifth));
/// assert!(second.is(&fourth));
/// });
/// # }
/// ```
Expand All @@ -197,8 +196,8 @@ pub unsafe trait PyNativeType: Sized {
///
/// It is easy to accidentally create reference cycles using [`Py`]`<T>`.
/// The Python interpreter can break these reference cycles within pyclasses if they
/// implement the [`PyGCProtocol`](crate::class::gc::PyGCProtocol). If your pyclass
/// contains other Python objects you should implement this protocol to avoid leaking memory.
/// [integrate with the garbage collector][gc]. If your pyclass contains other Python
/// objects you should implement it to avoid leaking memory.
///
/// # A note on Python reference counts
///
Expand All @@ -220,6 +219,7 @@ pub unsafe trait PyNativeType: Sized {
///
/// [`Rc`]: std::rc::Rc
/// [`RefCell`]: std::cell::RefCell
/// [gc]: https://pyo3.rs/main/class/protocols.html#garbage-collector-integration
#[repr(transparent)]
pub struct Py<T>(NonNull<ffi::PyObject>, PhantomData<T>);

Expand Down Expand Up @@ -487,7 +487,6 @@ impl<T> Py<T> {
/// # Examples
///
/// ```rust
/// use pyo3::conversion::AsPyPointer;
/// use pyo3::prelude::*;
/// use pyo3::types::PyDict;
///
Expand All @@ -497,7 +496,7 @@ impl<T> Py<T> {
/// let second = Py::clone_ref(&first, py);
///
/// // Both point to the same object
/// assert_eq!(first.as_ptr(), second.as_ptr());
/// assert!(first.is(&second));
/// });
/// # }
/// ```
Expand Down
39 changes: 23 additions & 16 deletions src/marker.rs
Original file line number Diff line number Diff line change
Expand Up @@ -806,22 +806,6 @@ impl<'py> Python<'py> {
err::error_on_minusone(self, v)
}

/// Retrieves a Python instance under the assumption that the GIL is already
/// acquired at this point, and stays acquired for the lifetime `'py`.
///
/// Because the output lifetime `'py` is not connected to any input parameter,
/// care must be taken that the compiler infers an appropriate lifetime for `'py`
/// when calling this function.
///
/// # Safety
///
/// The lifetime `'py` must be shorter than the period you *assume* that you have GIL.
/// I.e., `Python<'static>` is always *really* unsafe.
#[inline]
pub unsafe fn assume_gil_acquired() -> Python<'py> {
Python(PhantomData)
}

/// Create a new pool for managing PyO3's owned references.
///
/// When this `GILPool` is dropped, all PyO3 owned references created after this `GILPool` will
Expand Down Expand Up @@ -882,6 +866,29 @@ impl<'py> Python<'py> {
}
}

impl<'unbound> Python<'unbound> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken a liking to properly naming lifetimes as of recently. What would you think of renaming the Python<'py> lifetime to Python<'gil>? It's still short and I think it represents the meaning of the lifetime better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the sentiment, but I am not sure if the improvement is worth the churn?

Copy link
Member

@davidhewitt davidhewitt Mar 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be a fairly mechanical change if we wanted to update. If we were consistent with always having 'gil for the name of the GIL lifetime when present, that would also help make it clear when things did not depend on the GIL.

/// Unsafely creates a Python token with an unbounded lifetime.
///
/// Many of PyO3 APIs use `Python<'_>` as proof that the GIL is held, but this function can be
/// used to call them unsafely.
///
/// # Safety
///
/// - This token and any borrowed Python references derived from it can only be safely used
/// whilst the currently executing thread is actually holding the GIL.
Copy link
Member Author

@mejrs mejrs Mar 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: threads, I was thinking about this:

fn foo(py: Python<'_>){
    let handle = std::thread::spawn(|| unsafe {
        let py = Python::assume_gil_acquired();
        // call some pyo3 apis
    });
    
    handle.join().unwrap();
}

Is this legal? Should we declare that it is not allowed?

Copy link
Member

@adamreichold adamreichold Mar 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As spawn and join are synchronization points, I think this is legal as long as there is absolutely nothing happening in between spawn and join which could prevent the join from happening, i.e. the thread must be scoped.

However, I would suggest not including this example in the documentation as it opens more questions than it answers IMHO and I think the current wording is the right thing to do in almost all of the cases. When someone has to handle a case where this is not so (as in your example), they should be sure enough to figure it out. Everybody else is probably just confused by watering down the statement about the current thread.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, the way I worded it would declare that it's not allowed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the above is legal - holding the GIL on the Python side also sets a bunch of thread-local variables about the interpreter state. Not holding the GIL in the exact thread the Python API is called in will likely lead to segfaults by reading invalid values from these variables.

/// - This function creates a token with an *unbounded* lifetime. Safe code can assume that
/// holding a `Python<'py>` token means the GIL is and stays acquired for the lifetime `'py`.
/// If you let it or borrowed Python references escape to safe code you are
/// responsible for bounding the lifetime `'unbound` appropriately. For more on unbounded
/// lifetimes, see the [nomicon].
///
/// [nomicon]: https://doc.rust-lang.org/nomicon/unbounded-lifetimes.html
#[inline]
pub unsafe fn assume_gil_acquired() -> Python<'unbound> {
Python(PhantomData)
}
}

#[cfg(test)]
mod tests {
use super::*;
Expand Down
10 changes: 4 additions & 6 deletions src/pycell.rs
Original file line number Diff line number Diff line change
Expand Up @@ -107,12 +107,11 @@
//! fn swap_numbers(a: &mut Number, b: &mut Number) {
//! std::mem::swap(&mut a.inner, &mut b.inner);
//! }
//! # use pyo3::AsPyPointer;
//! # fn main() {
//! # Python::with_gil(|py|{
//! # let n = Py::new(py, Number{inner: 35}).unwrap();
//! # let n2 = n.clone_ref(py);
//! # assert_eq!(n.as_ptr(), n2.as_ptr());
//! # assert!(n.is(&n2));
//! # let fun = pyo3::wrap_pyfunction!(swap_numbers, py).unwrap();
//! # fun.call1((n, n2)).expect_err("Managed to create overlapping mutable references. Note: this is undefined behaviour.");
//! # });
Expand All @@ -138,19 +137,18 @@
//! #[pyfunction]
//! fn swap_numbers(a: &PyCell<Number>, b: &PyCell<Number>) {
//! // Check that the pointers are unequal
//! if a.as_ptr() != b.as_ptr() {
//! if !a.is(b) {
//! std::mem::swap(&mut a.borrow_mut().inner, &mut b.borrow_mut().inner);
//! } else {
//! // Do nothing - they are the same object, so don't need swapping.
//! }
//! }
//! # use pyo3::AsPyPointer;
//! # fn main() {
//! # // With duplicate numbers
//! # Python::with_gil(|py|{
//! # let n = Py::new(py, Number{inner: 35}).unwrap();
//! # let n2 = n.clone_ref(py);
//! # assert_eq!(n.as_ptr(), n2.as_ptr());
//! # assert!(n.is(&n2));
//! # let fun = pyo3::wrap_pyfunction!(swap_numbers, py).unwrap();
//! # fun.call1((n, n2)).unwrap();
//! # });
Expand All @@ -159,7 +157,7 @@
//! # Python::with_gil(|py|{
//! # let n = Py::new(py, Number{inner: 35}).unwrap();
//! # let n2 = Py::new(py, Number{inner: 42}).unwrap();
//! # assert_ne!(n.as_ptr(), n2.as_ptr());
//! # assert!(!n.is(&n2));
//! # let fun = pyo3::wrap_pyfunction!(swap_numbers, py).unwrap();
//! # fun.call1((&n, &n2)).unwrap();
//! # let n: u32 = n.borrow(py).inner;
Expand Down
119 changes: 96 additions & 23 deletions src/types/bytearray.rs
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ impl PyByteArray {
}
}

/// Creates a new Python bytearray object from another PyObject that
/// Creates a new Python `bytearray` object from another Python object that
/// implements the buffer protocol.
pub fn from<'p, I>(py: Python<'p>, src: &'p I) -> PyResult<&'p PyByteArray>
where
Expand All @@ -84,46 +84,119 @@ impl PyByteArray {
self.len() == 0
}

/// Get the start of the buffer containing the contents of the bytearray.
/// Gets the start of the buffer containing the contents of the bytearray.
///
/// Note that this bytearray object is both shared and mutable, and the backing buffer may be
/// reallocated if the bytearray is resized. This can occur from Python code as well as from
/// Rust via [PyByteArray::resize].
/// # Safety
///
/// As a result, the returned pointer should be dereferenced only if since calling this method
/// no Python code has executed, [PyByteArray::resize] has not been called.
/// See the safety requirements of [`PyByteArray::as_bytes`] and [`PyByteArray::as_bytes_mut`].
pub fn data(&self) -> *mut u8 {
unsafe { ffi::PyByteArray_AsString(self.as_ptr()) as *mut u8 }
}

/// Get the contents of this buffer as a slice.
/// Extracts a slice of the `ByteArray`'s entire buffer.
///
/// # Safety
/// This bytearray must not be resized or edited while holding the slice.
///
/// ## Safety Detail
/// This method is equivalent to `std::slice::from_raw_parts(self.data(), self.len())`, and so
/// all the safety notes of `std::slice::from_raw_parts` apply here.
/// Mutation of the `bytearray` invalidates the slice. If it is used afterwards, the behavior is
/// undefined.
///
/// These mutations may occur in Python code as well as from Rust:
/// - Calling methods like [`PyByteArray::as_bytes_mut`] and [`PyByteArray::resize`] will
/// invalidate the slice.
/// - Actions like dropping objects or raising exceptions can invoke `__del__`methods or signal
/// handlers, which may execute arbitrary Python code. This means that if Python code has a
/// reference to the `bytearray` you cannot safely use the vast majority of PyO3's API whilst
/// using the slice.
///
/// As a result, this slice should only be used for short-lived operations without executing any
/// Python code, such as copying into a Vec.
///
/// # Examples
///
/// ```rust
/// use pyo3::prelude::*;
/// use pyo3::exceptions::PyRuntimeError;
/// use pyo3::types::PyByteArray;
///
/// #[pyfunction]
/// fn a_valid_function(bytes: &PyByteArray) -> PyResult<()> {
/// let section = {
/// // SAFETY: We promise to not let the interpreter regain control
/// // or invoke any PyO3 APIs while using the slice.
/// let slice = unsafe { bytes.as_bytes() };
///
/// // Copy only a section of `bytes` while avoiding
/// // `to_vec` which copies the entire thing.
/// let section = slice
/// .get(6..11)
/// .ok_or_else(|| PyRuntimeError::new_err("input is not long enough"))?;
/// Vec::from(section)
/// };
///
/// // Now we can do things with `section` and call PyO3 APIs again.
/// // ...
/// # assert_eq!(&section, b"world");
///
/// Ok(())
/// }
/// # fn main() -> PyResult<()> {
/// # Python::with_gil(|py| -> PyResult<()> {
/// # let fun = wrap_pyfunction!(a_valid_function, py)?;
/// # let locals = pyo3::types::PyDict::new(py);
/// # locals.set_item("a_valid_function", fun)?;
/// #
/// # py.run(
/// # r#"b = bytearray(b"hello world")
/// # a_valid_function(b)
/// #
/// # try:
/// # a_valid_function(bytearray())
/// # except RuntimeError as e:
/// # assert str(e) == 'input is not long enough'"#,
/// # None,
/// # Some(locals),
/// # )?;
/// #
/// # Ok(())
/// # })
/// # }
/// ```
///
/// # Incorrect usage
///
/// In particular, note that this bytearray object is both shared and mutable, and the backing
/// buffer may be reallocated if the bytearray is resized. Mutations can occur from Python
/// code as well as from Rust, via [PyByteArray::as_bytes_mut] and [PyByteArray::resize].
/// The following `bug` function is unsound ⚠️
///
/// Extreme care should be exercised when using this slice, as the Rust compiler will
/// make optimizations based on the assumption the contents of this slice cannot change. This
/// can easily lead to undefined behavior.
/// ```rust
/// # use pyo3::prelude::*;
/// # use pyo3::types::PyByteArray;
///
/// #[pyfunction]
/// fn bug(py: Python<'_>, bytes: &PyByteArray) {
/// let slice = unsafe { bytes.as_bytes() };
///
/// As a result, this slice should only be used for short-lived operations to read this
/// bytearray without executing any Python code, such as copying into a Vec.
/// // This explicitly yields control back to the Python interpreter...
/// // ...but it's not always this obvious. Many things do this implicitly.
/// py.allow_threads(|| {
/// // Python code could be mutating through its handle to `bytes`,
/// // which makes reading it a data race, which is undefined behavior.
/// println!("{:?}", slice[0]);
/// });
///
/// // Python code might have mutated it, so we can not rely on the slice
/// // remaining valid. As such this is also undefined behavior.
/// println!("{:?}", slice[0]);
/// }
pub unsafe fn as_bytes(&self) -> &[u8] {
slice::from_raw_parts(self.data(), self.len())
}

/// Get the contents of this buffer as a mutable slice.
/// Extracts a mutable slice of the `ByteArray`'s entire buffer.
///
/// # Safety
/// This slice should only be used for short-lived operations that write to this bytearray
/// without executing any Python code. See the safety note for [PyByteArray::as_bytes].
///
/// Any other accesses of the `bytearray`'s buffer invalidate the slice. If it is used
/// afterwards, the behavior is undefined. The safety requirements of [`PyByteArray::as_bytes`]
/// apply to this function as well.
#[allow(clippy::mut_from_ref)]
pub unsafe fn as_bytes_mut(&self) -> &mut [u8] {
slice::from_raw_parts_mut(self.data(), self.len())
Expand Down