Guidance on exporting large nested Rust structs to Python #1701

cswinter · 2021-06-27T11:11:43Z

I am creating a Python API for a Rust library. Some of the methods return nested Rust structs.

For example:

#[pymethods]
impl Client {
  fn list(&self) -> PyResult<Vec<XpStatus>> {
    // ...
  }
}

#[pyclass]
pub struct XpStatus {
    pub xp: Xp,
    pub container_status: HashMap<ContainerId, ContainerStatus>,
    pub containers_by_lifecycle: HashMap<ContainerStatusKind, Vec<ContainerId>>,
    pub max_containers: u64,
}

#[pyclass]
pub struct Xp {
    pub def: XpDef,
    pub uid: XpId,
    pub lifecycle: XpLifecycle,
    pub creation_time: DateTime<Utc>,
    pub priority: i64,
    pub queue_pos: u64,
}

pub enum ContainerStatus {
    Running,
    Creating,
    Completed {
        exit_code: u64,
        error: String,
        finished_at: DateTime<Utc>,
    },
    None,
}

// ...

I would like the Python code to be able to access all members of the returned structs.

The simplest option might be to define getters on all members. However, unless I'm mistaken, this would seem to require copying the entire substructure on each access, which would make it very expensive to iterate over collections contained in the struct from Python code.

To avoid this performance hit, we need to fully convert the struct into a Python-compatible object. I can think of two different ways this could be achieved.

1. Create a second version for each struct in Rust which is compatible in Python and then manually convert these from Rust code.
This would look something like this:

#[pyclass(name = "XpStatus")]
pub struct PyXpStatus {
    // Py<T> is an owned handle to a Python object, so it can be stored in a field.
    #[pyo3(get)]
    pub xp: Py<PyXp>,
    #[pyo3(get)]
    pub container_status: Py<PyDict>,
    // ...
}

impl From<XpStatus> for PyXpStatus {
  // ...
}

Probably this could even be generated automatically by a macro.

2. Create Python versions of all structs in Python, and instantiate those directly.
If we're going to create a new version of all structs anyway, we might as well do so in Python. This has the added benefits of allowing for a slightly more idiomatic API and also making "jump to source" work so Python users can look at the Python definitions of all classes rather than an opaque stub or Rust source.
I think this might be the preferred solution. I'm still slightly unsure how best to convert the Rust structs into Python classes on the Rust side. When building a mixed Rust/Python project with maturin, can you just use PyModule::import to import the Python portion of the module on the Rust side? Or would you embed the source with include_str! and load it via PyModule::from_code?

Does this seem like a reasonable approach?

davidhewitt · 2021-06-27T21:06:12Z

Option 1 is probably what I would pick for now.

Note that in the future I would like it to be possible for #[pyo3(get)] to avoid cloning the underlying data - as per #1358 (comment).

This is in reality still some way off.
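
As a rough illustration of option 1, here is a plain-Rust sketch with no PyO3 involved. `Rc` only stands in for a Python-owned handle such as `Py<T>`, and the `ExportedXpStatus` name and the simplified field types (`u64` ids, `String` statuses) are assumptions made for the example. The nested data is moved behind shared handles once, in `From`, so each later access copies a pointer rather than the substructure.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Simplified stand-ins for the structs in the question (hypothetical field types).
#[derive(Debug, PartialEq)]
pub struct Xp {
    pub uid: u64,
    pub priority: i64,
}

pub struct XpStatus {
    pub xp: Xp,
    pub container_status: HashMap<u64, String>,
    pub max_containers: u64,
}

// "Export" version: nested data sits behind shared handles.
// Rc plays the role that Py<T> would play in a real PyO3 binding.
pub struct ExportedXpStatus {
    xp: Rc<Xp>,
    container_status: Rc<HashMap<u64, String>>,
    pub max_containers: u64,
}

impl From<XpStatus> for ExportedXpStatus {
    // The conversion cost is paid once here: the fields are moved, not cloned.
    fn from(status: XpStatus) -> Self {
        ExportedXpStatus {
            xp: Rc::new(status.xp),
            container_status: Rc::new(status.container_status),
            max_containers: status.max_containers,
        }
    }
}

impl ExportedXpStatus {
    // Each getter bumps a reference count instead of deep-copying.
    pub fn xp(&self) -> Rc<Xp> {
        Rc::clone(&self.xp)
    }

    pub fn container_status(&self) -> Rc<HashMap<u64, String>> {
        Rc::clone(&self.container_status)
    }
}
```

Two calls to `container_status()` return handles to the same map, so iterating over it repeatedly does not copy the map again.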

mejrs · 2021-06-27T23:18:32Z

We should probably add a section in the guide that discusses this stuff in depth.

> To avoid this performance hit

You're going to have to take that hit somewhere on the Rust/Python boundary. You could allocate everything on the Python heap (so you'd have Py<...> wrappers everywhere), which avoids the cloning, but this just moves some of the cost to conversions whenever Rust code needs to work on the structs.

> The simplest option might be to define getters on all members. However, unless I'm mistaken, this would seem to require copying the entire substructure on each access, which would make it very expensive to iterate over collections contained in the struct from Python code.

Also, it would return a fresh clone on every access, so Python code wouldn't be able to mutate the collection. See https://pyo3.rs/main/faq.html#pyo3get-clones-my-field
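
The difference between the two behaviours can be sketched in plain Rust, without PyO3. In the sketch below, `Rc<RefCell<...>>` is only a stand-in for a Python-owned handle such as `Py<PyDict>`, and the `ByValue` and `ByHandle` names are hypothetical. A clone-returning getter hands out a copy whose mutations are lost, while a handle-returning getter shares the underlying map.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Clone-on-access: mirrors a getter that clones a plain Rust field.
pub struct ByValue {
    inner: HashMap<String, u64>,
}

impl ByValue {
    pub fn new() -> Self {
        ByValue { inner: HashMap::new() }
    }

    // Every call deep-copies the map; the caller owns an independent copy.
    pub fn get_inner(&self) -> HashMap<String, u64> {
        self.inner.clone()
    }

    pub fn len(&self) -> usize {
        self.inner.len()
    }
}

// Shared handle: Rc<RefCell<..>> stands in for a Python-owned object.
pub struct ByHandle {
    inner: Rc<RefCell<HashMap<String, u64>>>,
}

impl ByHandle {
    pub fn new() -> Self {
        ByHandle { inner: Rc::new(RefCell::new(HashMap::new())) }
    }

    // Every call copies only the handle; the map itself is shared.
    pub fn get_inner(&self) -> Rc<RefCell<HashMap<String, u64>>> {
        Rc::clone(&self.inner)
    }

    pub fn len(&self) -> usize {
        self.inner.borrow().len()
    }
}
```

Inserting through `ByValue::get_inner()` modifies a temporary copy and leaves the owner's map empty; the same insert through `ByHandle::get_inner()` is visible to the owner.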

Here's a third approach:

#[pyclass]
struct Foo {
    bar: Py<Bar>,
}

#[pyclass]
struct Bar {
    // Keys stay plain Rust values: Py<T> does not implement Hash or Eq.
    inner: HashMap<ContainerId, Py<ContainerStatus>>,
}

#[pyproto]
impl PyMappingProtocol for Bar {
    /* todo */
}

#[pyproto]
impl PyIterProtocol for Bar {
    /* todo */
}

#[pyclass]
struct BarIter {
    inner: Py<Bar>,
    state: /* todo */
}

#[pyproto]
impl PyIterProtocol for BarIter {
    /* todo */
}

What is best will depend on what exactly you are doing (and benchmarks, probably). YMMV.

cswinter · 2021-06-29T15:31:21Z

Related question: how do you create something like a Py<PyDateTime> or Py<PyList> in Rust? All the creation methods I can find seem to return borrows rather than owned values.

mejrs · 2021-06-29T16:01:27Z

You can use .into() to do that for the native Python types:

use pyo3::prelude::*;
use pyo3::types::PyDict;

struct Bar {
    inner: Py<PyDict>,
}

impl Bar {
    fn new() -> Bar {
        Python::with_gil(|py| {
            let dict: Py<PyDict> = PyDict::new(py).into();
            Bar {
                inner: dict,
            }
        })
    }
}

You can use Py::new() if you want to store a pyclass.

cswinter · 2021-06-30T18:36:43Z

I tried option 1 and it's a fine solution, but I ended up going with option 2 instead: I define the classes in Python, import them, and convert on the Rust side. I'm quite happy with this. It's roughly the same amount of effort and boilerplate as creating Python-compatible structs in Rust, but I also get mypy type annotations and all the @dataclass goodies.

Congyuwang · 2021-10-29T23:12:51Z

Just a question. Performance-wise, how does doing this: "Create a second version for each struct in Rust which is compatible in Python and then manually convert these from Rust code." compare to using pythonize, a convenient crate that builds on serde?

davidhewitt · 2021-10-30T10:23:27Z

My gut feeling is that hand-written code can give better performance than pythonize, but I would suggest benchmarking! Note that I haven't put much effort into optimizing pythonize, although PRs to speed it up would be welcome if you find they benefit you.

Congyuwang · 2021-10-31T16:00:34Z

My own experimentation seems to show little difference between using pythonize and a hand-coded to_python method. I didn't do any crazy optimisation though. I guess most of the time is spent instantiating and filling PyList, PyDict, and the like. pythonize does not seem to add much overhead.

davidhewitt added documentation question labels Aug 26, 2021

PyO3 locked and limited conversation to collaborators Oct 31, 2021

davidhewitt closed this as completed Oct 31, 2021

This issue was moved to a discussion.
