iterators? #71

Closed
ajavadia opened this issue May 2, 2020 · 7 comments
Labels
enhancement New feature or request

Comments

@ajavadia
Member

ajavadia commented May 2, 2020

I don't think retworkx uses iterators the way networkx does? This might lead to very large memory usage for large graphs (and possibly hurt runtime too).

For example, often you want a shallow view into the immediate successors of a node, not all the nodes that may succeed it until the end. The same goes for the layers of a dag.

Networkx implements this as an iterator. Terra also uses it as an iterator.

But I think retworkx constructs the full list, and terra only artificially wraps it back into an iterator.

For an example of how this can be problematic for large circuits, see this PR in terra: Qiskit/qiskit#371

@mtreinish
Member

mtreinish commented May 6, 2020

The main reason was ease of implementation: iterators behave a bit differently between Rust and Python, and building a Rust function that returns a Python iterator isn't as straightforward as just doing yield data from inside a function. It would probably require creating a custom type that implements the Python iter protocol and then returning an object of that new type. At least for the first version, it was easier to just return a list to get something working.

I agree that for large results, implementing this with iterators could be more efficient. But so far I haven't seen any scaling limits or memory growth issues with it from the terra dagcircuit. In general, memory consumption went down [1] [2] and all graph operations became about an order of magnitude faster in aggregate compared to the previous networkx implementation. That is likely just the benefit of using a compiled language, though, and there could still be a scaling limit as things increase in size, or for examples or test cases I haven't seen.

I think we should leave this open as a lower-priority future enhancement, though, and if the need arises (like we start hitting scaling limits) we can prioritize it.

[1] https://qiskit.github.io/qiskit/#transpiler_levels.TranspilerLevelBenchmarks.peakmem_quantum_volume_transpile_50_x_20?commits=95328adb
[2] https://qiskit.github.io/qiskit/#transpiler_levels.TranspilerLevelBenchmarks.peakmem_transpile_from_large_qasm_backend_with_prop?commits=95328adb (this is an eoh qasm example from aqua loaded from qasm: https://github.com/Qiskit/qiskit/blob/master/test/benchmarks/qasm/test_eoh_qasm.qasm )

@mtreinish
Member

To implement this, though, we should probably try to maintain backwards compatibility with the existing functions and add a second method for the places where it makes sense to use iterators. For example, if layers() becomes a bottleneck we could add a layers_iter() function that returns an iterator object instead of a list. That would ensure a smooth transition without potentially breaking users who expect a list to be returned from a function.

@mtreinish mtreinish added the enhancement New feature or request label May 6, 2020
@mtreinish
Member

For reference, in case someone wants to take a stab at this: the PyO3 library just added an example of how to implement Python iterators in Rust: https://github.com/PyO3/pyo3/blob/master/examples/rustapi_module/src/pyclass_iter.rs

So to implement this we basically have to implement PyIterProtocol for a custom struct that holds the necessary state for each iteration, then add a function that creates an instance of that struct and returns it as the iterator. For example, with layers it would be something like (untested, so it probably doesn't work as-is):

use petgraph::graph::NodeIndex;
use pyo3::class::iter::{IterNextOutput, PyIterProtocol};
use pyo3::prelude::*;

#[pyclass]
struct LayerIter {
    // A borrowed reference here needs a lifetime, which #[pyclass]
    // doesn't support; see the follow-up comments below
    dag: &PyDAG,
    first_layer: Vec<NodeIndex>,
}

#[pyproto]
impl PyIterProtocol for LayerIter {
    fn __next__(mut slf: PyRefMut<Self>) -> IterNextOutput<Vec<PyObject>, &'static str> {
        // <insert implementation of each iteration here>
        // while there is another layer to produce:
        //     IterNextOutput::Yield(layer)
        // once the traversal is exhausted:
        IterNextOutput::Return("All Layers Returned")
    }
}

#[pyfunction]
fn layers_iter(
    py: Python,
    dag: &PyDAG,
    first_layer: Vec<usize>,
) -> PyResult<Py<LayerIter>> {
    Py::new(
        py,
        LayerIter {
            dag,
            first_layer: first_layer.into_iter().map(NodeIndex::new).collect(),
        },
    )
}

@mtreinish
Member

I took a quick look at doing this last week for #147 and hit a lifetime issue on the borrowed graph object in the iterator struct, which looks like the issue described in PyO3/pyo3#1085 (and more succinctly in PyO3/pyo3#1089). The only short-term path for iterators I can think of, until those issues are resolved in PyO3, is to generate the full set of results for a function in Rust and then wrap and expose that result as an iterator to Python. This doesn't offer the primary advantage of iterators, deferred execution, but it might provide some benefit by deferring the conversion of Rust types to Python types. I suspect that benefit will be pretty minimal, though; we'll have to benchmark to be sure.
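As a rough sketch of that interim pattern (untested; the struct name, fields, and message strings here are just illustrative, not anything in retworkx), the fully computed results get moved into a pyclass, and each __next__ call converts one element:

use pyo3::class::iter::{IterNextOutput, PyIterProtocol};
use pyo3::prelude::*;

#[pyclass]
struct EdgeListIter {
    // The results are fully computed in Rust before this struct is
    // created; only the conversion to Python objects is deferred
    edges: Vec<(usize, usize)>,
    index: usize,
}

#[pyproto]
impl PyIterProtocol for EdgeListIter {
    fn __iter__(slf: PyRef<Self>) -> PyRef<Self> {
        slf
    }

    fn __next__(mut slf: PyRefMut<Self>) -> IterNextOutput<(usize, usize), &'static str> {
        if slf.index < slf.edges.len() {
            let item = slf.edges[slf.index];
            slf.index += 1;
            // The (usize, usize) -> tuple(int, int) conversion
            // happens here, one element per call
            IterNextOutput::Yield(item)
        } else {
            IterNextOutput::Return("All edges returned")
        }
    }
}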

mtreinish added a commit to mtreinish/retworkx that referenced this issue Oct 28, 2020
This commit changes the return type of the bfs_successors function to be
a custom class, BFSSuccessors. This new return class implements both the
sequence protocol and the iterator protocol. This means that, aside from
explicit type checking, it should be backwards compatible with the list
previously returned. It can be used either with index-based access or by
iterating over it.

This should be more efficient for large graphs because instead of doing
the copy and type conversion up front by iterating over the entire
nested Vec of results, it does it per access (either via __getitem__ or
__next__). It does add a small amount of overhead for smaller graphs,
but it is minimal since the function returns in microseconds in such
cases, so a 10-20% overhead is not a big deal.

It's worth noting that while this defers the type conversion, it does
not defer execution like most Python iterators normally do. When
bfs_successors is called it will still always fully traverse the graph.
However, in practice the bottleneck for the bfs_successors function
wasn't actually the graph traversal, but instead the type conversion.

Related to Qiskit#71
@mtreinish
Member

So the first case where there was a strong need for this came up in the sabre_swap transpiler pass, which was calling bfs_successors a large number of times on a large graph. Looking at a profile, the bottleneck was the conversion from the Vec<(PyObject, Vec<PyObject>)> to a Python list(object, list(object)), not the actual graph traversal. This meant we could use an iterator to address the performance issue there, because the piece we could defer with an iterator was that conversion.

I did this in #185. To use an iterator there without worrying about the current pyo3 limitations, it just moves the result Vec<(PyObject, Vec<PyObject>)> into a custom struct that implements the Python iter protocol. I also implemented the Python sequence protocol so it would be backwards compatible with the list in usage (but not anywhere explicit typing was required).

This is probably a model we can use in other places when/if we encounter bottlenecks around type conversion for large outputs from retworkx functions.
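The shape of that wrapper is roughly the following (a simplified, untested sketch of the #185 approach against the pyo3 0.12-era protocol APIs, not the exact source):

use pyo3::class::PySequenceProtocol;
use pyo3::exceptions::PyIndexError;
use pyo3::prelude::*;

#[pyclass]
struct BFSSuccessors {
    // The traversal already ran to completion; this just holds the
    // results as Rust values
    bfs_successors: Vec<(PyObject, Vec<PyObject>)>,
}

#[pyproto]
impl PySequenceProtocol for BFSSuccessors {
    fn __len__(&self) -> usize {
        self.bfs_successors.len()
    }

    fn __getitem__(&self, idx: isize) -> PyResult<(PyObject, Vec<PyObject>)> {
        // Normalize negative indices the way Python sequences do
        let len = self.bfs_successors.len() as isize;
        let pos = if idx < 0 { idx + len } else { idx };
        if pos < 0 || pos >= len {
            return Err(PyIndexError::new_err("index out of range"));
        }
        // Cloning the stored PyObjects only bumps refcounts; building
        // the Python tuple/list structure happens per accessed element
        Ok(self.bfs_successors[pos as usize].clone())
    }
}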

mtreinish added a commit that referenced this issue Nov 11, 2020
* Add custom iterator class for BFS successors return

* Only implement sequence protocol

Using the sequence protocol we can still get an implicit iterator by
just casting it on the Python side. This still gets us the lazy type
conversion but simplifies the API and also makes the behavior more
consistent.

At the same time, to ensure we're handling negative indices correctly,
a test method is added to verify that negative index access to the
sequence works as expected.

* Update src/lib.rs

Co-authored-by: Kevin Krsulich <[email protected]>
mtreinish added a commit to mtreinish/retworkx that referenced this issue Nov 16, 2020
This commit adds a new return type, NodeIndices, which is used as the
return type for functions that return a Vec<usize> of node indices.
This custom type implements the Python sequence protocol and can be
used like the list which was previously returned, except where explicit
type checking was used. Just as in Qiskit#185, the advantage here is
that for large lists of node indices the type conversion from a
Vec<usize> to list(int) becomes a large bottleneck. This avoids that
conversion and only does a usize to Python int conversion on access.

Related to Qiskit#71
mtreinish added a commit that referenced this issue Nov 19, 2020
* Add NodeIndices return type

* Add tests for comparison

* Apply suggestions from code review

Co-authored-by: Lauren Capelluto <[email protected]>
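For reference, NodeIndices follows the same pattern as the BFSSuccessors sketch above, specialized to a Vec<usize>, so the usize to int conversion only happens for the element actually accessed (again an untested sketch, not the exact source):

use pyo3::class::PySequenceProtocol;
use pyo3::exceptions::PyIndexError;
use pyo3::prelude::*;

#[pyclass]
struct NodeIndices {
    nodes: Vec<usize>,
}

#[pyproto]
impl PySequenceProtocol for NodeIndices {
    fn __len__(&self) -> usize {
        self.nodes.len()
    }

    fn __getitem__(&self, idx: isize) -> PyResult<usize> {
        let len = self.nodes.len() as isize;
        let pos = if idx < 0 { idx + len } else { idx };
        if pos < 0 || pos >= len {
            return Err(PyIndexError::new_err("index out of range"));
        }
        // The usize -> Python int conversion happens on return, only
        // for the accessed element
        Ok(self.nodes[pos as usize])
    }
}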
@mtreinish
Member

I'm going to close this issue for now. We've found a pattern of using custom return types that wrap an inner Rust object and implement the Python protocols, doing the conversion to a Python object dynamically on access. This seems to work well and avoids most of the overhead we were seeing from building large return objects before. However, these objects do not defer execution like Python iterators do, and we're blocked from doing that by current limitations in pyo3. I think we can re-investigate it when pyo3 adds support for borrowed references to Python objects living in a struct.

@georgios-ts
Collaborator

georgios-ts commented Sep 16, 2021

Came by to say that this is actually possible. We need to wrap the graph in Py<> instead of a simple reference as in @mtreinish's code in #71 (comment), to avoid specifying a lifetime (see https://pyo3.rs/main/doc/pyo3/prelude/struct.Py.html).
I did this in georgios-ts@10e8011.
One limitation in general is that the Iter struct can't hold any Rust struct that needs a lifetime.

But I'm against using it at all, since it seems to open the door to an "unsafe" interaction between Rust and Python, and it makes it harder to reason about the output of the functions. For example, the underlying graph can mutate while we still hold an iterator over its layers; the next iterations will see that change, and even if we don't panic, we can't guarantee a meaningful output (it's up to the user not to break it).
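Concretely, the change from the earlier LayerIter sketch is essentially just the type of the graph field (simplified from the linked commit; PyDAG is retworkx's dag class and NodeIndex comes from petgraph):

use petgraph::graph::NodeIndex;
use pyo3::prelude::*;

#[pyclass]
struct LayerIter {
    // Py<T> is an owned, reference-counted handle, so the struct
    // needs no lifetime parameter. The trade-off described above:
    // Python code can still mutate the graph between __next__ calls,
    // and the iterator will observe those changes.
    dag: Py<PyDAG>,
    first_layer: Vec<NodeIndex>,
}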
