Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new untyped PyAnyArray base type of PyArray. #369

Merged
merged 2 commits into from
Mar 5, 2023
Merged

Conversation

adamreichold
Copy link
Member

No description provided.

@davidhewitt
Copy link
Member

Neat! Sorry for the slight delay, my family and I have been sick... again... 🙃

The Deref problem... I've hit this myself in PyO3 occasionally. Also there are cases where it might be nice to have e.g. PyCell<Base> deref to PyCell<Super>, or PyBool defer to PyInt. So maybe we can investigate changing PyO3? I think the biggest pain point was any generic code which used T: Deref<Target = PyAny>.

@adamreichold
Copy link
Member Author

I think the biggest pain point was any generic code which used T: Deref<Target = PyAny>.

The main issue here is that auto-deref chaining does not apply, right?

@adamreichold adamreichold changed the title Add new untype PyAnyArray base type of PyArray. Add new untyped PyAnyArray base type of PyArray. Feb 28, 2023
@davidhewitt
Copy link
Member

Yes exactly, I wondered whether we could express the chain with a separate trait somehow to work around that.

@adamreichold
Copy link
Member Author

Yes exactly, I wondered whether we could express the chain with a separate trait somehow to work around that.

Something like

trait PyAsAny {
   fn as_any(&self) -> &PyAny;
}

impl<R,S,T> PyAsAny for R where R: Deref<Target=S>, S: Deref<Target=T>, T: Deref<Target=PyAny> {
  fn as_any(&self) -> &PyAny {
    self.deref().deref().deref()
  }
}

etc., probably generated via a declarative macro?

@adamreichold
Copy link
Member Author

Since you mentioned this, I am somewhat concerned about downstream code which would have to migrate from : Deref<Target=PyAny> to : PyAsAny and I am actually not really confident whether it would result in significant breakage if this is published as-is because PyArray<T,D>: Deref<Target=PyAny> would not hold any more.

@davidhewitt
Copy link
Member

davidhewitt commented Feb 28, 2023

Yes, that concern of downstream breakage is what I was unclear on too. With PyAsAny we could make PyO3 work, but I have no idea how much we'd break. And changing the Deref implementing felt mostly like a refinement than a major improvement (e.g. could just use .extract() or .downcast_unchecked() to go from PyBool to PyInt). So I kind of stalled when trying this in the past.

@adamreichold
Copy link
Member Author

Putting my take-one-for-the-team hat on, maybe we should use rust-numpy as a test balloon and do this here to "only" break crates using rust-numpy, but not all of the PyO3 ecosystem. If we actually break anything that is. The code in rust-numpy itself was completely unaffected because auto-deref chaining was apparently sufficient.

I think the pattern fits well often enough as you describe above and we have to take a step a some time so it might be this one where the breakage would be limited.

(Concerning the PyO3-specific crater thing: I don't think it would be completely unreasonable to set this up. It could basically be a set of Python scripts booting into a single throw-away VM (we don't need to isolate the builds from each other AFAIU, just my user account from the builds) and then building the relevant crates with a patch section pointing to a given ref in the PyO3 repository. The main issue I see ATM, is that crates.io lists less than 300 reverse dependencies for PyO3 and most of them use older versions. Which makes sense as Python extensions are not distributed via crates.io. So I guess we would need to crawl PyPI for packages using PyO3?)

@kngwyu
Copy link
Member

kngwyu commented Mar 1, 2023

I'm a bit worried that this name looks somewhat similar to the array of PyAny (i.e. PyArray<PyObject>), while they are completely different.

@adamreichold
Copy link
Member Author

I'm a bit worried that this name looks somewhat similar to the array of PyAny (i.e. PyArray<PyObject>), while they are completely different.

I am personally very much not married to the name and would be very open to suggestions for alternatives.

@kngwyu
Copy link
Member

kngwyu commented Mar 1, 2023

I've thought of a few ideas (PyRawArray, PyUnresolvedArray, PyUntypedArray, ...), but so far, none of them are very good 😅
Maybe we can just emphasize the difference in the document if we can't come up with a nice idea.

@adamreichold
Copy link
Member Author

I am somewhat endeared with PyUntypedArray. I would also propose PyArrayBase. But let's see whether anybody else has an opinion before making a choice...

@davidhewitt
Copy link
Member

PyUntypedArray seems good to me. There's also the option of making this available as a different import "flavour", e.g. numpy::untyped::PyArray, a bit like how once_cell has different once_cell::sync, once_cell::unsync etc.

@davidhewitt
Copy link
Member

Putting my take-one-for-the-team hat on, maybe we should use rust-numpy as a test balloon and do this here to "only" break crates using rust-numpy, but not all of the PyO3 ecosystem. If we actually break anything that is. The code in rust-numpy itself was completely unaffected because auto-deref chaining was apparently sufficient.

I think the pattern fits well often enough as you describe above and we have to take a step a some time so it might be this one where the breakage would be limited.

Seems reasonable to give this a shot for sure!

@adamreichold
Copy link
Member Author

PyUntypedArray seems good to me. There's also the option of making this available as a different import "flavour", e.g. numpy::untyped::PyArray, a bit like how once_cell has different once_cell::sync, once_cell::unsync etc.

If two people can agree on it, it is golden and I will change the PR to use PyUntypedArray.

I don't import flavours are helpful here: There are easy to misread (same problem as PyAnyArray) and while they do serve a purpose for once_cell (I can basically change the import without changing the usage sites), this does not apply for us here. On the contrary, we most likely need to both flavours in scope to downcast to the concrete type after inspecting the untyped representation, so I suspect that it would mostly be used as use numpy::untyped::PyArray as PyUntypedArray;.

@adamreichold
Copy link
Member Author

Seems reasonable to give this a shot for sure!

I changed the name to PyUntypedArray and am planning to merge this soonish with the idea that we give breaking : Deref<Target=PyAny> bounds a try and aim to collect feedback as early as possible and that providing a replacement bound via : PyAsAny would happen in PyO3 and will most likely not require major changes here.

Or are there any other concerns besides the name and the deref problem w.r.t. the current implementation?

Copy link
Member

@kngwyu kngwyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicking comments on CHANGELOG and documents. Implementations look pretty reasonable for me.

CHANGELOG.md Outdated Show resolved Hide resolved
src/untyped_array.rs Show resolved Hide resolved
src/untyped_array.rs Show resolved Hide resolved
src/untyped_array.rs Outdated Show resolved Hide resolved
… which can be useful to guide type inference.
@adamreichold adamreichold merged commit dabc780 into main Mar 5, 2023
@adamreichold adamreichold deleted the any-array branch March 5, 2023 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants