Support pandas 3.0 #68

Open
dxdc opened this issue Feb 22, 2024 · 5 comments · Fixed by #70
Labels: bug, help wanted

Comments

@dxdc

dxdc commented Feb 22, 2024

Calling array_split from NumPy on a pandas object now triggers deprecation warnings.

dfs = array_split(df_or_series, n_chunks, axis=opposite_axis)

With the latest numpy and pandas installed, it emits warnings such as:

'Series.swapaxes' is deprecated and will be removed in a future version:FutureWarning
'DataFrame.swapaxes' is deprecated and will be removed in a future version:FutureWarning

Some more details here: numpy/numpy#23217 and numpy/numpy#24889, in particular this comment: numpy/numpy#24889 (comment) for a proposed resolution.

The explanation here is that np.array_split somewhat magically works on a pandas DataFrame because the implementation of that function under the hood only uses features that happen to work the same on an array or a DataFrame. One of those, however, is the np.swapaxes function, which, when called on a DataFrame, calls that DataFrame's swapaxes method. That method is deprecated (for a DataFrame, which is 2D, it does nothing different from transpose), so once it is removed from pandas (probably in pandas 3.0), calling np.array_split on a DataFrame will also stop working.

It's unclear if pandas (or numpy) may address this at some point?
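Independent of whether pandas or numpy addresses this, the split itself can be reproduced without going through swapaxes. The sketch below is not mapply's fix; it is a hypothetical helper (chunk_bounds is an illustrative name) that computes the same chunk sizes np.array_split uses, where the first len % n_chunks chunks get one extra element, and leaves the actual slicing to the caller.

```python
def chunk_bounds(n_items, n_chunks):
    """Return (start, stop) index pairs covering range(n_items).

    Mirrors the sizing rule of np.array_split: the first
    n_items % n_chunks chunks are one element longer than the rest.
    """
    if n_chunks <= 0:
        raise ValueError("n_chunks must be a positive integer")
    size, extras = divmod(n_items, n_chunks)
    bounds = []
    start = 0
    for i in range(n_chunks):
        stop = start + size + (1 if i < extras else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


# Example with a plain list; no pandas or numpy involved.
data = list(range(10))
chunks = [data[a:b] for a, b in chunk_bounds(len(data), 3)]
```

With pandas, the same bounds could be applied positionally, for example df.iloc[a:b] to split rows or df.iloc[:, a:b] to split columns, neither of which calls swapaxes.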

@dxdc dxdc added the bug Something isn't working label Feb 22, 2024
@ddelange
Owner

many thanks for the report!

@ddelange
Owner

ddelange commented Mar 3, 2024

I did some digging: numpy/numpy#24889 (comment)

I will keep this issue open until mapply + pandas v3 compatibility is confirmed, but I think no further action is required here.

@dxdc
Author

dxdc commented Mar 3, 2024

nice research @ddelange! is there a way to hide the warnings in the short term? e.g., something like this:

import warnings

# Silence only the swapaxes FutureWarnings. The pattern is a regex matched
# from the start of the message, which quotes the method name.
warnings.filterwarnings("ignore", message=r".*'Series\.swapaxes' is deprecated", category=FutureWarning)
warnings.filterwarnings("ignore", message=r".*'DataFrame\.swapaxes' is deprecated", category=FutureWarning)
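A process-wide filter also hides the warning for unrelated code. As a narrower alternative, the suppression can be scoped to a block with the standard library's warnings.catch_warnings. This is only a sketch: ignore_swapaxes_warning is a hypothetical helper name, and the pattern assumes the message text quoted earlier in this thread.

```python
import warnings
from contextlib import contextmanager

# Regex matched from the start of the warning message (case-insensitive).
# Assumes the wording "'<Type>.swapaxes' is deprecated ..." shown above.
SWAPAXES_PATTERN = r".*swapaxes' is deprecated"


@contextmanager
def ignore_swapaxes_warning():
    """Temporarily ignore the swapaxes FutureWarning inside a with-block."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message=SWAPAXES_PATTERN, category=FutureWarning
        )
        yield
```

Code that triggers the warning would then be wrapped in `with ignore_swapaxes_warning(): ...`; the previous filters are restored when the block exits. This only hides the message and does nothing for compatibility once swapaxes is actually removed.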


github-actions bot commented Mar 4, 2024

Released 0.1.25

@ddelange
Owner

ddelange commented Mar 6, 2024

quick update: tried make test with pandas 3.0

pip install -U --pre --extra-index-url https://pypi.anaconda.org/scientific-python-nightly-wheels/simple pandas

and all tests (series, dataframe, groupby) start failing for various reasons. see for instance numpy/numpy#24889 (comment)

@ddelange ddelange changed the title from "array_split deprecated" to "Support pandas 3.0" Mar 9, 2024
@ddelange ddelange reopened this Mar 9, 2024
@ddelange ddelange added the help wanted Extra attention is needed label Mar 18, 2024