Implement DataFrameGroupBy, RollingGroupby, ExpandingGroupby #36
Comments
Hi @hermian 👋 Indeed, mapply currently only implements DataFrame and Series. I think there are a number of edge cases where

For the naive approach, you can iterate over a DataFrameGroupBy to get df_or_series, and do mapply stuff with that:

pd.concat(df.mapply() for _, df in data.groupby())

But as mentioned, I think this only works for the 'obvious' cases where groupby() produces sub dataframes or series (you might have to set

I think pandarallel has a complete implementation, also for RollingGroupby and ExpandingGroupby. But last time I checked, there is only 1 chunk per worker and no way to configure that. So if one chunk takes much longer than the other chunks, in the end you'll have all cores (but one) idle, waiting for the last chunk to finish.

In the hope they have exhaustive test cases, it might be worth a shot to port those test cases and do some test driven development here. PRs are welcome!
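For anyone landing here, the naive approach from the comment above could be sketched roughly like this. The helper name `naive_groupby_apply` and its `method` parameter are made up for illustration and are not part of mapply. To keep the snippet runnable with pandas alone it defaults to the built-in `apply`; passing `method="mapply"` assumes mapply has already been initialised so that DataFrames have a `.mapply` method.

```python
import pandas as pd


def naive_groupby_apply(data, by, func, method="apply", **kwargs):
    """Apply `func` to each sub-frame of `data.groupby(by)` and concatenate.

    `method` is a hypothetical switch: "apply" (default) uses plain pandas,
    "mapply" assumes mapply was initialised beforehand.
    """
    pieces = []
    # Iterating a GroupBy yields (key, sub_frame) pairs, so unpack the key.
    for _, group in data.groupby(by):
        pieces.append(getattr(group, method)(func, **kwargs))
    return pd.concat(pieces)


data = pd.DataFrame({"key": ["a", "a", "b"], "x": [1, 2, 3], "y": [10, 20, 30]})
# Row-wise function, evaluated separately within each group.
result = naive_groupby_apply(data, "key", lambda row: row["x"] + row["y"], axis=1)
```

As noted in the comment, this only covers the cases where each group is an ordinary sub-frame; it does not attempt RollingGroupby or ExpandingGroupby.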
About this, while using DataFrameGroupBy.mapply() I got:
I am not adding an example because any DataFrameGroupBy.mapply() call emits those warnings.
Hey 👋 I've seen them too. It's a warning for pandas v3, so it'll be part of #68 👍 For now I can suppress the warnings since mapply pins
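Until that lands, a user-side workaround might look like the sketch below. It assumes the warnings are of category `FutureWarning`, which this thread does not confirm; `call_quietly` and `noisy` are hypothetical names, and `noisy` only stands in for the grouped call that triggers the warning.

```python
import warnings


def call_quietly(func, *args, category=FutureWarning, **kwargs):
    """Call `func`, ignoring warnings of `category` for the duration of the call.

    The FutureWarning default is an assumption about the warning type.
    """
    with warnings.catch_warnings():
        # The filter is restored automatically when the block exits.
        warnings.simplefilter("ignore", category=category)
        return func(*args, **kwargs)


def noisy():
    # Hypothetical stand-in for a call that emits a deprecation-style warning.
    warnings.warn("behaviour will change", FutureWarning)
    return 42


value = call_quietly(noisy)
```

Scoping the filter to a single call keeps other warnings visible elsewhere in the program.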
What you were trying to do (and why)
The problem occurs when apply is called on the grouped object returned by calling groupby on a data frame.
What happened (including reproducible example)
Reproducible example