Fix too strict assertion in shuffle code for pandas subclasses #8667
Unit Test Results
See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
29 files ±0, 29 suites ±0, 11h 20m 2s ⏱️ +1h 29m 32s
For more details on these failures, see this check.
Results for commit 5032d66. ± Comparison against base commit cbc21df.
This pull request removes 13 and adds 8 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results.
Thanks, @jorisvandenbossche. This change looks reasonable to me, but mypy
isn't happy anymore. Could you try to fix this?
Ah, the assert was there just to satisfy the type checker .. I added a |
> Ah, the assert was there just to satisfy the type checker ..

I don't know if that was its sole purpose, but it looks like that was at least a helpful side-effect!
`type: ignore` works for me here. Thanks for adjusting!
I am testing the p2p shuffle in dask-geopandas (geopandas/dask-geopandas#295), and this fix was needed to get it running.
Pandas does not require that the `_constructor_sliced` attribute of a subclass is a class (type) itself; it only needs to be a callable that constructs the subclassed Series object. That is also how it is used here: it is only called on the line below, as `worker_for = constructor(worker_for)`. So this assertion is wrong (and not necessary anyway IMO: if this attribute did something wrong, that would be a problem with the subclass in general).
I haven't added a test for now. It would require adding boilerplate for a dummy pandas / dask collection subclass (given that you don't want to depend on an external package like dask-geopandas, I think), and that seems a bit much for this simple change.
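To make the point about `_constructor_sliced` concrete, here is a minimal sketch (not the dask shuffle code itself) of a pandas subclass whose `_constructor_sliced` is a plain function rather than a class. The names `TaggedSeries`, `TaggedFrame`, and `_tagged_series_constructor` are hypothetical and exist only for illustration. An `isinstance(constructor, type)` check would reject this subclass, even though calling the constructor works fine.

```python
import pandas as pd


class TaggedSeries(pd.Series):
    """Hypothetical Series subclass, used only for this illustration."""

    @property
    def _constructor(self):
        return TaggedSeries


def _tagged_series_constructor(data=None, index=None, **kwargs):
    # A plain function, not a class: pandas only needs a callable here.
    return TaggedSeries(data, index=index, **kwargs)


class TaggedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass, used only for this illustration."""

    @property
    def _constructor(self):
        return TaggedFrame

    @property
    def _constructor_sliced(self):
        return _tagged_series_constructor


frame = TaggedFrame({"a": [1, 2, 3]})
constructor = frame._constructor_sliced

# A check like `assert isinstance(constructor, type)` would fail here,
# because the attribute is a function ...
is_class = isinstance(constructor, type)

# ... but using it purely as a callable, the way the PR description
# says the shuffle code does, works.
result = constructor(pd.Series([0, 1, 2]))
```

Here `is_class` is `False` while `result` is still a `TaggedSeries`, which is the situation the PR description reports for dask-geopandas.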