Feature/rank features group #546
Note: function parameters are not validated as extensively as in |
You're a legend, @VladimirShitov!
- Left lots of minor comments
- A file named `_utils` is usually a slight code smell because ideally every piece of code should have a clear purpose. I would actually move ALL of the feature_ranks code, including this, into a `_feature_ranks_groups.py`. What do you think? IMO we have lots of customization now and this would make sense.
Thank you so much.
Thank you for your comments, Lukas! Please, check the discussion on renaming |
Signed-off-by: zethson <[email protected]>
PR Checklist
- `docs` is updated

Description of changes
Previously, `ep.tl.rank_features_groups` ran the standard statistical test (e.g. Wilcoxon rank sum test) even for non-numerical features. This PR adds functionality to run statistical tests specifically developed for categorical features (e.g. Chi-square test).

Technical details
The same comparison logic as in `scanpy.tl.rank_genes_groups` is used. E.g., when the reference is set to "rest", for each subgroup of `groupby`, the composition of a categorical variable is compared to the composition in all other groups pooled together. This is not a common approach, I would say, but it is consistent with `scanpy`, which is used for numerical features.
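
To make the group-vs-rest comparison concrete, here is a minimal sketch using plain pandas and SciPy. It is not the code from this PR: the helper name `chi2_group_vs_rest`, the toy data, and the column names `cohort` and `smoker` are all made up for illustration. It only shows the idea of comparing the composition of a categorical variable in one group against all other groups pooled together.

```python
import pandas as pd
from scipy.stats import chi2_contingency


def chi2_group_vs_rest(df, groupby, feature):
    """Hypothetical helper (not part of ehrapy): chi-square test of the
    composition of `feature` in each group of `groupby` vs. the pooled rest."""
    results = {}
    for group in df[groupby].unique():
        # Label every observation as belonging to the group or to "rest"
        membership = (df[groupby] == group).map({True: group, False: "rest"})
        # Contingency table: rows are group/rest, columns are feature categories
        table = pd.crosstab(membership, df[feature])
        chi2, pvalue, dof, _expected = chi2_contingency(table)
        results[group] = {"chi2": chi2, "pvalue": pvalue, "dof": dof}
    return pd.DataFrame.from_dict(results, orient="index")


# Toy data: three cohorts with different proportions of smokers
toy = pd.DataFrame(
    {
        "cohort": ["a"] * 40 + ["b"] * 40 + ["c"] * 40,
        "smoker": ["yes"] * 30 + ["no"] * 10
        + ["yes"] * 10 + ["no"] * 30
        + ["yes"] * 20 + ["no"] * 20,
    }
)

result = chi2_group_vs_rest(toy, groupby="cohort", feature="smoker")
print(result)
```

In this toy example, cohort `c` has the same smoker proportion as the other two cohorts pooled together, so the test finds no difference for it, even though `a` and `b` differ strongly from each other. That is the caveat of the "rest" reference mentioned above.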