fix: determine species from gene file, not from gene prefix #1253

Merged
joyceyan merged 2 commits into main from joyce/determine-species
Feb 7, 2025

Conversation

joyceyan
Contributor

@joyceyan joyceyan commented Feb 7, 2025

Reason for Change

It seems the fix for RR-prefixed fruit fly genes was a patchwork solution to a slightly bigger problem: inferring the organism from the gene ID prefix is not entirely correct. See this issue, where a gorilla gene prefixed with ENSGGOG gets incorrectly mapped to Homo sapiens.

Changes

  • updates implementation of gencode.get_organism_from_feature_id to go through each individual gene checker, rather than rely on gene prefix
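
A rough sketch of the idea, not the code in this PR: `Organism`, `GENE_IDS_BY_ORGANISM`, `organism_from_prefix`, and `organism_from_gene_files` are hypothetical names, and the gene IDs are made up. It only illustrates why a prefix match can misclassify an ENSGGOG ID (it also starts with ENSG) while a membership check against each organism's gene list does not.

```python
from enum import Enum
from typing import Dict, Optional, Set


class Organism(Enum):
    # Hypothetical subset of organisms, for illustration only.
    HUMAN = "human"
    GORILLA = "gorilla"
    FRUIT_FLY = "fruit_fly"


# Stand-in for the per-organism gene files: each organism maps to the set of
# gene IDs listed in its file. All IDs below are invented for this example.
GENE_IDS_BY_ORGANISM: Dict[Organism, Set[str]] = {
    Organism.HUMAN: {"ENSG00000000001"},
    Organism.GORILLA: {"ENSGGOG00000000001"},
    Organism.FRUIT_FLY: {"FBgn0000001", "RR00001"},
}


def organism_from_prefix(feature_id: str) -> Optional[Organism]:
    """Prefix-based guess (the kind of approach being replaced)."""
    if feature_id.startswith("ENSG"):
        # Also matches "ENSGGOG...", so gorilla genes are reported as human.
        return Organism.HUMAN
    if feature_id.startswith("FBgn"):
        return Organism.FRUIT_FLY
    return None


def organism_from_gene_files(feature_id: str) -> Optional[Organism]:
    """Ask each organism's gene list whether it contains the ID."""
    for organism, gene_ids in GENE_IDS_BY_ORGANISM.items():
        if feature_id in gene_ids:
            return organism
    return None


gorilla_gene = "ENSGGOG00000000001"
print(organism_from_prefix(gorilla_gene))      # prefix guess picks HUMAN
print(organism_from_gene_files(gorilla_gene))  # membership check picks GORILLA
```

In this sketch the RR-prefixed ID resolves without a dedicated prefix rule, because the lookup depends only on which gene list contains the ID.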

Testing

  • updates a test in test_gencode to assert that we're mapping the organism from feature id correctly

Notes for Reviewer

@joyceyan joyceyan requested a review from Bento007 February 7, 2025 20:57

codecov bot commented Feb 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.88%. Comparing base (3107d81) to head (ca6f1c3).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1253      +/-   ##
==========================================
+ Coverage   89.64%   89.88%   +0.23%     
==========================================
  Files          19       19              
  Lines        2222     2194      -28     
==========================================
- Hits         1992     1972      -20     
+ Misses        230      222       -8     
Components Coverage Δ
cellxgene_schema_cli 90.83% <100.00%> (+0.34%) ⬆️
migration_assistant 91.26% <ø> (ø)
schema_bump_dry_run_genes 79.74% <ø> (ø)
schema_bump_dry_run_ontologies 99.53% <ø> (ø)

Contributor

@Bento007 Bento007 left a comment

that's a nice fix.

@joyceyan joyceyan merged commit b7e96bf into main Feb 7, 2025
14 checks passed
@joyceyan joyceyan deleted the joyce/determine-species branch February 7, 2025 22:24