feat: update X matrix validation to validate Visium assays #876
# Conflicts: # cellxgene_schema_cli/tests/test_schema_compliance.py
is_sparse_matrix = matrix_format in SPARSE_MATRIX_TYPES
for matrix_chunk, _, _ in self._chunk_matrix(x):
moved to _validate_raw_data
has_row_of_zeros = True

if not has_invalid_nonzero_value:
    data = matrix_chunk if isinstance(matrix_chunk, np.ndarray) else matrix_chunk.data
moved to _matrix_has_invalid_nonzero_values for re-use
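For readers without the full diff, here is a minimal sketch of what a reusable helper along these lines could look like. The function body is an assumption, not the code from this PR: it treats a value as invalid when it is negative or non-integer, and it reads stored values through `.data` for anything that is not a dense `np.ndarray`, mirroring the diff line above.

```python
import numpy as np


def matrix_has_invalid_nonzero_values(matrix_chunk):
    """Hypothetical helper: True if any value is negative or non-integer."""
    # Dense chunks are inspected directly; sparse chunks are assumed to
    # expose their stored (non-zero) values through a `.data` attribute.
    data = matrix_chunk if isinstance(matrix_chunk, np.ndarray) else matrix_chunk.data
    # A fractional part (data % 1 > 0) or a negative sign marks an invalid value.
    return bool(np.any((data % 1 > 0) | (data < 0)))
```

Pulling the check into one function lets both the raw-data path and the Visium path call it per chunk instead of duplicating the expression.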
@@ -38,6 +41,7 @@ def __init__(self, ignore_labels=False):
self.schema_def = dict()
self.schema_version: str = None
self.ignore_labels = ignore_labels
self.visium_and_is_single_true_matrix_size = VISIUM_AND_IS_SINGLE_TRUE_MATRIX_SIZE
nit: in the future you can patch VISIUM_AND_IS_SINGLE_TRUE_MATRIX_SIZE in the tests to avoid having this class variable. Just one less duplicated variable.
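A rough sketch of the patching approach suggested in the nit, using `unittest.mock.patch.object` from the standard library. The `validate` namespace and both function names are hypothetical stand-ins so the example runs on its own; in the real test suite the patch target would be whichever module defines the constant.

```python
import types
from unittest.mock import patch

# Hypothetical stand-in for the module that defines the constant.
validate = types.SimpleNamespace(VISIUM_AND_IS_SINGLE_TRUE_MATRIX_SIZE=4992)


def has_expected_row_count(n_rows):
    # The constant is looked up at call time, so a patch is visible here.
    return n_rows == validate.VISIUM_AND_IS_SINGLE_TRUE_MATRIX_SIZE


def row_count_ok_with_patched_size(n_rows, size):
    # Temporarily override the constant; it is restored when the block exits.
    with patch.object(validate, "VISIUM_AND_IS_SINGLE_TRUE_MATRIX_SIZE", size):
        return has_expected_row_count(n_rows)
```

With this pattern a test can use a two-row fixture instead of a 4992-row matrix, and the validator does not need to carry its own copy of the constant.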
Looks good, everyone. Works as defined with spatial and non-spatial datasets, with and without a normalized matrix. Thanks!
Reason for Change
Changes
obs['in_tissue'] == 0 rows CAN be all zeros, as long as there is at least ONE non-zero value among the group of in_tissue == 0 rows. This includes datasets where visium + is_single + in_tissue is 1 for all cases, because that pathway has the same value ruleset as the other assays we validate. The only difference is validating 4992 rows for visium + is_single datasets, so that check is done outside of this helper function.
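The rule above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the function names and error messages are made up, the matrix is dense for simplicity, and the requirement that in_tissue == 1 rows must not be all zeros is assumed from the "same value ruleset as the other assays" remark.

```python
import numpy as np

VISIUM_AND_IS_SINGLE_TRUE_MATRIX_SIZE = 4992  # row count named in the PR


def check_visium_rows(x, in_tissue):
    """Hypothetical check: return a list of error strings for a dense matrix."""
    x = np.asarray(x)
    in_tissue = np.asarray(in_tissue)
    errors = []
    row_is_zero = ~x.any(axis=1)  # True for rows with no non-zero value
    # Assumed rule: an in-tissue row must have at least one non-zero value.
    if (row_is_zero & (in_tissue == 1)).any():
        errors.append("in_tissue == 1 row is all zeros")
    # Off-tissue rows may individually be zero, but not all of them at once.
    off_tissue = in_tissue == 0
    if off_tissue.any() and row_is_zero[off_tissue].all():
        errors.append("all in_tissue == 0 rows are zeros")
    return errors


def has_visium_row_count(n_rows, expected=VISIUM_AND_IS_SINGLE_TRUE_MATRIX_SIZE):
    # Kept separate from the value checks, as the description states.
    return n_rows == expected
```

Keeping the row-count check in its own function matches the description: the value rules are shared with other assays, and only the size requirement is specific to visium + is_single datasets.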