check if 3.10 successful before merge #785

Closed
wants to merge 5 commits into from


eroell
Collaborator

@eroell eroell commented Aug 13, 2024

PR Checklist

  • This comment contains a description of changes (with reason)
  • Referenced issue is linked
  • If you've fixed a bug or added code that should be tested, add tests!
  • Documentation in docs is updated

Description of changes

Technical details
A very strange error shows up occasionally in the Ubuntu 3.10 tests and in the Ubuntu 3.11 pre-release tests:

FAILED tests/preprocessing/test_highly_variable_features.py::test_highly_variable_features - ehrapy.io._read.IndexNotFoundError:
Could not create AnnData object while reading file /home/runner/work/ehrapy/ehrapy/ehrapy_data/dermatology.csv .
Does index_column named patient_id exist in /home/runner/work/ehrapy/ehrapy/ehrapy_data/dermatology.csv?
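
The error message asks whether the index column exists in the file, and that can be checked directly on the runner. Below is a minimal sketch, not part of ehrapy: the helper name `has_index_column` is hypothetical, and it assumes a comma-separated file with a header row. A `False` result on a missing, empty, or truncated file would hint at an incomplete download, but this PR does not establish that as the cause.

```python
import csv
from pathlib import Path
from typing import Union


def has_index_column(csv_path: Union[str, Path], index_column: str) -> bool:
    """Return True if the header row of the CSV at csv_path contains index_column.

    Hypothetical diagnostic helper; it is not part of ehrapy.
    """
    path = Path(csv_path)
    # A missing or empty file (for example after an interrupted download) has no header.
    if not path.is_file() or path.stat().st_size == 0:
        return False
    # errors="replace" keeps the check from raising on unexpected bytes.
    with path.open(newline="", encoding="utf-8", errors="replace") as handle:
        header = next(csv.reader(handle), [])
    return index_column in [name.strip() for name in header]
```

Calling `has_index_column(".../ehrapy_data/dermatology.csv", "patient_id")` right before the failing read would show whether the file on disk is the problem or whether the error comes from somewhere else.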

The error is related to the dermatology.csv file, which is read in here:

def dermatology(
    encoded: bool = False,
    columns_obs_only: dict[str, list[str]] | list[str] | None = None,
) -> AnnData:
    """Loads the Dermatology Data Set

    More details: http://archive.ics.uci.edu/ml/datasets/Dermatology
    Preprocessing: https://github.com/theislab/ehrapy-datasets/blob/main/dermatology/dermatology.ipynb

    Args:
        encoded: Whether to return an already encoded object
        columns_obs_only: Columns to include in obs only and not X.

    Returns:
        :class:`~anndata.AnnData` object of the Dermatology Data Set

    Examples:
        >>> import ehrapy as ep
        >>> adata = ep.dt.dermatology(encoded=True)
    """
    adata = read_csv(
        dataset_path=f"{ehrapy_settings.datasetdir}/dermatology.csv",
        download_dataset_name="dermatology.csv",
        backup_url="https://figshare.com/ndownloader/files/34179300",
        columns_obs_only=columns_obs_only,
        index_column="patient_id",
    )
    if encoded:
        infer_feature_types(adata, output=None, verbose=False)
        return encode(adata, autodetect=True)
    return adata

Quite flaky: it only pops up sometimes when the tests are repeated.
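
To put a number on "sometimes", the failing call can be repeated and the failures counted. This is a rough sketch under stated assumptions: `count_failures` is a hypothetical helper, not part of ehrapy or pytest, and it treats any raised exception as a failure.

```python
from typing import Callable


def count_failures(check: Callable[[], object], runs: int = 20) -> int:
    """Call check() `runs` times and return how many of the calls raised.

    Hypothetical helper for estimating how often a flaky call fails.
    """
    failures = 0
    for _ in range(runs):
        try:
            check()
        except Exception:
            # Any exception counts as one failed run.
            failures += 1
    return failures
```

For example, `count_failures(lambda: ep.dt.dermatology())` would report how often the loader above raises in a given environment. Whether that reproduces the CI failure is not established here, since a dataset file already present in the dataset directory may change the behavior between runs.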

Additional context

@github-actions github-actions bot added the bug Something isn't working label Aug 13, 2024
@eroell
Collaborator Author

eroell commented Aug 13, 2024

how even

@eroell
Collaborator Author

eroell commented Aug 13, 2024

Seems to work again now, everywhere. Closing for now. Hopefully no one will ever need to search for this, but if someone does, this PR with the posted error will pop up.

@eroell eroell closed this Aug 13, 2024
@eroell eroell deleted the fix/ubuntu-3.10-dermatology-dataset branch October 24, 2024 06:39