Feature Request: reading variants as sparse in pgenlibr #291
Comments
Okay, I plan to provide this functionality on GitHub later this week, though the next CRAN release will wait until mid-year.
HasSparseHardcalls(pgen, variant_num) now returns whether variant_num has a sparse representation. If it does, ReadSparseHardcalls(pgen, variant_num) returns an object where "sample_nums" has the sample indexes, and "counts" has the counts.
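For anyone consuming this kind of output from C++, here is a rough sketch of how a (sample_nums, counts) pair could be expanded back into a dense per-sample vector. It does not call pgenlibr at all. The `SparseHardcalls` struct and the `ExpandToDense` and `TotalAlleleCount` helpers are hypothetical names made up for illustration, and the sketch assumes that sample numbers are 1-based and that every sample not listed has a count of 0. Neither assumption is confirmed in this thread, so please check both against the actual return values.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical container mirroring the two fields described above.
// This is NOT a pgenlibr type; the name and integer widths are assumptions.
struct SparseHardcalls {
  std::vector<std::uint32_t> sample_nums;  // assumed 1-based sample numbers
  std::vector<std::int32_t> counts;        // allele count for each listed sample
};

// Expand a sparse record into a dense vector of length sample_ct.
// Assumption: samples that are not listed have a count of 0.
std::vector<std::int32_t> ExpandToDense(const SparseHardcalls& sparse,
                                        std::size_t sample_ct) {
  if (sparse.sample_nums.size() != sparse.counts.size()) {
    throw std::invalid_argument("sample_nums and counts must have equal length");
  }
  std::vector<std::int32_t> dense(sample_ct, 0);
  for (std::size_t i = 0; i < sparse.sample_nums.size(); ++i) {
    const std::uint32_t sample_num = sparse.sample_nums[i];
    if (sample_num == 0 || sample_num > sample_ct) {
      throw std::out_of_range("sample number outside 1..sample_ct");
    }
    dense[sample_num - 1] = sparse.counts[i];  // convert 1-based to 0-based
  }
  return dense;
}

// Sum the allele counts by visiting only the listed samples,
// which is the main benefit of the sparse form for rare variants.
std::int64_t TotalAlleleCount(const SparseHardcalls& sparse) {
  std::int64_t total = 0;
  for (const std::int32_t count : sparse.counts) {
    total += count;
  }
  return total;
}
```

With six samples, a record holding sample_nums {2, 5} and counts {1, 2} would expand to a vector that is zero everywhere except positions 2 and 5. In practice, the point of the sparse form is to skip the dense expansion and work on the listed samples directly, as `TotalAlleleCount` does.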
Thanks, that sounds great, looking forward to testing it!
Small follow-up: is
Hi,
We rely on the C++ pgenlibr to read PGEN format genotype data in REGENIE. The functions currently available (RPgenReader::ReadIntHardcalls/RPgenReader::Read) read in the genotype data for all samples. For rare variants, the format documentation suggests that PGEN stores the data sparsely (i.e. only the indices and genotypes of carriers are stored). Could this functionality be exposed in the C++ pgenlibr, i.e. a flag identifying whether a variant is stored sparsely (given its index), as well as a function that returns the indices and genotypes (or dosages) of the carriers only?
Thanks,
Joelle