Grouped MEI's with insertions in splitvariants.py #687
inputs/values/dockers.json
Outdated
Please revert these changes - the dockers will get built and updated automatically when this is merged!
these were reverted
Thank you for catching this bug @kirtanav98. The genotyping from your tests looks good. Here is a summary of the fraction of variants in HWE before -> after the changes, by ALT allele:
<INS>: 0.50 -> 0.50
<INS:ME:ALU>: 0.52 -> 0.71
<INS:ME:SVA>: 0.56 -> 0.76
<INS:ME:LINE1>: 0.59 -> 0.75
Just one small request.
inputs/values/dockers.json
Outdated
You should revert this trivial change as well. You can be sure by using git checkout main inputs/values/dockers.json
This should be fixed. Thank you!
d009188 to 7a4d796
7a4d796 to 6cd0a63
This addresses issue 649. svtk vcf2bed uses the ALT field to produce the svtype column in the output BED file. This means that the svtype column includes BND alt alleles and values like INS:ME for MEIs. However, the current and previous SplitVariants tasks in GenotypeBatch matched exactly on the string "INS" when creating insertion-specific BED files, so the MEIs were grouped with the BCAs instead. With this change, the MEIs are grouped with the insertions rather than with the BCAs when the insertion-specific BED files are created. This allows further evaluation of the impact of this grouping on genotyping. The change has been successfully validated with womtool and cromshell using the 1kgp reference panel inputs. The results of the previous script and docker and the results of the updated script and docker can be found here
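As a rough illustration of the matching difference described above, the sketch below contrasts an exact match on "INS" with a match that also accepts "INS:"-prefixed subtypes. It is not the code in splitvariants.py: the function names and the sample svtype values are hypothetical and chosen only to show the behavior.

```python
def is_insertion_exact(svtype: str) -> bool:
    """Exact match: only the literal string "INS" counts as an insertion."""
    return svtype == "INS"


def is_insertion_grouped(svtype: str) -> bool:
    """Grouped match: "INS" plus any subtype written as "INS:<...>"."""
    return svtype == "INS" or svtype.startswith("INS:")


# Illustrative svtype values (assumed, not taken from a real BED file).
svtypes = ["DEL", "INS", "INS:ME", "INS:ME:ALU", "INV"]

# With exact matching, MEI subtypes fall out of the insertion set.
old_ins = [s for s in svtypes if is_insertion_exact(s)]

# With grouped matching, MEI subtypes stay with the plain insertions.
new_ins = [s for s in svtypes if is_insertion_grouped(s)]

print("exact:  ", old_ins)
print("grouped:", new_ins)
```

Checking for the "INS:" prefix rather than any string that merely starts with "INS" keeps the match from picking up unrelated values that happen to share those three letters.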