Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed an edge case in HaplotypeCaller where filtered alleles in the vicinity of forced-calling alleles could result in empty calls #7740

Merged
merged 3 commits into from
Mar 30, 2022

Conversation

jamesemery
Copy link
Collaborator

@jamesemery jamesemery commented Mar 28, 2022

Fixes #7741

@jamesemery
Copy link
Collaborator Author

@davidbenjamin you are on the git blame here but you might not be the one to review. A user complained about empty calls in the vicinity of force calling alleles like this:chr5 125895106 . G . 13.34 . AN=2;DP=31;MQ=59.46 GT:AD:DP 0/0:31:31 https://gatk.broadinstitute.org/hc/en-us/community/posts/4664460343195-Haplotypecaller-4-5-2-0-alleles-unexpected-output. It turns out what was happening is that on the same assembly window as a genuine force-calling allele if there are variants that get genotyped as het variants but fall below our filtering threshold for allele calling there was a bug where we would not kick out of constructing the output VC for that site and consequently we will end up calling a empty hom ref call even if we don't intend to.

@davidbenjamin
Copy link
Contributor

@jamesemery I will gladly review. If I understand the code change it seems like there was already basically the right logic to avoid this but the code was neglecting to put the force calling alleles in a representation consistent with the output VCF. And the fix is simply to compute forcedAlleles = AssemblyBasedCallerUtils.getAllelesConsistentWithGivenAlleles(givenAlleles, vc) a bit upstream of where we were doing so previously. If I've got that right, this PR gets my 👍.

Copy link
Contributor

@davidbenjamin davidbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is entirely sensible.

@jamesemery
Copy link
Collaborator Author

jamesemery commented Mar 29, 2022

@davidbenjamin looks like i have to wait for the test to pass because travis is having problems. What was happening is that the list of alleles didn't necessarily overlap with the VC being evaluated here (the AssemblyBasedCallerUtils.getAllelesConsistentWithGivenAlleles(givenAlleles, vc) was among other things responsible for filtering out alleles that don't overlap the site at all. This was responsible for the users issue because there were alleles from adjacent --alleles file variants 20 bases downstream that caused that condition to fail.

Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a test that demonstrates the issue

@jamesemery
Copy link
Collaborator Author

@droazen I added a test with some documentation about the issue and the no-calls.

@droazen droazen dismissed their stale review March 29, 2022 20:14

Tests added as requested

@jamesemery
Copy link
Collaborator Author

@davidbenjamin Can you look at the test? I didn't want to check in a file with the old erroneous behavior so its hard to demonstrate what this fixed but i tried to make it clear in the comments.

@gatk-bot
Copy link

Travis reported job failures from build 38317
Failures in the following jobs:

Test Type JDK Job ID Logs
variantcalling openjdk8 38317.4 logs

@jamesemery jamesemery force-pushed the je_fixNoCallsInVicinityOfFocedAllelesInHC branch from 4392dce to 4065504 Compare March 30, 2022 14:34
@davidbenjamin
Copy link
Contributor

@jamesemery The test looks good. I am averse to exact match tests and would prefer to just strea the VCF output and check for empty calls. Nonetheless, I understand that there's a tradition of exact match tests in some of our tools. I leave it up to you.

@jamesemery jamesemery merged commit 527e5d7 into master Mar 30, 2022
@jamesemery jamesemery deleted the je_fixNoCallsInVicinityOfFocedAllelesInHC branch March 30, 2022 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Haplotypecaller 4.5.2.0 --alleles unexpected output
4 participants