Fix for NPE when GDB has too many alleles #7738
Changes from 3 commits
3fee9d9
c5a51ee
049b315
e1427ff
4cc67d9
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -124,11 +124,14 @@ public AFCalculationResult calculate(final VariantContext vc) { | |
* Compute the probability of the alleles segregating given the genotype likelihoods of the samples in vc | ||
* | ||
* @param vc the VariantContext holding the alleles and sample information. The VariantContext | ||
- * must have at least 1 alternative allele | ||
+ * must have at least 1 alternative allele. Hom-ref genotype likelihoods can be approximated | ||
+ * but | ||
Review comment: but .... ?
||
* @return result (for programming convenience) | ||
*/ | ||
public AFCalculationResult calculate(final VariantContext vc, final int defaultPloidy) { | ||
Utils.nonNull(vc, "VariantContext cannot be null"); | ||
+ Utils.validate(vc.getGenotypes().stream().anyMatch(Genotype::hasLikelihoods), | ||
+ "VariantContext must contain at least one genotype with likelihoods"); | ||
Review comment: In these error messages, it might be good to include a list of ways that you could end up with genotypes with no likelihoods, to help users debug.
||
final int numAlleles = vc.getNAlleles(); | ||
final List<Allele> alleles = vc.getAlleles(); | ||
Utils.validateArg( numAlleles > 1, () -> "VariantContext has only a single reference allele, but getLog10PNonRef requires at least one at all " + vc); | ||
|
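For readers following the review thread, here is a minimal sketch of the fail-fast check added in this hunk, with an error message that names one way the condition can arise, as the reviewer suggested. It is not the GATK code: `SimpleGenotype`, `requireSomeLikelihoods`, and the message wording are hypothetical stand-ins so the example compiles without htsjdk or GATK on the classpath. The single cause listed comes from the regression test in this PR (PLs missing because the site had more alts than GDB allows); other causes may exist.

```java
import java.util.List;

/** Illustrative only: stand-in types, not the htsjdk/GATK classes. */
final class LikelihoodGuardSketch {

    /** Hypothetical minimal genotype; pls is null when the PL field is missing. */
    static final class SimpleGenotype {
        final String sample;
        final int[] pls;

        SimpleGenotype(final String sample, final int[] pls) {
            this.sample = sample;
            this.pls = pls;
        }

        boolean hasLikelihoods() {
            return pls != null && pls.length > 0;
        }
    }

    /** Rejects the input before any expensive calculation when no genotype carries likelihoods. */
    static void requireSomeLikelihoods(final List<SimpleGenotype> genotypes) {
        final boolean anyLikelihoods = genotypes.stream().anyMatch(SimpleGenotype::hasLikelihoods);
        if (!anyLikelihoods) {
            // Assumed wording; the PR's actual message is shorter.
            throw new IllegalStateException(
                "VariantContext must contain at least one genotype with likelihoods. "
                + "Possible cause: PLs were dropped because the site had more alternate "
                + "alleles than the GenomicsDB limit.");
        }
    }

    public static void main(final String[] args) {
        // A genotype with PLs passes the guard silently.
        requireSomeLikelihoods(List.of(new SimpleGenotype("s1", new int[] {0, 10, 20})));
        try {
            requireSomeLikelihoods(List.of(new SimpleGenotype("s2", null)));
        } catch (final IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Checking up front keeps the method from spending time on data that cannot yield a meaningful result, which is the behavior the author describes in the discussion.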
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -345,6 +345,23 @@ public void testGDBMaxAltsEqualsGGVCFsMaxAlts() { | |
File output = runGenotypeGVCFS(genomicsDBUri, null, args, b37_reference_20_21); | ||
} | ||
|
||
@Test | ||
public void testGenotypingOnVCWithMissingPLs() { | ||
//this regression test input: | ||
// 1) is missing PLs because it had more than the allowed number of alts for GDB | ||
// 2) has enough GQ0 samples to achieve a QUAL high enough to pass the initial threshold | ||
final String input = toolsTestDir + "/walkers/GenotypeGVCFs/test.tooManyAltsNoPLs.g.vcf"; | ||
final List<String> args = new ArrayList<String>(); | ||
args.add("-G"); | ||
args.add("StandardAnnotation"); | ||
args.add("-G"); | ||
args.add("AS_StandardAnnotation"); | ||
Review comment: Consider using
||
final File output = runGenotypeGVCFS(input, null, args, hg38Reference); | ||
final Pair<VCFHeader, List<VariantContext>> outputData = VariantContextTestUtils.readEntireVCFIntoMemory(output.getAbsolutePath()); | ||
//only variant is a big variant that shouldn't get output because it is missing PLs, but we should skip it gracefully | ||
Assert.assertEquals(outputData.getRight().size(), 0); | ||
} | ||
|
||
private void runAndCheckGenomicsDBOutput(final ArgumentsBuilder args, final File expected, final File output) { | ||
Utils.resetRandomGenerator(); | ||
runCommandLine(args); | ||
|
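The regression test above expects a site with no PLs to be skipped rather than to crash the tool. One way a caller could get that behavior is to test for likelihoods before handing a site to the calculator. The sketch below illustrates that idea under stated assumptions: `Site`, `hasAnyLikelihoods`, and `genotypeSites` are hypothetical names, and this page does not show how GenotypeGVCFs actually performs the skip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a caller that skips sites lacking likelihoods. */
final class SkipMissingPLsSketch {

    /** Hypothetical site record: one PL array per sample, null when PLs are missing. */
    static final class Site {
        final String id;
        final List<int[]> samplePLs;

        Site(final String id, final List<int[]> samplePLs) {
            this.id = id;
            this.samplePLs = samplePLs;
        }
    }

    static boolean hasAnyLikelihoods(final Site site) {
        return site.samplePLs.stream().anyMatch(pl -> pl != null && pl.length > 0);
    }

    /** Returns the ids of sites that would be passed on to genotyping. */
    static List<String> genotypeSites(final List<Site> sites) {
        final List<String> emitted = new ArrayList<>();
        for (final Site site : sites) {
            if (!hasAnyLikelihoods(site)) {
                continue; // nothing to compute on: skip gracefully instead of throwing
            }
            emitted.add(site.id); // a real caller would run the AF calculation here
        }
        return emitted;
    }

    public static void main(final String[] args) {
        final List<Site> sites = List.of(
            new Site("tooManyAlts", Arrays.<int[]>asList(null, null)),
            new Site("normal", Arrays.<int[]>asList(new int[] {0, 30, 60}, null)));
        System.out.println(genotypeSites(sites));
    }
}
```

With a guard like this in the caller, the validation inside `calculate` acts as a developer-facing safety net and not something end users would normally hit.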
Review comment: Are users likely to encounter this error in practice? If yes, maybe add a bit more contextual info to make it less cryptic (such as the fact that we're doing allele subsetting and calculating the most likely alleles).
Reply: No, this would be if a developer called this method under the wrong circumstances. In the buggy behavior we spent a loooooong time calculating on all-zero data, which is weird and annoying, and this would definitely put a stop to it.