-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update SV split-read strand validation and clustering #8378
Conversation
…ow INV null strands Improve test coverage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor possible fix and a few questions.
@@ -386,8 +386,8 @@ public static SVCallRecord create(final VariantContext variant, boolean keepVari | |||
} else { | |||
contigB = contigA; | |||
// Force reset of END coordinate | |||
if (type.equals(StructuralVariantType.INS)) { | |||
positionB = positionA + 1; | |||
if (type == GATKSVVCFConstants.StructuralVariantAnnotationType.INS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the .equals
for string matching -- any reason you changed it?
Also why the END coordinate change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually causing a bug where .equals(StructuralVariantType.INS)
is always false because the type is different. Somehow I missed this when switching to use GATKSVVCFConstants.StructuralVariantAnnotationType
in an earlier PR. Using ==
is safer since it won't compile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change to END is a correction. In the spec, POS=END for insertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem a little confusing to have ==
here and .equals
in other parts of the code like on lines 337 and 349 above. Is there a reason for that? Should we change it everywhere to be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cwhelan Agreed we should be consistent. I think ==
is a slightly better choice for the type checking so I've changed all the comparisons I could find to that.
@@ -27,8 +27,6 @@ public class SVCallRecordUtilsUnitTest { | |||
private static final Map<String, Object> TEST_ATTRIBUTES = Collections.singletonMap("TEST_KEY", "TEST_VAL"); | |||
private static final Map<String, Object> TEST_ATTRIBUTES_CPX = Lists.newArrayList( | |||
new AbstractMap.SimpleImmutableEntry<String, Object>("TEST_KEY", "TEST_VAL"), | |||
new AbstractMap.SimpleImmutableEntry<String, Object>(GATKSVVCFConstants.END2_ATTRIBUTE, 2000), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the complex variant format changing too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this was just a redundant entry in that map. The helper function in SVUtils sets this already and I wanted to test cases where END2 wasn't present.
@@ -457,54 +474,99 @@ public Object[][] testCreateData() { | |||
ALLELES_DEL, Lists.newArrayList(GENOTYPE_DEL_1, GENOTYPE_DEL_2), 1000, "+-", | |||
GATKSVVCFConstants.StructuralVariantAnnotationType.DEL, Collections.singletonList(GATKSVVCFConstants.DEPTH_ALGORITHM), | |||
null, null, | |||
TEST_ATTRIBUTES), | |||
TEST_ATTRIBUTES, -10.), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is QUAL = 100?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we actually have our variant qualities scaled 0-999 still. In the future we will probably scale them down (we have already done so with GQs). I'm actually going to up this to 90.
to ensure there isn't a problem with that.
Recover test files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine, just a couple of small comments.
@@ -212,10 +212,6 @@ private static Pair<Boolean, Boolean> inferStrands(final GATKSVVCFConstants.Stru | |||
} | |||
return Pair.of(Boolean.FALSE, Boolean.TRUE); | |||
} else { | |||
if (type.equals(GATKSVVCFConstants.StructuralVariantAnnotationType.INV)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The javadoc comment for the method seems to be out of date now, can you update it?
@@ -386,8 +386,8 @@ public static SVCallRecord create(final VariantContext variant, boolean keepVari | |||
} else { | |||
contigB = contigA; | |||
// Force reset of END coordinate | |||
if (type.equals(StructuralVariantType.INS)) { | |||
positionB = positionA + 1; | |||
if (type == GATKSVVCFConstants.StructuralVariantAnnotationType.INS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem a little confusing to have ==
here and .equals
in other parts of the code like on lines 337 and 349 above. Is there a reason for that? Should we change it everywhere to be consistent?
Adds some flexibility to the allowed split-read strand annotations on SV records:
The changes affecting INS variants are commensurate with changing going into gatk-sv (see broadinstitute/gatk-sv#553).
Added tests for the new functionality and improved some unit test coverage for a few related cases.