Fix bug with MateDistantReadFilter not being configurable #7701
👎
public MateDistantReadFilter() {
}

public MateDistantReadFilter( final int minMappingQualityScore ) {
This is not a mapping quality score.
In all seriousness though, fix the typo and merge :)
👍
Sorry, but this bug still isn't fixed as of v4.2.6.1. Reproduce as follows:
Instead of a run-time exception (as in v4.2.5.0), HaplotypeCaller simply produces no variant calls at all. Expected behavior would be to exclude paired-end mappings whose TLEN exceeds the parameterized value. Perhaps there is an implementation bug, unrelated to the original problem, that contains faulty logic for doing this. Thanks...
Hi @RWilton, I'm sorry to hear that. I suspect it's unrelated given how simple this PR is, but it's quite possible that the filter has been broken, since evidently nobody was able to use and adjust it for a long time. This is the logic here that the filter uses: That logic is correct for what the filter is doing. It should be noted that the filter does the opposite of what you expect (since it's intended for our SV pipeline): it filters out all reads that do NOT have distant mates. This means that if you run HaplotypeCaller with this setting, you will be throwing away every read pair EXCEPT the distant ones, which results in mostly no reads. We should perhaps rename this filter to be a little less confusing.
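For readers following along, the check described in this thread can be sketched roughly as below. This is a minimal illustration based only on the paraphrase in the discussion, not the actual GATK source: `PairedRead`, `MateDistanceSketch`, `keepsRead`, and the `mateTooDistantLength` parameter name are all hypothetical, and treating mates on different contigs as failing the check is an assumption taken from the paraphrase, so the real filter may handle that case differently.

```java
// Hypothetical stand-in for an aligned read; this is NOT a GATK class.
final class PairedRead {
    final boolean paired;
    final boolean unmapped;
    final boolean mateUnmapped;
    final String contig;
    final String mateContig;
    final int start;      // leftmost aligned position of this read
    final int mateStart;  // leftmost aligned position of its mate

    PairedRead(boolean paired, boolean unmapped, boolean mateUnmapped,
               String contig, String mateContig, int start, int mateStart) {
        this.paired = paired;
        this.unmapped = unmapped;
        this.mateUnmapped = mateUnmapped;
        this.contig = contig;
        this.mateContig = mateContig;
        this.start = start;
        this.mateStart = mateStart;
    }
}

final class MateDistanceSketch {
    // Returns true when the read is KEPT, i.e. when its mate is distant.
    // Reads whose mates are nearby are the ones that get discarded.
    static boolean keepsRead(PairedRead read, int mateTooDistantLength) {
        return read.paired
                && !read.unmapped
                && !read.mateUnmapped
                // Assumption from the thread's paraphrase: same contig required.
                && read.contig.equals(read.mateContig)
                // Compares start positions, not the TLEN field.
                && Math.abs(read.start - read.mateStart) >= mateTooDistantLength;
    }
}
```

With a threshold of 1000, a pair whose starts are 300 bases apart is discarded and a pair whose starts are 4900 bases apart is kept, which is why a typical short-insert library leaves almost nothing for HaplotypeCaller to call on.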
Well, that explains that, sort of. The code snippet you're providing looks like it ought to do what you say it does (i.e., the mates have to be paired, not unmapped, mapped to the same contig, and have a difference in their start positions that is at least But there are two problems with this:
I hate to say this, but I think this parameter needs some attention. Its potential utility with HaplotypeCaller seems evident to me (i.e., it would be good to be able to exclude outliers with unreasonable TLENs) but its implementation and frugal documentation make it unusable in practice.
@RWilton I agree that this is a confusing filter and maybe should be renamed... As I have said, this is a specific utility for SVs, where they might only want to look at long-range read pairs, not for HaplotypeCaller. In that context I would say that the nuance about TLEN vs. left-aligned start position is a relatively small one. I agree with you that it would be useful to have some functionality for filtering out highly distant read pairs from our short variant calling as an option. I would have to think about whether it's better to refactor this filter to be more general or to make a second filter that is the reverse of this one. @RWilton, can you make an issue ticket and describe what you would want out of such a filter for HaplotypeCaller? I would put some thought into the case where the TLEN field has not been populated, since it often isn't (especially since the cases where TLEN might not get computed overlap with the discordant/distant mates that are part of the target for such a filter).
Since the intent and implementation of this filter are not essential to my work -- I can simply filter pairs on TLEN prior to variant calling -- perhaps it's best to just let this one be. Thanks for the discussion and for helping me think things through.
Fixes #7696