I understand that salmon does not use read quality scores when performing alignment/quantitation but, as it stands, running with --writeMappings it also doesn't output the qualities in its SAM-format output.
That may not impact most downstream applications of the data but it does limit some. I'd be particularly interested in using reads that have been aligned and assigned to transcripts with an EM algorithm to investigate RNA editing. The most advanced methods of doing this integrate the quality of base calls from the sequencer into the models and, as such, cannot currently run on the data output.
Skipping the qualities may save a few CPU cycles, and I can understand that choice when running with default settings. When running with --writeMappings, though, that nominal difference in speed needs to be weighed against downstream utility. Could this be looked into?
Cheers! George
This will require some upstream changes to the SAM writing code, but I don't think it should be too hard. We could add this to the roadmap. I'll tag this as an enhancement.
Thanks for that. It would be so handy to have this! As it stands, I've written a Python script to intercept the collated output and add back in the qualities from the FASTQs but it's not exactly a speedy process ...
Thanks for your help and for all the work with Salmon!
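For anyone needing a similar stopgap, the workaround described above could look roughly like the sketch below. This is not the script mentioned in the comment, just a minimal illustration under some assumptions: reads are single-end, the first whitespace-delimited token of each FASTQ header matches the SAM QNAME, and missing qualities appear as `*` in the QUAL column (the SAM convention for absent qualities). The function names `load_qualities` and `restore_qualities` are hypothetical.

```python
from typing import Dict, Iterable, Iterator


def load_qualities(fastq_lines: Iterable[str]) -> Dict[str, str]:
    """Map read name -> quality string from 4-line FASTQ records."""
    quals: Dict[str, str] = {}
    it = iter(fastq_lines)
    # zip over the same iterator four times yields one record per step.
    for header, _seq, _plus, qual in zip(it, it, it, it):
        # Assumption: the read name is the first token after '@'.
        name = header.rstrip("\n")[1:].split()[0]
        quals[name] = qual.rstrip("\n")
    return quals


def restore_qualities(sam_lines: Iterable[str],
                      quals: Dict[str, str]) -> Iterator[str]:
    """Yield SAM lines with '*' QUAL fields filled in from `quals`."""
    for line in sam_lines:
        line = line.rstrip("\n")
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        fields = line.split("\t")
        if len(fields) < 11:
            yield line  # not a full alignment record; leave untouched
            continue
        qname, flag, seq = fields[0], int(fields[1]), fields[9]
        qual = quals.get(qname)
        if fields[10] == "*" and qual is not None:
            # Reads mapped to the reverse strand (flag 0x10) store SEQ
            # reverse-complemented, so QUAL must be reversed to match.
            if flag & 0x10:
                qual = qual[::-1]
            # Only fill in when the lengths agree (or SEQ is absent).
            if seq == "*" or len(seq) == len(qual):
                fields[10] = qual
        yield "\t".join(fields)
```

In use, the FASTQ file would be read first with `load_qualities`, and the SAM stream would then be piped through `restore_qualities`. Holding every quality string in a dictionary is memory-hungry for large libraries, which is part of why this kind of post-processing is slow. Paired-end data would additionally need the 0x40/0x80 flag bits to choose the right mate, which this sketch does not handle.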