Release Piscem v0.7.0 · COMBINE-lab/piscem

This release of piscem adds the ability to index decoy sequences using the "distinguishing flanking k-mer" methodology described in Hjörleifsson and Sullivan et al.¹. This variant of decoy handling is optimized to work with pseudoalignment and pseudoalignment-like approaches, where alignment scores are unavailable (unlike the approach of Srivastava et al.², which is designed to work with selective alignment).

The implementation in piscem adopts the terminology of "poison" k-mers: the decoy sequences are used to create a separate table of poison k-mers whose presence causes a read to be discarded rather than mapped to some target in the index. Poison k-mers are simply distinguishing flanking k-mers that belong to some decoy sequence, and hence their presence in a mapping should "poison" the mapping (i.e., lead to it being discarded).
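To make the idea concrete, here is a toy Python sketch of poison k-mer filtering. It is not piscem's implementation: the function names are hypothetical, and it assumes a simplified definition in which a poison k-mer is a decoy k-mer that is absent from the targets but directly adjacent, within the decoy, to a k-mer that the targets do contain. It also ignores reverse complements and canonical k-mers.

```python
def kmers(seq, k):
    """Return all k-mers of seq in order (forward strand only)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def poison_kmers(targets, decoys, k):
    """Collect decoy k-mers that flank sequence shared with the targets.

    Simplifying assumption: a k-mer is "poison" if it is not in any target
    but its immediate neighbor in the decoy is.
    """
    target_set = {km for t in targets for km in kmers(t, k)}
    poison = set()
    for decoy in decoys:
        kms = kmers(decoy, k)
        for i, km in enumerate(kms):
            if km in target_set:
                continue  # shared with the index, so not distinguishing
            prev_shared = i > 0 and kms[i - 1] in target_set
            next_shared = i + 1 < len(kms) and kms[i + 1] in target_set
            if prev_shared or next_shared:
                poison.add(km)
    return poison


def is_poisoned(read, poison, k):
    """A read is discarded if any of its k-mers is in the poison table."""
    return any(km in poison for km in kmers(read, k))


# Toy data: the decoy contains the whole target plus extra flanking bases.
table = poison_kmers(["ACGTACGGA"], ["TTACGTACGGATT"], k=5)
keep_read = not is_poisoned("ACGTACGGA", table, k=5)      # read from the target
discard_read = is_poisoned("GTACGGATT", table, k=5)       # read spanning the decoy flank
```

In this toy case the second read shares several k-mers with the target and could map to it spuriously, but it also carries a flanking k-mer that only the decoy explains, so it is discarded.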

To build a decoy-aware index, pass the --decoy-paths argument to piscem build. It accepts a comma-separated list of FASTA files that will be used to generate the poison k-mer set. This creates a separate data structure (the poison table) that is used to filter fragments that might otherwise map spuriously to the index.

Likewise, when performing mapping, if a poison table has been built, it will be used by default. However, you can pass the --no-poison flag to map-bulk and map-sc to avoid considering poison k-mers, even if the index was constructed with a poison table.
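As a rough illustration of how these options fit together, the Python sketch below assembles argument lists for the build and mapping commands. Only --decoy-paths, --no-poison, and the subcommand names build, map-bulk, and map-sc come from these notes; the helper names are hypothetical, and every other required argument (reference sequences, reads, output locations, and so on) is deliberately left to the caller via other_args rather than guessed here.

```python
def build_args(decoy_fastas, other_args=()):
    """Argument list for a decoy-aware `piscem build`.

    other_args is a placeholder for the remaining required options,
    which this sketch does not attempt to enumerate.
    """
    if not decoy_fastas:
        raise ValueError("at least one decoy FASTA file is required")
    # --decoy-paths takes a single comma-separated list of FASTA files.
    return ["piscem", "build", *other_args, "--decoy-paths", ",".join(decoy_fastas)]


def map_args(subcommand, other_args=(), use_poison=True):
    """Argument list for `piscem map-bulk` or `piscem map-sc`.

    The poison table is used by default when the index has one;
    --no-poison is added only when the caller opts out.
    """
    if subcommand not in ("map-bulk", "map-sc"):
        raise ValueError(f"unsupported subcommand: {subcommand}")
    args = ["piscem", subcommand, *other_args]
    if not use_poison:
        args.append("--no-poison")
    return args


# Hypothetical file names, for illustration only.
build_cmd = build_args(["genome.fa", "introns.fa"])
map_cmd = map_args("map-sc", use_poison=False)
```

The resulting lists can be passed to subprocess.run once the remaining options have been filled in from the tool's own help output.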

1. Eldjárn Hjörleifsson, Kristján, et al. "Accurate quantification of single-nucleus and single-cell RNA-seq transcripts." bioRxiv (2022): 2022-12.
2. Srivastava, Avi, et al. "Alignment and mapping methodology influence transcript abundance estimation." Genome biology 21.1 (2020): 1-29.
