salmon taking unexpectedly long to map reads #830
Hi @charlesfoster, Thanks for opening the issue. The second run looks like it ends before there is any information about mapped reads. Have mappings started to be reported at that point? I wonder if there is some issue related to the loading of the index. I have a few suggestions that may be worthwhile to try:
Also, if you can share a set of problematic reads (or even a subset of them that will reproduce the extreme slowness problem) privately, that would be very helpful in debugging. In addition to trying to debug what's going on here, I'd probably also try running them through piscem. While this isn't yet an actual substitute for salmon, it will help isolate whether the problem is directly related to the index or something else. Thanks!
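A minimal sketch of one way to produce such a subset, assuming seqtk is available and that downsampling preserves the slowness (file names are placeholders, not the actual samples from this issue):

```bash
# Downsample a paired-end library to ~1M read pairs with seqtk.
# Using the same seed (-s) on both files keeps the mates paired.
seqtk sample -s100 sample_R1.fastq.gz 1000000 | gzip > sub_R1.fastq.gz
seqtk sample -s100 sample_R2.fastq.gz 1000000 | gzip > sub_R2.fastq.gz
```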
Hi @rob-p, Thanks for the speedy reply! There are definitely some strange things going on here. I can confirm that the second run (and the others that timed out) didn't produce any information about mapping. The outdir only contained empty subdirs + an empty log file:
Firstly, I downloaded the pre-built salmon index from refgenie using
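The exact command isn't shown above; as a rough sketch, pulling a hosted salmon index with refgenie usually looks something like the following (the genome/asset name here is an assumption, not taken from the thread):

```bash
# One-time setup of a local refgenie config, then pull a hosted salmon
# index asset. The asset name below is illustrative only.
refgenie init -c genome_config.yaml
refgenie pull hg38/salmon_partial_sa_index -c genome_config.yaml
```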
This pre-built index does appear to be decoy-aware:
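One quick way to check this, assuming the layout of recent salmon indices (the presence of a decoy count in info.json is an assumption here, and the directory name is a placeholder):

```bash
# A decoy-aware salmon index records its decoy sequences in the index
# metadata; a non-zero decoy count indicates decoys were included.
grep -i decoy salmon_index/info.json
```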
Secondly, I created a new transcriptome-only salmon index. It seems like there is something very wrong with the 'gentrome.fa' file that's being created by the pipeline.
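For reference, the decoy-aware recipe from the salmon documentation builds gentrome.fa by concatenating the transcriptome and the genome and listing the genome sequence names as decoys, while a transcriptome-only index skips the decoy step entirely. A sketch with placeholder file names:

```bash
# Decoy-aware index: genome sequences act as decoys.
grep "^>" genome.fa | cut -d " " -f 1 | sed 's/^>//' > decoys.txt
cat transcripts.fa genome.fa > gentrome.fa
salmon index -t gentrome.fa -d decoys.txt -i salmon_index_decoy -k 31 -p 16

# Transcriptome-only index: no decoys.
salmon index -t transcripts.fa -i salmon_index_txome -k 31 -p 16
```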
I'll re-run (a) using the refgenie salmon index specified; (b) with the
Considering this, would it still be useful to have access to the reads? I've got the green light to share them if need be. If so, what's a good contact address to share a OneDrive link? Thanks!

P.S. Something else odd that I can dig into further later if need be: the singularity version of salmon created an index in about 5 minutes, yet the conda version has been creating the index for nearly 20 minutes so far with no change...
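One sanity check for that discrepancy, assuming both environments run on the same machine (the container path and environment name are placeholders): confirm the two installs are the same salmon version and are given the same thread count.

```bash
# Compare the salmon builds used by the two environments.
singularity exec salmon.sif salmon --version
conda run -n salmon_env salmon --version

# salmon index parallelism is set with -p/--threads; make sure both runs
# request the same value, otherwise build times aren't comparable.
```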
Hi Charles, Thanks for the super-detailed response! This behavior is very interesting indeed. I still think having the ability to look at the reads might be useful. You could share a link with me directly. Thanks!
Cool - I've created the issue over at nf-core (nf-core/rnaseq#948), and have emailed you a link to the reads. Let's hope this is all trivial! Cheers,
Is the bug primarily related to salmon (bulk mode) or alevin (single-cell mode)?
salmon
Describe the bug
I'm working with 15 samples, with ~5 Gb of reads per sample (90,000,000 to 100,000,000 reads, ~75 bp reads). I've tried running these samples through the nf-core/rnaseq pipeline, but the pipeline took an age to run before dying at the salmon quant steps. Some samples finished in about 12 minutes; others timed out after 8+ hours. (Taken from the terminal, as the logfile is empty; the current time is 12:54 pm, so the run time is >3 hours so far.)
To Reproduce
I ran the following command:

nf-core/rnaseq: via Singularity; while running manually to troubleshoot: Conda.
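The exact command isn't reproduced above; as an illustration only (index path, library type, and file names are assumptions, not the author's actual invocation), a paired-end salmon quant run against a decoy-aware index looks roughly like:

```bash
# Illustrative paired-end quantification against a decoy-aware index.
salmon quant \
    -i salmon_index_decoy \
    -l A \
    -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz \
    -p 16 \
    --gcBias \
    -o salmon_out/sample
```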
Expected behavior

All samples with similar numbers of reads using the same index to finish in roughly the same amount of time.
Screenshots
If applicable, add screenshots or terminal output to help explain your problem.
Desktop (please complete the following information):
Additional context
I can't share these reads publicly, but I might be able to share them personally (I'd have to ask first).
I tried the suggestion from previous issues (e.g. using --hitFilterPolicy BOTH), but it didn't seem to help.

I tried trimming polyX tails with fastp, but that didn't help either.

To try and see if salmon was causing the issues, I thought I'd try kallisto as a similar-ish comparison. kallisto mapped the reads with 100 EM bootstraps in about 25 minutes. Commands:

Thanks for the tool - hopefully this is easy to sort out!
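The kallisto commands referenced above aren't shown in the thread; a comparable run with 100 EM bootstraps (file names and thread count are placeholders) would look something like:

```bash
# Build a kallisto index and quantify a paired-end sample with
# 100 bootstrap replicates.
kallisto index -i transcripts.kidx transcripts.fa
kallisto quant -i transcripts.kidx -o kallisto_out/sample \
    -b 100 -t 16 sample_R1.fastq.gz sample_R2.fastq.gz
```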