Can't find left Boundary #813

Closed
cliftonlewis opened this issue Nov 28, 2022 · 10 comments · Fixed by #817

Comments

@cliftonlewis

Is the bug primarily related to salmon (bulk mode) or alevin (single-cell mode)?
alevin
Describe the bug
I am running the following command on some sci-RNA-seq3 samples, and it does not work as expected.
salmon alevin -i af_splici/dm6_splici_idx/ -l ISR -1 data/SRR17122012_1.fastq -2 data/SRR17122012_2.fastq -o SRR17122012 --tgMap transcriptome_splici_fl52/transcriptome_splici_fl52_t2g.tsv -p 28 --sciseq3 --justAlign
I then passed the output to alevin-fry generate-permit-list, which failed with an error suggesting that salmon had not added the extra bases needed to account for the chemistry:
"thread 'main' panicked at 'assertion failed: (left == right)
left: 20,
right: 19: found barcodes of different lenghts 20 and 19', src/cellfilter.rs:203:13
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace"
I then re-ran salmon alevin without the --justAlign flag and hit a different error:
"### alevin (dscRNA-seq quantification) v1.9.0

[ program ] => salmon

[ command ] => alevin

[ index ] => { af_splici/dm6_splici_idx/ }

[ libType ] => { ISR }

[ mates1 ] => { data/SRR17122012_1.fastq }

[ mates2 ] => { data/SRR17122012_2.fastq }

[ output ] => { SRR17122012 }

[ tgMap ] => { transcriptome_splici_fl52/transcriptome_splici_fl52_t2g.tsv }

[ threads ] => { 28 }

[ sciseq3 ] => { }

[2022-11-28 21:13:57.772] [alevinLog] [info] Found all transcripts to gene mappings
[2022-11-28 21:13:57.781] [alevinLog] [info] Processing barcodes files (if Present)

processed 10 Million barcodes

[2022-11-28 21:14:01.454] [alevinLog] [info] Done barcode density calculation.
[2022-11-28 21:14:01.454] [alevinLog] [info] # Barcodes Used: 1 / 10285890.
[2022-11-28 21:14:01.455] [alevinLog] [error] Can't find left Boundary.
Please Report this issue on github."

Specifically, please provide at least the following information:

  • Which version of salmon was used? salmon (1.9.0); alevin-fry (0.8.0)
  • How was salmon installed (compiled, downloaded executable, through bioconda)? Conda install
  • Which reference (e.g. transcriptome) was used? Drosophila melanogaster (BDGP6.32 (GCA_000001215.4))

Desktop (please complete the following information):

  • OS: Ubuntu Linux

Thanks in advance for all your help regarding this!

@rob-p
Collaborator

rob-p commented Nov 29, 2022

Pinging @Gaura here — any idea what might be causing the inability to properly process the barcodes?

@cliftonlewis — would you be able to share a sampling of the reads for us to examine and help debug with?

@Gaura
Collaborator

Gaura commented Nov 30, 2022

@cliftonlewis: could you tell us which version of alevin-fry you are using?
@rob-p: The cell barcode length should be 21. The raw barcode is either 19 or 20 bases long, so AC or A, respectively, is added to make it 21. This could be the odd-even error we saw in a previous version of alevin-fry. As for the run without --justAlign, I would need to take a closer look.
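
The padding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not salmon's or alevin-fry's actual code: the names `pad_barcode`, `all_same_length`, `PADDING`, and `TARGET_LEN` are hypothetical, the example barcodes are made up, and the sketch assumes the padding bases go at the end of the barcode, which the comment above does not specify.

```python
# Hypothetical illustration of the padding rule; not salmon's implementation.
TARGET_LEN = 21  # padded cell barcode length stated in the comment above

# Padding per raw barcode length: 19 bases get "AC", 20 bases get "A".
# Assumption: the padding is appended at the end of the barcode.
PADDING = {19: "AC", 20: "A"}


def pad_barcode(raw: str) -> str:
    """Pad a 19- or 20-base barcode so that it is TARGET_LEN bases long."""
    try:
        suffix = PADDING[len(raw)]
    except KeyError:
        raise ValueError(f"unexpected barcode length {len(raw)}: {raw!r}") from None
    return raw + suffix


def all_same_length(barcodes) -> bool:
    """Return True if every barcode in the collection has the same length."""
    return len({len(bc) for bc in barcodes}) <= 1


# Made-up barcodes of 20 and 19 bases.
raw = ["ACGT" * 5, "ACGT" * 4 + "ACG"]
padded = [pad_barcode(bc) for bc in raw]

print(all_same_length(raw), all_same_length(padded))  # prints: False True
```

A mix of unpadded 19- and 20-base barcodes fails the uniform-length check, which is the kind of condition the assertion in the reported alevin-fry error message complains about.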

@cliftonlewis
Author

Hi,

Thanks for your rapid reply. I am using version 0.8.0. All the FASTQ files were deposited under SRP349178.

Here is a snippet; is this informative? What else can I supply to help?

@SRR17123283.1 1 length=34
TAGAANCAATCAGAGCGGGTTGGTCATCTCNNCN
+SRR17123283.1 1 length=34
AAA/A#EEEEEAE<EEEEEEEEEEEEEEEE##E#
@SRR17123283.2 2 length=34
AGGCGNAATCAGAGATCCTAGGGCCGCCGCNNAN
+SRR17123283.2 2 length=34
AAAAA#EEEEEEEAEEEEEEEEEEEEE//A##/#
@SRR17123283.3 3 length=34
GTCGTNACTCAGAGAATATTCTTTCTAATAGNAN
+SRR17123283.3 3 length=34
/6AA/#EAEEAEEAEEEEE/AEEEAEAEEEE#E#

@rob-p
Collaborator

rob-p commented Dec 1, 2022

@Gaura: So, it seems the problem is during barcode extraction and mapping — we're not even getting to the step where alevin-fry is being run, right? Does anything look suspicious in the read snippet above (or in the corresponding read set that @cliftonlewis mentioned)?

@cliftonlewis
Author

Hi, just wondering whether you managed to find the source of my error? I'm not sure whether I am simply handling the sci-RNA-seq3 data incorrectly.

@Gaura
Collaborator

Gaura commented Dec 9, 2022

Hi @cliftonlewis,

Sorry for the delay. I tested it on another file and it worked fine. I would like to look at some info from your file. Could you:

  1. Post the salmon log of the first run, the one that you ran with --justAlign.
  2. There should be a map.rad file in your output directory (SRR17122012). Can you run the command: alevin-fry view --rad map.rad > rad.txt and share the rad.txt file?

Thanks.

@Gaura
Collaborator

Gaura commented Dec 9, 2022

Hi @cliftonlewis and @rob-p,

I figured out the issue. It was in the length check in the barcode extraction code. I can push a fix today.

@Gaura
Collaborator

Gaura commented Dec 9, 2022

Thanks for reporting it, @cliftonlewis. I have tested the fix and it works both with and without RAD mode (--justAlign). The solution is in PR #817.

Gaura linked a pull request on Dec 10, 2022 that will close this issue
@cliftonlewis
Author

That's cool. Do you know when the change will be committed?

@rob-p
Collaborator

rob-p commented Feb 24, 2023

Sorry I missed this! It's been on that branch (and committed) since Gaurav's PR. It's now been merged into master and included in the latest release (1.10.0).

rob-p closed this as completed on Feb 24, 2023