Issue in the Info file with cutadapt 1.9.dev2 when using multiple adapters #139

Closed
komalsrathi opened this issue Jul 23, 2015 · 3 comments

Comments

@komalsrathi

Hi,

This is my command:

cutadapt --untrimmed-output=sample_untrimmed.fa -a short_adpt=GTTTTAGAGCTAG -g long_adpt=GTGGAAAGGACGAAACACCG -n 2 -e 0.1 -o sample_{name}_trimmed.fa --info-file=sample_info.txt sample.reads.fa

I noticed that the info file changed between dev1 and dev2. Please confirm.

First, in the older version, untrimmed reads were not reported in the info file, whereas the new version reports them with a value of -1 in the second column. That is not really an issue. The issue is that the info file no longer clearly reports reads that are cleaved by both the long and the short adapter. For example:

In the fasta file:

>HWI-D00712:1134:C6NMYANXX:5:1101:3375:2398 1:N:0:AAAGTG
AGTGGAAAGGACGAAACACCGCGTCTGTCGATTCATCGAGTGTTTTAGAG

So, the read has been cleaved like this:

AGTGGAAAGGACGAAACACCGCGTCTGTCGATTCATCGAGTGTTTTAGAG
A\\GTGGAAAGGACGAAACACCG\\CGTCTGTCGATTCATCGAGT\\GTTTTAGAG

But in the info file:

grep HWI-D00712:1134:C6NMYANXX:5:1101:3375:2398 sample_info.txt
HWI-D00712:1134:C6NMYANXX:5:1101:3375:2398 1:N:0:AAAGTG 0   20  29  CGTCTGTCGATTCATCGAGT    GTTTTAGAG       short_adpt

So as you can see, it should report both long_adpt and short_adpt in the info file, whereas it reports only one of the two adapters.

I also have an example where I used dev1 and dev2 and generated info files:

grep HWI-D00712:1098:C6LN6ANXX:5:1101:1927:2104 dev1_info.txt 
HWI-D00712:1098:C6LN6ANXX:5:1101:1927:2104 1:N:0:ACAGTG 0   7   27  CGAACTC GTGGAAAGGACGAAACACCG    TCGATGCCATCATACTCCGTGTT long_adpt
HWI-D00712:1098:C6LN6ANXX:5:1101:1927:2104 1:N:0:ACAGTG 0   20  23  TCGATGCCATCATACTCCGT    GTT     short_adpt

grep HWI-D00712:1098:C6LN6ANXX:5:1101:1927:2104 dev2_info.txt 
HWI-D00712:1098:C6LN6ANXX:5:1101:1927:2104 1:N:0:ACAGTG 0   20  23  TCGATGCCATCATACTCCGT    GTT     short_adpt

I have to get some counts and it is very difficult to find which reads are cleaved by both and which ones are uniquely cleaved by one or the other adapter.

Thanks,
Komal
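
For the counts mentioned above, one option is to group the info file rows by read name and tally which adapter names appear for each read. The following is only a minimal sketch, not part of cutadapt: it assumes the info file is tab-separated, that the adapter name is in the eighth column, that untrimmed reads have -1 in the second column, and that every adapter match gets its own row (as in the dev1 output shown above). The function name count_adapter_combinations is made up for illustration.

```python
from collections import Counter, defaultdict


def count_adapter_combinations(lines):
    """Tally reads by the set of adapter names reported for them.

    Assumes tab-separated rows with the read name in column 1,
    the error count (or -1 for untrimmed reads) in column 2,
    and the adapter name in column 8.
    """
    adapters_by_read = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        read_name = fields[0]
        if fields[1] == "-1":
            # Untrimmed read: register it with an empty adapter set.
            adapters_by_read[read_name]
            continue
        if len(fields) >= 8:
            adapters_by_read[read_name].add(fields[7])
    # Key: frozenset of adapter names; value: number of reads.
    return Counter(frozenset(names) for names in adapters_by_read.values())


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for names, n in count_adapter_combinations(handle).items():
            print(",".join(sorted(names)) or "untrimmed", n, sep="\t")
```

Run on an info file with one row per match, this would print one line per adapter combination, for example a line for reads matched by both long_adpt and short_adpt and separate lines for reads matched by only one of them. With the dev2 output described in this issue, only one row per read is written, so the "both" category cannot be recovered this way.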

@komalsrathi komalsrathi changed the title Issue with cutadapt 1.9.dev2 when using multiple adapters Issue in the Info file with cutadapt 1.9.dev2 when using multiple adapters Jul 23, 2015
@marcelm
Owner

marcelm commented Jul 24, 2015

You are right. When I added the lines with the -1 to fix issue #95, I also changed the output inadvertently to the one you are seeing now, where you cannot get proper statistics from it. I have no time to fix this right now, but I have prepared a Git branch for you that includes the fix for the issue #137 that you reported, but not the changes to the info file. The branch is called 'nohang'. It seems you are familiar with Git, but just in case you don’t know how to get the branch: Just run git pull and then git checkout nohang in your current copy of the repository. I have changed the version number in that branch to 1.9.dev3.

@komalsrathi
Author

No worries, I had a zip file for dev1 and I will just go back to the previous version, since writing compressed output files takes longer than writing uncompressed files anyway. Thanks!

@marcelm
Owner

marcelm commented Jul 29, 2015

This has been fixed in cutadapt 1.8.3.

@marcelm marcelm closed this as completed Jul 29, 2015