
Command takes forever, hangs and doesn't finish when using fastq.gz extension in --untrimmed-output #137

Closed
komalsrathi opened this issue Jul 20, 2015 · 7 comments

Comments

@komalsrathi

Hi,

The following is my code:

bin/cutadapt --untrimmed-output=output_untrimmed.fastq.gz -a short_adpt=GTTTTAGAGCTAG -g long_adpt=GTGGAAAGGACGAAACACCG --times=2 --error-rate=0.1 --output=output_{name}_trimmed.fastq.gz --info-file=info.txt input.fastq.gz

This command shows the following:

This is cutadapt 1.9.dev1 with Python 2.6.6
Command line parameters: --untrimmed-output=output_untrimmed.fastq.gz -a short_adpt=GTTTTAGAGCTAG -g long_adpt=GTGGAAAGGACGAAACACCG --times=2 --error-rate=0.1 --output=output_{name}_trimmed.fastq.gz --info-file=info.txt input.fastq.gz
Trimming 2 adapters with at most 10.0% errors in single-end mode ...

As the command line shows, my input file is .fastq.gz and I want all my outputs compressed to .gz as well. But the command above takes forever and I have to kill it. When I check with top after a while, nothing shows up, which means cutadapt is not running anymore. No warnings or errors are printed until I kill the command; it just hangs there.

When I kill it, these messages appear on the console:

^CTraceback (most recent call last):
  File "bin/cutadapt", line 10, in <module>
    cutadapt.main()
  File "/fujfs/d3/home/komalr/tools/cutadapt-master/cutadapt/scripts/cutadapt.py", line 844, in main
    f.close()
  File "/fujfs/d3/home/komalr/tools/cutadapt-master/cutadapt/filters.py", line 188, in close
    f.close()
  File "/fujfs/d3/home/komalr/tools/cutadapt-master/cutadapt/xopen.py", line 45, in close
    retcode = self.process.wait()
  File "/usr/lib64/python2.6/subprocess.py", line 1302, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/usr/lib64/python2.6/subprocess.py", line 462, in _eintr_retry_call
    return func(*args)
KeyboardInterrupt

I tried the same command, just changing the --untrimmed-output option from output_untrimmed.fastq.gz to output_untrimmed.fastq, and it runs well.

bin/cutadapt --untrimmed-output=output_untrimmed.fastq -a short_adpt=GTTTTAGAGCTAG -g long_adpt=GTGGAAAGGACGAAACACCG --times=2 --error-rate=0.1 --output=output_{name}_trimmed.fastq.gz --info-file=info.txt input.fastq.gz

However, I want all my outputs in .fastq.gz format. Is there a workaround for this?
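(One possible interim workaround, not suggested anywhere in this thread: let cutadapt write the untrimmed reads uncompressed and compress the file afterwards, e.g. with Python's gzip module. The file name matches the command above; the helper function is made up for the example.)

import gzip
import os
import shutil

def gzip_file(path):
    # Compress `path` to `path + '.gz'`, then remove the uncompressed original.
    src = open(path, 'rb')
    dst = gzip.open(path + '.gz', 'wb')
    try:
        shutil.copyfileobj(src, dst)
    finally:
        dst.close()
        src.close()
    os.remove(path)

gzip_file('output_untrimmed.fastq')  # produces output_untrimmed.fastq.gz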

@komalsrathi komalsrathi changed the title Command takes forever, doesn't complete and hangs Command takes forever when using gz files as input and output, hangs and doesn't finish Jul 20, 2015
@komalsrathi komalsrathi changed the title Command takes forever when using gz files as input and output, hangs and doesn't finish Command takes forever, hangs and doesn't finish when using fastq.gz extension in --untrimmed-output Jul 20, 2015
@marcelm
Owner

marcelm commented Jul 20, 2015

Hi, thanks for the report. It seems the problem occurs after cutadapt has already finished everything and is waiting for the 'gzip' process that it spawned to exit. I am having a hard time reproducing this. Do you see the problem every time you run the above command?
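(For context, a rough sketch of how writing through a spawned gzip process typically works; the class below is illustrative, not cutadapt's actual xopen.py. The tracebacks in this issue show the hang at the final process.wait() call.)

import subprocess

class PipedGzipWriter(object):
    # Illustrative writer that pipes data through an external 'gzip' process.
    def __init__(self, path):
        self.outfile = open(path, 'wb')
        # gzip reads from its stdin pipe and writes the compressed stream to the file
        self.process = subprocess.Popen(['gzip'], stdin=subprocess.PIPE,
                                        stdout=self.outfile)

    def write(self, data):
        self.process.stdin.write(data)

    def close(self):
        # Closing stdin signals EOF to gzip; wait() then blocks until gzip exits.
        # This wait() is the call where the tracebacks above show cutadapt hanging.
        self.process.stdin.close()
        retcode = self.process.wait()
        self.outfile.close()
        if retcode != 0:
            raise IOError('gzip exited with exit code %d' % retcode)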

@komalsrathi
Author

Yes, the problem occurs every time I run the command. But note that when I do not use .gz in --untrimmed-output, it runs just fine.

marcelm added a commit that referenced this issue Jul 21, 2015
This avoids a hang when writing to two or more gzip-compressed output files
in Python 2.6. This fixes issue #137. The relevant Python bug report is
http://bugs.python.org/issue12786
@marcelm
Owner

marcelm commented Jul 21, 2015

The problem seems to be triggered only under Python 2.6; it is described in http://bugs.python.org/issue12786. I have committed a change that should fix it. Could you test it, please?
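(A hedged illustration of the failure mode, as I read the linked Python report: with close_fds=False, the default in Python 2.6, a second spawned gzip child inherits a copy of the first child's stdin pipe, so the first gzip never sees EOF after its writer closes its end, and wait() blocks forever. Passing close_fds=True keeps the pipe from leaking into later children; whether this matches the committed change is an assumption, and the helper and file names below are made up for the example.)

import subprocess

def spawn_gzip_writer(path):
    # Spawn a gzip process that compresses whatever is written to its stdin.
    outfile = open(path, 'wb')
    proc = subprocess.Popen(['gzip'], stdin=subprocess.PIPE, stdout=outfile,
                            close_fds=True)  # do not leak this pipe into later children
    return proc, outfile

a, a_out = spawn_gzip_writer('trimmed.fastq.gz')
b, b_out = spawn_gzip_writer('untrimmed.fastq.gz')

a.stdin.write(b'@read1\nACGT\n+\nFFFF\n')
b.stdin.write(b'@read2\nTTTT\n+\nFFFF\n')

for proc, out in ((a, a_out), (b, b_out)):
    proc.stdin.close()  # gzip only sees EOF once every copy of this fd is closed
    proc.wait()         # with close_fds=False on Python 2.6 this can block forever
    out.close()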

@komalsrathi
Author

Hi,

I tested it again and it is still not finishing.

This is cutadapt 1.9.dev1 with Python 2.6.6
Command line parameters: -f fastq --untrimmed-output=untrimmed.fastq.gz -a short_adpt=GTTTTAGAGCTAG -g long_adpt=GTGGAAAGGACGAAACACCG -n 2 -e 0.1 -o trimmed-{name}.fastq.gz --info-file=info.txt sample.fastq.gz
Trimming 2 adapters with at most 10.0% errors in single-end mode ...
^CTraceback (most recent call last):
  File "/fujfs/d3/home/komalr/tools/cutadapt-master/bin/cutadapt", line 10, in <module>
    cutadapt.main()
  File "/fujfs/d3/home/komalr/tools/cutadapt-master/cutadapt/scripts/cutadapt.py", line 844, in main
    f.close()
  File "/fujfs/d3/home/komalr/tools/cutadapt-master/cutadapt/filters.py", line 188, in close
    f.close()
  File "/fujfs/d3/home/komalr/tools/cutadapt-master/cutadapt/xopen.py", line 45, in close
    retcode = self.process.wait()
  File "/usr/lib64/python2.6/subprocess.py", line 1302, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/usr/lib64/python2.6/subprocess.py", line 462, in _eintr_retry_call
    return func(*args)

For now, I am using --output and --untrimmed-output without the .gz extension, and it is much faster.

@marcelm
Owner

marcelm commented Jul 22, 2015

That is strange. I was able to reproduce the exact problem, committed a change, and did not have the problem anymore. I would be grateful if you could try it one last time, just to make sure you are actually using the most recent version from Git. I have also just changed the version number, so please pull the most recent version and check that you then see version 1.9.dev2 instead of 1.9.dev1.

@komalsrathi
Author

Hi Marcel,

I used the new update and it works perfectly now. Thanks!

@marcelm
Owner

marcelm commented Jul 24, 2015

Nice, great to hear!

marcelm added a commit that referenced this issue Jul 24, 2015
This avoids a hang when writing to two or more gzip-compressed output files
in Python 2.6. This fixes issue #137. The relevant Python bug report is
http://bugs.python.org/issue12786