cutadapt hangs when input is fastq.gz #1700

Closed
stephenturner opened this issue Dec 16, 2016 · 8 comments

Comments

@stephenturner
Contributor

Trying to run the small rna-seq pipeline with a template that looks like this:

upload:
  dir: ../final
details:
  - analysis: smallRNA-seq
    algorithm:
      aligner: star
      # change adapter according project
      adapters: ["TGGAATTCTCGGGTGC"]
      expression_caller: [trna, seqcluster, mirdeep2]
      species: mmu
    genome_build: mm10

Runs for days at the trimming step:

[2016-12-16T14:56Z] /path/share/bcbio/anaconda/bin/cutadapt --adapter=TGGAATTCTCGGGTGC --untrimmed-output=/path/work/trimmed/wt4/wt4.fragments.fastq.gz -o /path/work/trimmed/wt4/tx/tmpSWigfG/wt4.clean.fastq.gz -m 17 --overlap=8 /path/wt4.fastq.gz --too-short-output /path/work/trimmed/wt4/wt4.short.fastq.gz | tee > /path/work/trimmed/wt4/wt4.log

I tried this command directly at the command line, and it hangs forever without doing anything. I then gunzipped the input file (and left the outputs uncompressed), and it worked fine.

Possibly related to marcelm/cutadapt#137?
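
One way to rule out a damaged input file before blaming the trimmer is to read the first FASTQ record with Python's standard-library gzip module. This is only a minimal sketch: it does not go through xopen, so it checks the file itself rather than the code path that hangs, and the function name first_fastq_record is made up for illustration.

```python
import gzip
import itertools


def first_fastq_record(path):
    """Return the first four lines (one FASTQ record) of a gzipped file."""
    # gzip.open in text mode decompresses lazily, so only the start of the
    # file is read; a file that is not valid gzip raises an error on read.
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in itertools.islice(handle, 4)]


# Example call (the path is a placeholder taken from the command above):
# print(first_fastq_record("/path/wt4.fastq.gz"))
```

If this returns a record immediately, the input is readable and the hang is more likely in the tool's own file-opening code than in the data.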

@lpantano
Collaborator

Hi,

Sorry about that problem.

Does it also hang if you try a simpler version of the command (mainly without writing the untrimmed file)?

/path/share/bcbio/anaconda/bin/cutadapt --adapter=TGGAATTCTCGGGTGC -o /path/work/trimmed/wt4/tx/tmpSWigfG/wt4.clean.fastq.gz -m 17 --overlap=8 /path/wt4.fastq.gz  | tee > /path/work/trimmed/wt4/wt4.log

If this simpler version works, I can add some options to do that instead.

If the only solution is to gunzip the input, I would do that prior to calling bcbio for now, to get the pipeline running until I have time to reproduce the problem properly.

Let me know!

thanks!

@roryk
Collaborator

roryk commented Dec 19, 2016

Thanks Stephen,

Sorry about that, working on this now.

roryk added a commit to roryk/bioconda-recipes that referenced this issue Dec 19, 2016
@roryk roryk closed this as completed in 013eb2d Dec 19, 2016
@roryk
Collaborator

roryk commented Dec 19, 2016

Hi Stephen,

I finally tracked down this bug: xopen 0.1.0 hangs when opening gzipped files on some systems. The newer version of cutadapt requires xopen 0.1.1, but that dependency hasn't been updated in the release version yet. I switched the bioconda recipe to require 0.1.1 and that seems to fix the problem.

I also simplified the cutadapt call to support trimming paired files in one go rather than using the old two-pass method, so this step should be a little bit faster as well.

If you upgrade with bcbio_nextgen.py upgrade -u development --tools it should pull in xopen 0.1.1 and straighten it out.

@roryk
Collaborator

roryk commented Dec 19, 2016

Hi Stephen,

Yup, that will pull in the fixed xopen (or it should, if bioconda has the package built already) and it should get you over this bump. If xopen is still 0.1.0, that means the conda package isn't done building, but you can install xopen 0.1.1 manually using the conda that bcbio installs: conda install -c bioconda xopen. That will give you 0.1.1, which won't hang on these files.
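
To confirm which xopen a given environment actually picked up, a short check along these lines may help. It is a sketch under a few assumptions: it relies on importlib.metadata (Python 3.8 or newer, which postdates this thread), the helper names are invented here, and the version comparison is deliberately naive (it only looks at the leading numeric components and treats "0.1" as older than "0.1.0").

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text):
    """Turn the leading numeric part of '0.1.1' into (0, 1, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def has_fixed_xopen(minimum="0.1.1"):
    """True if xopen is installed at the minimum version or newer."""
    found = installed_version("xopen")
    return found is not None and version_tuple(found) >= version_tuple(minimum)


print("xopen:", installed_version("xopen"), "fixed:", has_fixed_xopen())
```

Run it with the Python interpreter from the same conda installation that provides cutadapt; a different interpreter will report on a different set of packages.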

@stephenturner
Contributor Author

Beautiful. I just installed xopen and it worked fine with the latest stable release.

@roryk
Collaborator

roryk commented Dec 20, 2016

Thanks for opening the issue and following up Stephen!

@adgarbuzov

I haven't been able to use trim-galore/cutadapt using bioconda because of this problem. I get the error:
pkg_resources.DistributionNotFound: The 'xopen>=0.3.2' distribution was not found and is required by cutadapt

And when I download xopen using conda install -c bioconda xopen it does not fix the problem.

Though when I type
which xopen
it says it's not located in bin. So is something going wrong with the xopen install?

@roryk
Collaborator

roryk commented Jul 21, 2018

Heya, we don't actually use cutadapt within bcbio anymore, so I think you might be posting on the wrong repo. You might have better luck here: https://github.com/marcelm/cutadapt

That said, these types of problems are usually because there is another python installation leaking into your environment. Do you have your PYTHONPATH or PYTHONHOME set? If so, unsetting those might fix your issue.
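
A quick way to look for that kind of leakage is to print the interpreter in use and any PYTHONPATH or PYTHONHOME override. The snippet below is a rough sketch, not a full diagnosis; the function name leaking_python_vars is hypothetical.

```python
import os
import sys


def leaking_python_vars(environ=None):
    """Return the PYTHONPATH/PYTHONHOME settings that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return {
        name: environ[name]
        for name in ("PYTHONPATH", "PYTHONHOME")
        if environ.get(name)
    }


# Which interpreter is running, and is anything overriding its search path?
print("interpreter:", sys.executable)
print("overrides:", leaking_python_vars() or "none set")
```

If the interpreter path is not inside the conda installation, or either variable is set, packages such as xopen may be resolved from a different Python than the one conda installed them into.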
