Commit

Merge branch 'dev'
konrad committed Apr 1, 2014
2 parents 295d4fa + de91557 commit a681032
Showing 12 changed files with 94 additions and 88 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.txt
@@ -1,3 +1,7 @@
v0.2.2 (2014-04-01)
- Improve documentation
- Modify README
- Extend Makefile
v0.2.2 (2014-03-31)
- Fix setup.py
- Typos fixing
2 changes: 1 addition & 1 deletion LICENSE.txt
@@ -1,4 +1,4 @@
Copyright (c) 2011-2013, Konrad Förstner <[email protected]>
Copyright (c) 2011-2014, Konrad Förstner <[email protected]>

Permission to use, copy, modify, and/or distribute this software for
any purpose with or without fee is hereby granted, provided that the
10 changes: 7 additions & 3 deletions Makefile
@@ -6,10 +6,10 @@ coverage:
coverage3 report

package:
python3.3 setup.py sdist
python3 setup.py sdist

package_to_pypi:
python setup.py sdist upload
python3 setup.py sdist upload
@echo "Go to https://pypi.python.org/pypi/READemption/"

html_doc:
@@ -29,7 +29,9 @@ readme_html:
pandoc --from=markdown --to=html README.md -o README.html

readme_rst:
pandoc --from=markdown --to=rst README.md -o README.rst
grep -v "^\[!" README.md | sed -e "1d" > README.md.tmp
pandoc --from=markdown --to=rst README.md.tmp -o README.rst
rm README.md.tmp

readme_clean:
rm -f README.tex README.html README.rst
@@ -41,11 +43,13 @@ pylint:
new_release:
@echo "* Please do this manually:"
@echo "* ------------------------"
@echo "* Create/checkout a release branch"
@echo "* Change bin/reademption"
@echo "* Change setup.py"
@echo "* Change docs/source/conf.py"
@echo "* Change CHANGELOG.txt"
@echo "* Commit changes e.g. 'git commit -m 'Set version to 0.2.0'"
@echo "* Tag the commit e.g. 'git tag -a v0.1.9 -m 'version v0.1.9''"
@echo "* Merge release into dev and master"
@echo "* After pushing generate a new release based on this tag at"
@echo " https://github.com/konrad/READemption/releases/new"
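
The `new_release` checklist above asks for the version string to be changed by hand in `bin/reademption`, `setup.py` and `docs/source/conf.py`. As a minimal sketch of how one might confirm that the three strings agree before tagging: the helper names `find_version` and `versions_agree` are hypothetical and not part of READemption, and the regular expression only assumes the assignment styles visible in this diff (`__version__ = "..."`, `version='...'`, `release = '...'`).

```python
import re

# Hypothetical pattern: matches a full X.Y.Z version assignment and
# deliberately skips the short X.Y form used for `version` in conf.py.
VERSION_RE = re.compile(
    r"""(?:__version__|version|release)\s*=\s*['"](\d+\.\d+\.\d+)['"]"""
)


def find_version(text):
    """Return the first X.Y.Z version assigned in `text`, or None."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None


def versions_agree(texts):
    """Check that every file text declares the same X.Y.Z version.

    `texts` maps a file name to that file's content. Returns a tuple of
    (all versions found and identical, mapping of file name to version).
    """
    found = {name: find_version(text) for name, text in texts.items()}
    values = set(found.values())
    agree = None not in values and len(values) == 1
    return agree, found


# Inline sample strings stand in for the real file contents.
sample = {
    "bin/reademption": '__version__ = "0.2.3"',
    "setup.py": "version='0.2.3',",
    "docs/source/conf.py": "version = '0.2'\nrelease = '0.2.3'",
}
agree, found = versions_agree(sample)
print(agree, found)
```

Reading the actual files and passing their contents in the same mapping would turn this into a pre-tag check; it is a sketch under the stated assumptions, not the project's release tooling.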
30 changes: 15 additions & 15 deletions README.md
@@ -6,19 +6,18 @@ About
-----

READemption is a pipeline for the computational evaluation of RNA-Seq
data. It was originally developed at the IMIB/ZINF to process dRNA-Seq
reads (as introduced by Sharma et al., Nature, 2010) originating from
bacterial samples. Meanwhile is has been extended to process data
generated in different experimental setups and originating from all
domains of life and is under active development. The subcommands which
are accessible viaq command-line interface cover read processing and
aligning, coverage plot generation, gene expression quantification as
well as differential gene expression analysis. READemption was applied
to analyze numerous data sets. In order to set up analyses quickly
data. It was originally developed to process dRNA-Seq reads (as
introduced by Sharma et al., Nature, 2010) originating from bacterial
samples. Meanwhile is has been extended to process data generated in
different experimental setups and from all domains of life. The
functions which are accessible via a command-line interface cover read
processing and aligning, coverage calculation, gene expression
quantification, differential gene expression analysis as well as
visualization. In order to set up and perform analyses quickly
READemption follows the principal of "convention over configuration":
Once the input files are copied into defined folders no further
Once the input files are copied/linked into defined folders no further
parameters have to be given. Still, READemption's behavior can be
adapted to specific needs of the user.
adapted to specific needs of the user by parameters.

Documentation
-------------
@@ -33,19 +32,20 @@ Short version (if you have all the requirements installed):
$ pip install READemption

[Long version](http://pythonhosted.org/READemption/installation.html)
(what are the requirements and how do you get them)
including a decryption of the requirements and how do you get them.

License
-------

[ICSL](https://en.wikipedia.org/wiki/ISC_license) - see LICENSE.txt
[ICSL](https://en.wikipedia.org/wiki/ISC_license)
(Internet Systems Consortium license ~ simplified BSD license) - see LICENSE.txt

Development
-----------

* If possible follow the principal of "convention over
configuration". This means input file are into a fixed location and
the result file are placed in fixed location.
configuration". This means input file are copied/linked into a fixed
location and the resulting files are placed in fixed locations.

* The classes should be path agnostic as far a possible. The controller
is taking care of that and calls them adequately.
31 changes: 16 additions & 15 deletions README.rst
@@ -2,19 +2,18 @@ About
-----

READemption is a pipeline for the computational evaluation of RNA-Seq
data. It was originally developed at the IMIB/ZINF to process dRNA-Seq
reads (as introduced by Sharma et al., Nature, 2010) originating from
bacterial samples. Meanwhile is has been extended to process data
generated in different experimental setups and originating from all
domains of life and is under active development. The subcommands which
are accessible viaq command-line interface cover read processing and
aligning, coverage plot generation, gene expression quantification as
well as differential gene expression analysis. READemption was applied
to analyze numerous data sets. In order to set up analyses quickly
data. It was originally developed to process dRNA-Seq reads (as
introduced by Sharma et al., Nature, 2010) originating from bacterial
samples. Meanwhile is has been extended to process data generated in
different experimental setups and from all domains of life. The
functions which are accessible via a command-line interface cover read
processing and aligning, coverage calculation, gene expression
quantification, differential gene expression analysis as well as
visualization. In order to set up and perform analyses quickly
READemption follows the principal of "convention over configuration":
Once the input files are copied into defined folders no further
Once the input files are copied/linked into defined folders no further
parameters have to be given. Still, READemption's behavior can be
adapted to specific needs of the user.
adapted to specific needs of the user by parameters.

Documentation
-------------
@@ -32,19 +31,20 @@ Short version (if you have all the requirements installed):
$ pip install READemption

`Long version <http://pythonhosted.org/READemption/installation.html>`__
(what are the requirements and how do you get them)
including a decryption of the requirements and how do you get them.

License
-------

`ICSL <https://en.wikipedia.org/wiki/ISC_license>`__ - see LICENSE.txt
`ICSL <https://en.wikipedia.org/wiki/ISC_license>`__ (Internet Systems
Consortium license ~ simplified BSD license) - see LICENSE.txt

Development
-----------

- If possible follow the principal of "convention over configuration".
This means input file are into a fixed location and the result file
are placed in fixed location.
This means input file are copied/linked into a fixed location and the
resulting files are placed in fixed locations.

- The classes should be path agnostic as far a possible. The controller
is taking care of that and calls them adequately.
@@ -64,3 +64,4 @@ Development
- hotfix branches - branched off from master and merged back into
dev and master


4 changes: 2 additions & 2 deletions bin/reademption
@@ -6,10 +6,10 @@ import argparse
from reademptionlib.controller import Controller

__author__ = "Konrad Foerstner <[email protected]>"
__copyright__ = "2011-2013 by Konrad Foerstner <[email protected]>"
__copyright__ = "2011-2014 by Konrad Foerstner <[email protected]>"
__license__ = "ISC license"
__email__ = "[email protected]"
__version__ = "0.2.2"
__version__ = "0.2.3"

def main():
parser = argparse.ArgumentParser()
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -50,7 +50,7 @@
# The short X.Y version.
version = '0.2'
# The full version, including alpha/beta/rc tags.
release = '0.2.2'
release = '0.2.3'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
40 changes: 19 additions & 21 deletions docs/source/example_analysis.rst
@@ -2,18 +2,16 @@ Performing an example analysis
==============================

Here you will be guided trough a small example analysis using a
publicly available RNA-Seq data set. We will use a data set from NCBI
GEO that was part of a publication by `Kröger et
publicly available RNA-Seq from NCBI GEO that was part of a
publication by `Kröger et
al. <http://www.ncbi.nlm.nih.gov/pubmed/24331466>`_. This is a
transcriptome analysis of *Salmonella* Typhimurium SL1344 in different
conditions.

We will generate several output files in different formats. The CSV
(tabular separated plain text files) files can be opened with any
spreadsheet program like `LibreOffice <https://www.libreoffice.org/>`_
or Excel. For inspecting the mappings (in BAM format) and coverage
files (wiggle format) you can use a genome browser for example `IGB
<http://bioviz.org/igb/>`_ or `IGV
conditions. We will generate several output files in different
formats. The CSV (tabular separated plain text files) files can be
opened with any spreadsheet program like `LibreOffice
<https://www.libreoffice.org/>`_ or Excel. For inspecting the mappings
(in BAM format) and coverage files (wiggle format) you can use a
genome browser for example `IGB <http://bioviz.org/igb/>`_ or `IGV
<https://www.broadinstitute.org/igv/home>`_.

Generating a project
@@ -145,10 +143,10 @@ created which can be found in
``READemption_analysis/output/align/reports_and_stats/``. It contains
several mapping statistics for example how many reads are successfully
aligned in total and how many were aligned to each replicon. We see
that more than 98 % are mapped for each library. Sorted and indexed
alignements in BAM format are stored in
that more than 98 % of the reads are mapped for each library. Sorted
and indexed alignements in BAM format are stored in
``READemption_analysis/output/align/alignments``. We could load them
in a genome browser but instead we continue with the next step.
into a genome browser but instead we continue with the next step.


Generating coverage files
@@ -162,12 +160,12 @@ normalizations we use the subcommand ``coverage``.
$ reademption coverage -p 4 READemption_analysis

The sets are stored in subfolder of
``READemption_analysis/output/coverage/``. The most often set is
stored in ``coverage-tnoar_min_normalized``. Here the coverages are
normalized by the total number of aligned reads (TNOAR) of the
individual library and then multiplied by the lowest TNOAR value of
all libs. These files could be inspected for differential RNA-Seq
(dRNA-Seq - comparing libraries with and without Terminator
``READemption_analysis/output/coverage/``. The most oftenly used set
is stored in ``coverage-tnoar_min_normalized``. Here the coverage
values are normalized by the total number of aligned reads (TNOAR) of
the individual library and then multiplied by the lowest TNOAR value
of all libraries. These files could be inspected for differential
RNA-Seq (dRNA-Seq - comparing libraries with and without Terminator
Exonuclease treatment) data in order to determine transcriptional
start sites. They can be loaded in common genome browsers like `IGB
<http://bioviz.org/igb/>`_ or `IGV
@@ -233,7 +231,7 @@ Create plots
------------

Finally we generate plots that visualize the results of the different
steps. ``viz_align`` will create histograms of the read length
steps. ``viz_align`` creates histograms of the read length
distribution for the untreated and treated reads (saved in
``READemption_analysis/output/viz_align/``).

@@ -244,7 +242,7 @@ distribution for the untreated and treated reads (saved in
``viz_gene_quanti`` visualizes the gene wise countings. In our example
you will see that - as expected - the replicates are more similar to
each other than to the libs of the other condition. It also generates
bar plot that show the distribution of reads inside the different RNA
bar plots that show the distribution of reads inside the different RNA
classes.

::
33 changes: 15 additions & 18 deletions docs/source/index.rst
@@ -16,25 +16,21 @@ Table of content
READemption in a nutshell
=========================

*READemption* is a pipeline for the computational evaluation of
RNA-Seq data. It was originally developed at the `IMIB/ZINF
<http://www.imib-wuerzburg.de/>`_ to process dRNA-Seq reads (as
introduced by Sharma *et al.*, Nature, 2010 (`Pubmed
READemption is a pipeline for the computational evaluation of
RNA-Seq data. It was originally developed to process dRNA-Seq reads
(as introduced by Sharma et al., Nature, 2010 (`Pubmed
<http://www.ncbi.nlm.nih.gov/pubmed/20164839>`_)) originating from
bacterial samples. Meanwhile is has been extended to process data
generated in different experimental setups and originating from all
domains of life and is under `active development
<https://github.com/konrad/READemption>`_. The `subcommands
<subcommands.html>`_ which are provided by command-line interface
cover read processing and aligning, coverage plot generation, gene
expression quantification as well as differential gene expression
analysis. READemption was applied to analyze numerous data sets. In order to
set up analyses quickly READemption follows the principal of *convention
over configuration*: Once the input files are copied into defined
folders no further parameters have to be given. Still, READemption's
behavior can be adapted to specific needs of the user. This tools is
available as open source under open source license `ICSL
<https://en.wikipedia.org/wiki/ISC_license>`_.
generated in different experimental setups and from all domains of
life. The `functions <subcommands.html>`_ which are accessible via a
command-line interface cover read processing and aligning, coverage
calculation, gene expression quantification, differential gene
expression analysis as well as visualization. In order to set up and
perform analyses quickly READemption follows the principal of
*convention over configuration*: Once the input files are
copied/linked into defined folders no further parameters have to be
given. Still, READemption's behavior can be adapted to specific needs
of the user by parameters.

Download
========
@@ -63,4 +59,5 @@ Konrad U. Förstner, Jörg Vogel, Cynthia M. Sharma; (submitted).
Contact
=======

For question and requests feel free to contact Konrad Förstner <[email protected]>
For question and requests feel free to contact `Konrad Förstner
<http://konrad.foerstner.org/>`_ <[email protected]>
7 changes: 4 additions & 3 deletions docs/source/installation.rst
@@ -10,7 +10,7 @@ version. Also Python 2.7 or earlier Python 3 version can be used if
the backported library `futures
<https://pypi.python.org/pypi/futures>`_ is installed. In any case,
the third party packages `pysam <https://code.google.com/p/pysam>`_ as
well as `setuptool <https://pypi.python.org/pypi/setuptools>`_ and
well as `setuptools <https://pypi.python.org/pypi/setuptools>`_ and
`pip <http://www.pip-installer.org>`_ should be available on the
system in order to make the installation easy. READemption uses the
short read mapper `segemehl
@@ -60,14 +60,15 @@ Some comments:
tar xzf segemehl_0_1_7.tar.gz
cd segemehl_*/segemehl/ && make && cd ../../

Copying it to a location that is part of the ``PATH`` e.g ``/usr/bin/`` ...
Copying the executable to a location that is part of the ``PATH`` e.g
``/usr/bin/`` ...

::

sudo cp segemehl_0_1_7/segemehl/segemehl.x /usr/bin/segemehl.x
sudo cp segemehl_0_1_7/segemehl/lack.x /usr/bin/lack.x

... or the bin folder of you home directory::
... or the bin folder of your home directory::

mkdir ~/bin
cp segemehl_0_1_7/segemehl/segemehl.x ~/bin
13 changes: 7 additions & 6 deletions docs/source/subcommands.rst
@@ -168,12 +168,13 @@ positions. To turn off this behavior use
gene_quanti
-----------

With ``gene_quanti`` the number of reads to each annotation entry is
counted and the results are combined in tables. At least one GGF file
with the annotations have to be placed in ``input/annotations``. The
sequence ID of the sequenced must be precisely the same as the IDs
used in the reference sequence FASTA files. To specify the feature
classes (e.g. CDS, gene, rRNA, tRNA) that should be quantified the
With ``gene_quanti`` the number of reads overlapping with each of the
annotation entries is counted and the results are combined in
tables. At least one GGF3 file with annotations has to be placed in
``input/annotations``. The sequence ID of the sequenced must be
precisely the same as the IDs used in the reference sequence FASTA
files. To specify the feature classes (the third column in the GFF3
file e.g. CDS, gene, rRNA, tRNA) that should be quantified the
parameter ``--features`` can be used. Otherwise countings for all
annotation entries are generated. Per default sense and anti-sense
overlaps are counted and separately listed.
6 changes: 3 additions & 3 deletions setup.py
@@ -5,17 +5,17 @@

setup(
name='READemption',
version='0.2.2',
version='0.2.3',
packages=['reademptionlib', 'tests'],
author='Konrad U. Förstner',
author_email='[email protected]',
description='READemption - A RNA-Seq Analysis Pipeline',
description='A RNA-Seq Analysis Pipeline',
url='',
install_requires=[
"pysam >= 0.7.7"
],
scripts=['bin/reademption'],
license='LICENSE.txt',
license='ISC License (ISCL)',
long_description=open('README.rst').read(),
classifiers=[
'License :: OSI Approved :: ISC License (ISCL)',

0 comments on commit a681032
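
The revised text in `docs/source/example_analysis.rst` describes the `coverage-tnoar_min_normalized` set: each coverage value is normalized by the total number of aligned reads (TNOAR) of its library and then multiplied by the lowest TNOAR of all libraries. A minimal sketch of that arithmetic follows; the function name `tnoar_min_normalize`, the dictionary layout and the sample numbers are assumptions for illustration and do not reflect READemption's internal code.

```python
def tnoar_min_normalize(coverages, tnoar):
    """Scale coverage values so libraries are comparable.

    coverages: maps a library name to a list of raw coverage values.
    tnoar: maps a library name to its total number of aligned reads.
    Each value is divided by the library's own TNOAR and multiplied by
    the smallest TNOAR among all libraries.
    """
    min_tnoar = min(tnoar.values())
    return {
        lib: [value * min_tnoar / tnoar[lib] for value in values]
        for lib, values in coverages.items()
    }


# Illustrative numbers only: library "b" has twice as many aligned reads
# as library "a", so its values are halved while "a" is left unchanged.
raw = {"a": [10, 20], "b": [30, 60]}
aligned_reads = {"a": 100, "b": 200}
normalized = tnoar_min_normalize(raw, aligned_reads)
print(normalized)
```

Multiplying by the lowest TNOAR keeps the library with the fewest aligned reads at its raw values and scales every other library down to that depth.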
