Merge branch 'dev'

foerstner-lab · Apr 1, 2014 · a681032
2 parents 295d4fa + de91557

Showing 12 changed files with 94 additions and 88 deletions.
diff --git a/CHANGELOG.txt b/CHANGELOG.txt
@@ -1,3 +1,7 @@
+v0.2.3 (2014-04-01)
+- Improve documentation
+- Modify README
+- Extend Makefile
 v0.2.2 (2014-03-31)
 - Fix setup.py
 - Typos fixing

diff --git a/LICENSE.txt b/LICENSE.txt
@@ -1,4 +1,4 @@
-Copyright (c) 2011-2013, Konrad Förstner <[email protected]>
+Copyright (c) 2011-2014, Konrad Förstner <[email protected]>
 
 Permission to use, copy, modify, and/or distribute this software for
 any purpose with or without fee is hereby granted, provided that the

diff --git a/Makefile b/Makefile
@@ -6,10 +6,10 @@ coverage:
 	coverage3 report
 
 package:
-	python3.3 setup.py sdist
+	python3 setup.py sdist
 
 package_to_pypi:
-	python setup.py sdist upload
+	python3 setup.py sdist upload
 	@echo "Go to https://pypi.python.org/pypi/READemption/"
 
 html_doc:
@@ -29,7 +29,9 @@ readme_html:
 	pandoc --from=markdown --to=html README.md -o README.html
 
 readme_rst:
-	pandoc --from=markdown --to=rst README.md -o README.rst
+	grep -v "^\[!" README.md | sed -e "1d" > README.md.tmp
+	pandoc --from=markdown --to=rst README.md.tmp -o README.rst
+	rm README.md.tmp
 
 readme_clean:
 	rm -f README.tex README.html README.rst
@@ -41,11 +43,13 @@ pylint:
 new_release:
 	@echo "* Please do this manually:"
 	@echo "* ------------------------"
+	@echo "* Create/checkout a release branch"
 	@echo "* Change bin/reademption"
 	@echo "* Change setup.py"
 	@echo "* Change docs/source/conf.py"
 	@echo "* Change CHANGELOG.txt"
 	@echo "* Commit changes e.g. 'git commit -m 'Set version to 0.2.0'"	
 	@echo "* Tag the commit e.g. 'git tag -a v0.1.9 -m 'version v0.1.9''"
+	@echo "* Merge release into dev and master"
 	@echo "* After pushing generate a new release based on this tag at"
 	@echo "  https://github.com/konrad/READemption/releases/new"
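The new `readme_rst` recipe pre-filters `README.md` before handing it to pandoc. As a minimal sketch of what that `grep -v "^\[!" | sed -e "1d"` pipeline does, the same filtering can be written in Python. The function name is hypothetical and not part of the repository, and the assumption that the dropped lines are Markdown badge lines is inferred from the `[!` prefix only.

```python
def strip_badges_and_first_line(markdown_text: str) -> str:
    """Mimic: grep -v "^\\[!" README.md | sed -e "1d".

    Hypothetical helper for illustration; the Makefile itself uses grep and sed.
    """
    # grep -v "^\[!": drop every line that starts with "[!"
    # (presumably image/badge links at the top of the README).
    kept = [line for line in markdown_text.splitlines()
            if not line.startswith("[!")]
    # sed -e "1d": drop the first line of what remains.
    remaining = kept[1:]
    return "\n".join(remaining) + ("\n" if remaining else "")
```

The filtered text corresponds to the content of the temporary file `README.md.tmp` that the recipe passes to pandoc and removes afterwards.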
diff --git a/README.md b/README.md
@@ -6,19 +6,18 @@ About
 -----
 
 READemption is a pipeline for the computational evaluation of RNA-Seq
-data. It was originally developed at the IMIB/ZINF to process dRNA-Seq
-reads (as introduced by Sharma et al., Nature, 2010) originating from
-bacterial samples. Meanwhile is has been extended to process data
-generated in different experimental setups and originating from all
-domains of life and is under active development. The subcommands which
-are accessible viaq command-line interface cover read processing and
-aligning, coverage plot generation, gene expression quantification as
-well as differential gene expression analysis. READemption was applied
-to analyze numerous data sets. In order to set up analyses quickly
+data. It was originally developed to process dRNA-Seq reads (as
+introduced by Sharma et al., Nature, 2010) originating from bacterial
+samples. Meanwhile it has been extended to process data generated in
+different experimental setups and from all domains of life. The
+functions which are accessible via a command-line interface cover read
+processing and aligning, coverage calculation, gene expression
+quantification, differential gene expression analysis as well as
+visualization. In order to set up and perform analyses quickly
 READemption follows the principal of "convention over configuration":
-Once the input files are copied into defined folders no further
+Once the input files are copied/linked into defined folders no further
 parameters have to be given. Still, READemption's behavior can be
-adapted to specific needs of the user.
+adapted to specific needs of the user by parameters.
 
 Documentation
 -------------
@@ -33,19 +32,20 @@ Short version (if you have all the requirements installed):
     $ pip install READemption
 
 [Long version](http://pythonhosted.org/READemption/installation.html)
-(what are the requirements and how do you get them)
+including a description of the requirements and how to get them.
 
 License
 -------
 
-[ICSL](https://en.wikipedia.org/wiki/ISC_license) - see LICENSE.txt
+[ISCL](https://en.wikipedia.org/wiki/ISC_license)
+(Internet Systems Consortium license ~ simplified BSD license) - see LICENSE.txt
 
 Development
 -----------
 
 * If possible follow the principal of "convention over
-  configuration". This means input file are into a fixed location and
-  the result file are placed in fixed location.
+  configuration". This means input files are copied/linked into a fixed
+  location and the resulting files are placed in fixed locations.
 
 * The classes should be path agnostic as far a possible. The controller
   is taking care of that and calls them adequately.

diff --git a/README.rst b/README.rst
@@ -2,19 +2,18 @@ About
 -----
 
 READemption is a pipeline for the computational evaluation of RNA-Seq
-data. It was originally developed at the IMIB/ZINF to process dRNA-Seq
-reads (as introduced by Sharma et al., Nature, 2010) originating from
-bacterial samples. Meanwhile is has been extended to process data
-generated in different experimental setups and originating from all
-domains of life and is under active development. The subcommands which
-are accessible viaq command-line interface cover read processing and
-aligning, coverage plot generation, gene expression quantification as
-well as differential gene expression analysis. READemption was applied
-to analyze numerous data sets. In order to set up analyses quickly
+data. It was originally developed to process dRNA-Seq reads (as
+introduced by Sharma et al., Nature, 2010) originating from bacterial
+samples. Meanwhile it has been extended to process data generated in
+different experimental setups and from all domains of life. The
+functions which are accessible via a command-line interface cover read
+processing and aligning, coverage calculation, gene expression
+quantification, differential gene expression analysis as well as
+visualization. In order to set up and perform analyses quickly
 READemption follows the principal of "convention over configuration":
-Once the input files are copied into defined folders no further
+Once the input files are copied/linked into defined folders no further
 parameters have to be given. Still, READemption's behavior can be
-adapted to specific needs of the user.
+adapted to specific needs of the user by parameters.
 
 Documentation
 -------------
@@ -32,19 +31,20 @@ Short version (if you have all the requirements installed):
     $ pip install READemption
 
 `Long version <http://pythonhosted.org/READemption/installation.html>`__
-(what are the requirements and how do you get them)
+including a description of the requirements and how to get them.
 
 License
 -------
 
-`ICSL <https://en.wikipedia.org/wiki/ISC_license>`__ - see LICENSE.txt
+`ISCL <https://en.wikipedia.org/wiki/ISC_license>`__ (Internet Systems
+Consortium license ~ simplified BSD license) - see LICENSE.txt
 
 Development
 -----------
 
 -  If possible follow the principal of "convention over configuration".
-   This means input file are into a fixed location and the result file
-   are placed in fixed location.
+   This means input files are copied/linked into a fixed location and the
+   resulting files are placed in fixed locations.
 
 -  The classes should be path agnostic as far a possible. The controller
    is taking care of that and calls them adequately.
@@ -64,3 +64,4 @@ Development
    -  hotfix branches - branched off from master and merged back into
       dev and master
 
+
diff --git a/bin/reademption b/bin/reademption
@@ -6,10 +6,10 @@ import argparse
 from reademptionlib.controller import Controller
 
 __author__ = "Konrad Foerstner <[email protected]>"
-__copyright__ = "2011-2013 by Konrad Foerstner <[email protected]>"
+__copyright__ = "2011-2014 by Konrad Foerstner <[email protected]>"
 __license__ = "ISC license"
 __email__ = "[email protected]"
-__version__ = "0.2.2"
+__version__ = "0.2.3"
 
 def main():
     parser = argparse.ArgumentParser()

diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -50,7 +50,7 @@
 # The short X.Y version.
 version = '0.2'
 # The full version, including alpha/beta/rc tags.
-release = '0.2.2'
+release = '0.2.3'
 
 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.

diff --git a/docs/source/example_analysis.rst b/docs/source/example_analysis.rst
@@ -2,18 +2,16 @@ Performing an example analysis
 ==============================
 
 Here you will be guided trough a small example analysis using a
-publicly available RNA-Seq data set. We will use a data set from NCBI
-GEO that was part of a publication by `Kröger et
+publicly available RNA-Seq data set from NCBI GEO that was part of a
+publication by `Kröger et
 al. <http://www.ncbi.nlm.nih.gov/pubmed/24331466>`_. This is a
 transcriptome analysis of *Salmonella* Typhimurium SL1344 in different
-conditions.
-
-We will generate several output files in different formats. The CSV
-(tabular separated plain text files) files can be opened with any
-spreadsheet program like `LibreOffice <https://www.libreoffice.org/>`_
-or Excel. For inspecting the mappings (in BAM format) and coverage
-files (wiggle format) you can use a genome browser for example `IGB
-<http://bioviz.org/igb/>`_ or `IGV
+conditions. We will generate several output files in different
+formats. The CSV files (tab-separated plain text files) can be
+opened with any spreadsheet program like `LibreOffice
+<https://www.libreoffice.org/>`_ or Excel. For inspecting the mappings
+(in BAM format) and coverage files (wiggle format) you can use a
+genome browser for example `IGB <http://bioviz.org/igb/>`_ or `IGV
 <https://www.broadinstitute.org/igv/home>`_.
 
 Generating a project
@@ -145,10 +143,10 @@ created which can be found in
 ``READemption_analysis/output/align/reports_and_stats/``. It contains
 several mapping statistics for example how many reads are successfully
 aligned in total and how many were aligned to each replicon. We see
-that more than 98 % are mapped for each library. Sorted and indexed
-alignements in BAM format are stored in
+that more than 98 % of the reads are mapped for each library. Sorted
+and indexed alignments in BAM format are stored in
 ``READemption_analysis/output/align/alignments``. We could load them
-in a genome browser but instead we continue with the next step.
+into a genome browser but instead we continue with the next step.
 
 
 Generating coverage files
@@ -162,12 +160,12 @@ normalizations we use the subcommand ``coverage``.
    $ reademption coverage -p 4 READemption_analysis
 
 The sets are stored in subfolder of
-``READemption_analysis/output/coverage/``. The most often set is
-stored in ``coverage-tnoar_min_normalized``. Here the coverages are
-normalized by the total number of aligned reads (TNOAR) of the
-individual library and then multiplied by the lowest TNOAR value of
-all libs. These files could be inspected for differential RNA-Seq
-(dRNA-Seq - comparing libraries with and without Terminator
+``READemption_analysis/output/coverage/``. The most commonly used set
+is stored in ``coverage-tnoar_min_normalized``. Here the coverage
+values are normalized by the total number of aligned reads (TNOAR) of
+the individual library and then multiplied by the lowest TNOAR value
+of all libraries. These files could be inspected for differential
+RNA-Seq (dRNA-Seq - comparing libraries with and without Terminator
 Exonuclease treatment) data in order to determine transcriptional
 start sites. They can be loaded in common genome browsers like `IGB
 <http://bioviz.org/igb/>`_ or `IGV
@@ -233,7 +231,7 @@ Create plots
 ------------
 
 Finally we generate plots that visualize the results of the different
-steps. ``viz_align`` will create histograms of the read length
+steps. ``viz_align`` creates histograms of the read length
 distribution for the untreated and treated reads (saved in
 ``READemption_analysis/output/viz_align/``).
 
@@ -244,7 +242,7 @@ distribution for the untreated and treated reads (saved in
 ``viz_gene_quanti`` visualizes the gene wise countings. In our example
 you will see that - as expected - the replicates are more similar to
 each other than to the libs of the other condition. It also generates
-bar plot that show the distribution of reads inside the different RNA
+bar plots that show the distribution of reads inside the different RNA
 classes.
 
 ::
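The revised coverage paragraph in this file describes the ``coverage-tnoar_min_normalized`` set: each coverage value is divided by the total number of aligned reads (TNOAR) of its library and then multiplied by the lowest TNOAR of all libraries. A minimal sketch of that arithmetic follows. The function and argument names are assumptions made for illustration, and this is not READemption's actual implementation.

```python
def tnoar_min_normalize(coverages, tnoar):
    """Scale per-library coverage values as described in the text.

    coverages: dict mapping library name -> list of raw coverage values
    tnoar:     dict mapping library name -> total number of aligned reads
    Both structures are hypothetical; READemption stores coverage in wiggle files.
    """
    min_tnoar = min(tnoar.values())  # lowest TNOAR value of all libraries
    return {
        lib: [value * min_tnoar / tnoar[lib] for value in values]
        for lib, values in coverages.items()
    }
```

With this scaling the library with the lowest TNOAR keeps its raw values, and all other libraries are scaled down to the same sequencing depth.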

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -16,25 +16,21 @@ Table of content
 READemption in a nutshell
 =========================
 
-*READemption* is a pipeline for the computational evaluation of
-RNA-Seq data. It was originally developed at the `IMIB/ZINF
-<http://www.imib-wuerzburg.de/>`_ to process dRNA-Seq reads (as
-introduced by Sharma *et al.*, Nature, 2010 (`Pubmed
+READemption is a pipeline for the computational evaluation of
+RNA-Seq data. It was originally developed to process dRNA-Seq reads
+(as introduced by Sharma et al., Nature, 2010 (`Pubmed
 <http://www.ncbi.nlm.nih.gov/pubmed/20164839>`_)) originating from
 bacterial samples. Meanwhile is has been extended to process data
-generated in different experimental setups and originating from all
-domains of life and is under `active development
-<https://github.com/konrad/READemption>`_. The `subcommands
-<subcommands.html>`_ which are provided by command-line interface
-cover read processing and aligning, coverage plot generation, gene
-expression quantification as well as differential gene expression
-analysis. READemption was applied to analyze numerous data sets. In order to
-set up analyses quickly READemption follows the principal of *convention
-over configuration*: Once the input files are copied into defined
-folders no further parameters have to be given. Still, READemption's
-behavior can be adapted to specific needs of the user. This tools is
-available as open source under open source license `ICSL
-<https://en.wikipedia.org/wiki/ISC_license>`_.
+generated in different experimental setups and from all domains of
+life. The `functions <subcommands.html>`_ which are accessible via a
+command-line interface cover read processing and aligning, coverage
+calculation, gene expression quantification, differential gene
+expression analysis as well as visualization. In order to set up and
+perform analyses quickly READemption follows the principle of
+*convention over configuration*: Once the input files are
+copied/linked into defined folders no further parameters have to be
+given. Still, READemption's behavior can be adapted to specific needs
+of the user by parameters.
 
 Download
 ========
@@ -63,4 +59,5 @@ Konrad U. Förstner, Jörg Vogel, Cynthia M. Sharma; (submitted).
 Contact
 =======
 
-For question and requests feel free to contact Konrad Förstner <[email protected]>
+For questions and requests feel free to contact `Konrad Förstner
+<http://konrad.foerstner.org/>`_ <[email protected]>
diff --git a/docs/source/installation.rst b/docs/source/installation.rst
@@ -10,7 +10,7 @@ version. Also Python 2.7 or earlier Python 3 version can be used if
 the backported library `futures
 <https://pypi.python.org/pypi/futures>`_ is installed. In any case,
 the third party packages `pysam <https://code.google.com/p/pysam>`_ as
-well as `setuptool <https://pypi.python.org/pypi/setuptools>`_ and
+well as `setuptools <https://pypi.python.org/pypi/setuptools>`_ and
 `pip <http://www.pip-installer.org>`_ should be available on the
 system in order to make the installation easy. READemption uses the
 short read mapper `segemehl
@@ -60,14 +60,15 @@ Some comments:
   tar xzf segemehl_0_1_7.tar.gz
   cd segemehl_*/segemehl/ && make && cd ../../
 
-Copying it to a location that is part of the ``PATH`` e.g ``/usr/bin/`` ... 
+Copying the executable to a location that is part of the ``PATH`` e.g.
+``/usr/bin/`` ...
 
 ::
 
   sudo cp segemehl_0_1_7/segemehl/segemehl.x /usr/bin/segemehl.x
   sudo cp segemehl_0_1_7/segemehl/lack.x /usr/bin/lack.x
 
-... or the bin folder of you home directory::
+... or the bin folder of your home directory::
 
   mkdir ~/bin
   cp segemehl_0_1_7/segemehl/segemehl.x ~/bin

diff --git a/docs/source/subcommands.rst b/docs/source/subcommands.rst
@@ -168,12 +168,13 @@ positions. To turn off this behavior use
 gene_quanti
 -----------
 
-With ``gene_quanti`` the number of reads to each annotation entry is
-counted and the results are combined in tables. At least one GGF file
-with the annotations have to be placed in ``input/annotations``. The
-sequence ID of the sequenced must be precisely the same as the IDs
-used in the reference sequence FASTA files. To specify the feature
-classes (e.g. CDS, gene, rRNA, tRNA) that should be quantified the
+With ``gene_quanti`` the number of reads overlapping with each of the
+annotation entries is counted and the results are combined in
+tables. At least one GFF3 file with annotations has to be placed in
+``input/annotations``. The sequence IDs in the GFF3 files must be
+precisely the same as the IDs used in the reference sequence FASTA
+files. To specify the feature classes (the third column in the GFF3
+file e.g. CDS, gene, rRNA, tRNA) that should be quantified the
 parameter ``--features`` can be used. Otherwise countings for all
 annotation entries are generated. Per default sense and anti-sense
 overlaps are counted and separately listed.
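The reworded ``gene_quanti`` text states that ``--features`` selects entries by the third column of the GFF3 file. As a rough sketch of that selection, assuming standard nine-column, tab-separated GFF3 lines, the filtering could look as follows. The function name is hypothetical and the code is not taken from READemption.

```python
def filter_gff3_by_feature(lines, features):
    """Yield the columns of GFF3 entries whose feature class is requested.

    lines:    iterable of GFF3 lines (e.g. an open file handle)
    features: feature classes to keep, e.g. ["CDS", "tRNA"]
    """
    wanted = set(features)
    for line in lines:
        # Skip directives/comments ("#...") and empty lines.
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        # Column 3 (index 2) holds the feature class, e.g. CDS, gene, rRNA.
        if len(columns) == 9 and columns[2] in wanted:
            yield columns
```

The first column of each yielded entry is the sequence ID that, according to the text, has to match the IDs in the reference FASTA files.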

diff --git a/setup.py b/setup.py
@@ -5,17 +5,17 @@
 
 setup(
     name='READemption',
-    version='0.2.2',
+    version='0.2.3',
     packages=['reademptionlib', 'tests'],
     author='Konrad U. Förstner',
     author_email='[email protected]',
-    description='READemption - A RNA-Seq Analysis Pipeline',
+    description='An RNA-Seq Analysis Pipeline',
     url='',
     install_requires=[
         "pysam >= 0.7.7"
     ],
     scripts=['bin/reademption'],
-    license='LICENSE.txt',
+    license='ISC License (ISCL)',
     long_description=open('README.rst').read(),
     classifiers=[
         'License :: OSI Approved :: ISC License (ISCL)',
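This commit bumps the version string by hand in `bin/reademption`, `docs/source/conf.py` and `setup.py`, as the `new_release` checklist in the Makefile requires. A small consistency check along the following lines could catch a missed file. It is only a sketch: the function names are hypothetical, the regular expressions assume the assignment styles visible in this diff, and nothing like it exists in the repository as far as this commit shows.

```python
import re

# Patterns assume the assignment styles shown in this diff (an assumption).
VERSION_PATTERNS = {
    "bin/reademption": r'^__version__\s*=\s*["\']([^"\']+)["\']',
    "setup.py": r'^\s*version\s*=\s*["\']([^"\']+)["\']',
    "docs/source/conf.py": r'^release\s*=\s*["\']([^"\']+)["\']',
}


def extract_versions(file_texts):
    """Return {path: version string or None} for the given file contents."""
    versions = {}
    for path, pattern in VERSION_PATTERNS.items():
        match = re.search(pattern, file_texts[path], flags=re.MULTILINE)
        versions[path] = match.group(1) if match else None
    return versions


def versions_agree(versions):
    """True if every file yielded a version and all versions are identical."""
    values = set(versions.values())
    return None not in values and len(values) == 1
```

Reading the three files and passing their contents as a dictionary keyed by path is left to the caller, so the check stays independent of the working directory.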