GenomeQC: Genome Assembly and Annotation Metrics

How to cite GenomeQC

GenomeQC: A quality assessment tool for genome assemblies and gene structure annotations. Nancy Manchanda, John L. Portwood II, Margaret R. Woodhouse, Arun S. Seetharam, Carolyn J. Lawrence-Dill, Carson M. Andorf, Matthew B. Hufford. BMC Genomics (2020).

https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-6568-2

GenomeQC: Genome Assembly and Annotation Metrics

GenomeQC generates descriptive summaries with intuitive graphics for genome assemblies and structural annotations. It also benchmarks user-supplied assemblies and annotations against publicly available reference genomes of the user's choice. It is optimized for small and medium-sized genomes (<2.5 Gb) and has pre-computed results for several maize genomes.

A Dockerfile (with the associated scripts) is available to run the pipeline without installing any dependencies.

Installation

Bioinformatics software dependencies

The GenomeQC web application calls the following bioinformatics tools and database to perform its computations. These tools need to be installed and configured so that they are on the path of the working directory.

At the time of release, this application was tested with:

BUSCO v3.0.2 (https://gitlab.com/ezlab/busco) software and its dependencies
NCBI BLAST+ v2.28.0 (https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+)
HMMER v3.1b2 (http://hmmer.org/)
Augustus v3.2.1 (http://bioinf.uni-greifswald.de/augustus/)
Gffread v0.9.12 (http://ccb.jhu.edu/software/stringtie/gff.shtml#gffread_dl)
NCBI UniVec Database (ftp://ftp.ncbi.nlm.nih.gov/pub/UniVec/)
Blobtools v1.1 taxify (https://blobtools.readme.io/docs/taxify)
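
Before launching the application, it can help to confirm that these tools are reachable on the PATH. The following is a minimal sketch, not part of GenomeQC itself. The executable names (`run_BUSCO.py`, `blastn`, `hmmsearch`, `augustus`, `gffread`, `blobtools`) are assumptions based on how these tools are commonly installed, and the helper `find_missing_tools` is hypothetical. Adjust the names to match your installation.

```python
import shutil

# Assumed executable names for each dependency; these are not taken from
# the GenomeQC scripts and may differ on your system.
REQUIRED_TOOLS = {
    "BUSCO": "run_BUSCO.py",
    "NCBI BLAST+": "blastn",
    "HMMER": "hmmsearch",
    "Augustus": "augustus",
    "Gffread": "gffread",
    "Blobtools": "blobtools",
}


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools whose executable cannot be found on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

This check only confirms that an executable with the given name exists. It does not verify the installed version or that the UniVec database has been downloaded.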

GenomeQC components:

GenomeQC is a collection of R and Python scripts. The R scripts need to be placed in the directory of the R Shiny package.

The two main scripts necessary to run the application are ui.R and server.R.

ui.R: This script defines and lays out the user interface.

server.R: This script, which can be found in the scripts folder of the GenomeQC GitHub repository, calls various packages and Python and Bash scripts to calculate the different metrics.

Running GenomeQC requires a Linux server, R Shiny (version 1.5.9) and Python (version 3.6). It also requires the following packages:

R packages
tools	R.utils	shinyWidgets
seqinr	tidyverse	shinyBS
Biostrings	gridExtra	reshape
stringr	grid	cowplot

Python packages
sys	traceback	Bio.Blast.Applications
os	subprocess	glob (iglob)
Bio	statistics	pandas
re	numpy	plotly.offline
argparse	collections	plotly.graph_objs
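
Most of the Python modules listed above ship with the standard library; the third-party ones are Biopython (`Bio`), pandas, NumPy and plotly. As a rough sketch, the snippet below reports which of those third-party modules cannot be imported in the current environment. The helper `find_missing_modules` is hypothetical and is not part of GenomeQC.

```python
import importlib.util

# Top-level names of the third-party modules from the table above.
# "Bio" is provided by the Biopython distribution.
REQUIRED_MODULES = ["Bio", "pandas", "numpy", "plotly"]


def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing Python modules: " + ", ".join(missing))
    else:
        print("All listed Python modules are importable.")
```

Only top-level names are checked here, so submodules such as `plotly.offline` are covered by the check for `plotly`.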

Operating Instructions

Three modes are available:

Compare reference genomes:

This section outputs various pre-computed assembly and annotation metrics from a user-selected list of reference genomes.

Analyze your genome assembly:

This section lets users analyze their own genome assembly and benchmark the results against the pre-computed reference genomes.

Analyze your genome annotations:

This section lets users analyze their own genome annotations and benchmark the results against the pre-computed reference annotations.

For more details, see the user guide: GenomeQC_userguide.pdf

Licensing

GNU GPL V3.

Acknowledgements

Funding: This work was supported by the United States Department of Agriculture (USDA).

Please send questions to: [email protected]
