EAGLE is a program to map minimal Relative Absent Words (mRAWs). EAGLE identifies and localizes the mRAWs over a range of k-mer sizes. It runs in a command-line environment and uses multithreading to reduce computation time. It contains extensions to estimate GC distributions and to create plots automatically (Gnuplot). It works on FASTA data without size limitations.
CMake must be installed to compile EAGLE. CMake can be downloaded from the CMake webpage (http://www.cmake.org/) or installed through an appropriate package manager. The following instructions show how to download and compile EAGLE manually:
git clone https://github.com/pratas/eagle.git
cd eagle/src/
cmake .
make
The external, complementary dependencies used to download, align, and visualize the data are installed with conda.
Steps to install conda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
Additional instructions can be found here:
https://docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html
To install the dependencies using conda:
conda install -c cobilab gto --yes
conda install -c bioconda tabix --yes
conda install -c bioconda bowtie2 --yes
conda install -c bioconda samtools --yes
conda install -c bioconda entrez-direct --yes
conda install -c bioconda/label/cf201901 entrez-direct --yes
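After installation, it can be useful to confirm that the tools are reachable from the shell. The sketch below is one way to do that and is not part of EAGLE. The function name `missing_tools` is hypothetical, and the executable names are assumptions: `tabix`, `bowtie2`, and `samtools` are the usual program names of those packages, and `esearch` and `efetch` are two of the programs shipped with entrez-direct. GTO is left out because it installs many separate programs.

```shell
#!/bin/sh
# missing_tools (hypothetical helper): print each name given as an argument
# that cannot be found on the PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed executable names for the conda packages installed above.
missing=$(missing_tools tabix bowtie2 samtools esearch efetch)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Not found on PATH:" $missing
else
  echo "All listed tools were found."
fi
```

If a tool is reported as missing, check that the conda environment in which it was installed is active.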
Run EAGLE using:
./EAGLE -v -t -min 11 -max 14 -p -r Human.fna SARS-CoV-2.fa
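The run command can be wrapped in a small script that checks the k-mer range first. The following is a minimal sketch, not part of EAGLE: the function names `is_uint` and `eagle_cmd` are hypothetical, and only the `-v`, `-min`, and `-max` options listed in the help menu are used. It prints the command instead of running it, so the result can be inspected before execution.

```shell
#!/bin/sh
# is_uint (hypothetical helper): succeed only if the argument is a
# non-empty string of decimal digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# eagle_cmd (hypothetical helper): validate a k-mer range and print the
# corresponding EAGLE command line. Arguments: MIN MAX REFERENCE TARGET.
eagle_cmd() {
  min=$1 max=$2 ref=$3 target=$4
  if ! is_uint "$min" || ! is_uint "$max"; then
    echo "error: k-mer sizes must be non-negative integers" >&2
    return 1
  fi
  if [ "$min" -gt "$max" ]; then
    echo "error: minimum k-mer size exceeds maximum" >&2
    return 1
  fi
  printf './EAGLE -v -min %s -max %s %s %s\n' "$min" "$max" "$ref" "$target"
}

# Example with the file names used above.
eagle_cmd 11 14 Human.fna SARS-CoV-2.fa
```

The printed command is meant to be run from the directory that contains the EAGLE binary, with the reference file first and the target file second.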
To see the possible options, type:
./EAGLE
or
./EAGLE -h
Either command prints the following options:
NAME
EAGLE v2.3 2015-2020
Efficient computation of minimal Relative Absent
Words (mRAWs) and its associated GC distributions,
profiles, and patterns.
AUTHORS
D. Pratas and J. M. Silva.
SYNOPSIS
./EAGLE [OPTION]... [FILE] [FILE]
SAMPLE
Run: ./EAGLE -v -min 11 -max 16 human.fa SARS-CoV2.fa
DESCRIPTION
Localization and quantification of minimal Relative
Absent Words (mRAWs) and GC associated measures
-h, --help
usage guide (help menu)
-V, --version
display program and version information
-f, --force
force mode. Overwrites old files
-v, --verbose
verbose mode (more information)
-vv, --very-verbose
very verbose mode (much more information)
-t, --threads
does NOT use threads if flag is set (slower)
-i, --ignore-ir
does NOT use inverted repeats if flag is set
-c, --ignore-profiles
does NOT compute GC profiles
-o, --stdout
write overall statistics to standard output
-p, --plots
print Shell code to generate plots (gnuplot)
-min [NUMBER], --minimum [NUMBER]
k-mer minimum size (usually 10)
-max [NUMBER], --maximum [NUMBER]
k-mer maximum size (usually 16)
[FILE]
Input FASTA reference (e.g. human) -- MANDATORY.
This content will be loaded in the models.
[FILE]
Input FASTA target (e.g. SARS-CoV-2) -- MANDATORY.
The mRAWs will be mapped on this content file.
COPYRIGHT
Copyright (C) 2014-2020, IEETA/DETI, University of Aveiro.
This is a Free software, under GPLv3. You may redistribute
copies of it under the terms of the GNU - General Public
License v3 <http://www.gnu.org/licenses/gpl.html>. There
is NOT ANY WARRANTY, to the extent permitted by law.
Version 2.2:
- D. Pratas, J. M. Silva. Persistent minimal sequences of SARS-CoV-2. Bioinformatics (2020): btaa686.
Version 1.0:
- R. M. Silva, D. Pratas, L. Castro, A. J. Pinho & P. J. S. G. Ferreira. Three minimal sequences found in Ebola virus genomes and absent from human DNA. Bioinformatics (2015): btv189.
For any issue, let us know through the issue tracker of the GitHub repository (https://github.com/pratas/eagle).
GPL v3.
For more information:
http://www.gnu.org/licenses/gpl-3.0.html