modelmatcher: Rapid identification of evolutionary models

This tool reads multiple sequence alignments and determines a suitable sequence evolution model for your phylogenetic analysis.

Usage

Example usage:

$ modelmatcher inputfile.fasta

The input file is a multiple sequence alignment in one of these common formats:

FASTA
Clustal
NEXUS
PHYLIP
STOCKHOLM

The output is a list of models, ordered by fit to the data, each with its modelmatcher score. The tool predicts the base model (such as JTT, WAG, or LG) and also whether the model should be adapted to the alignment's amino acid composition (i.e., JTT+F, WAG+F, etc.).
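To make the "+F" idea concrete, here is a minimal sketch of estimating amino acid frequencies from an alignment. It illustrates the concept only and is not modelmatcher's implementation; the function name `amino_acid_frequencies` and the toy alignment are assumptions made for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequences):
    """Return the relative frequency of each of the 20 standard amino acids.

    Gap characters and ambiguity codes are ignored.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no amino acids found in the alignment")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Toy alignment (hypothetical data, three aligned sequences).
msa = ["MKV-LA", "MKVALA", "MRV-LG"]
freqs = amino_acid_frequencies(msa)
print(freqs["M"], freqs["A"])
```

A "+F" model replaces the base model's built-in equilibrium frequencies with frequencies like these, estimated from the alignment being analysed.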

If you want to feed the prediction from modelmatcher directly to a phylogenetic inference program, consider using the -of option:

iqtree  -s infile.phy  -m $(modelmatcher -of iqtree infile.phy)

The dollar-parenthesis construct is a shell command substitution: the inner command is run first, and its output, a single model name, is inserted into the outer command line. Only models accepted by the given application (here: IQ-TREE) are output.
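The same chaining can be done from a script. The sketch below assumes that modelmatcher and iqtree are both on your PATH and that `modelmatcher -of iqtree` prints a single model name, as described above. The function names `predict_model`, `iqtree_command`, and `run_pipeline` are hypothetical helpers for this example and are not part of modelmatcher.

```python
import subprocess


def predict_model(alignment, tool="iqtree"):
    """Run modelmatcher and return the single model name it prints."""
    result = subprocess.run(
        ["modelmatcher", "-of", tool, alignment],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if modelmatcher fails
    )
    return result.stdout.strip()


def iqtree_command(alignment, model):
    """Build the IQ-TREE argument list for an alignment and a model name."""
    return ["iqtree", "-s", alignment, "-m", model]


def run_pipeline(alignment):
    """Predict a model with modelmatcher, then run IQ-TREE with it."""
    model = predict_model(alignment)
    subprocess.run(iqtree_command(alignment, model), check=True)
```

Calling `run_pipeline("infile.phy")` is then equivalent to the one-line shell command above.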

Options

Optional options:

  -h, --help            show this help message and exit
  -f {guess,fasta,clustal,nexus,phylip,stockholm}, --format {guess,fasta,clustal,nexus,phylip,stockholm}
                        Specify what sequence type to assume. Be specific if
                        the file is not recognized automatically. When reading
                        from stdin, the format is always guessed to be FASTA.
                        Default: guess
  -m filename, --model filename
                        Add the model given in the file to the comparisons.
  -nf, --no-F-testing   Do not try +F models, i.e., do not test with amino
                        acid frequencies estimated from the MSA.
  -s int, --sample-size int
                        For alignments with many sequences, decide on an upper
                        bound of sequence pairs to use from the MSA. The
                        computational complexity grows quadratically in the
                        number of sequences, so a choice of 5000 bounds the
                        growth for MSAs with more than 100 sequences.
  -of {tabular,json,iqtree,raxml,phyml,mrbayes}, --output_format {tabular,json,iqtree,raxml,phyml,mrbayes}
                        Choose output format. Tabular format is default. JSON
                        is for convenient later parsing, with some additional
                        meta-data added. For one-line output convenient for
                        immediate use by inference tools, consider raxml and
                        similar choices. Note that the PhyML and MrBayes
                        options are restricted to their implemented models.
                        Although PhyML supports the +F models (using the "-f
                        e" option), this is not reflected in the output from
                        "modelmatcher -of phyml ..." at this time.
  --list-models         Output a list of models implemented in modelmatcher,
                        then exit.
  --verbose             Output progress information
  --version

See the section "Output" below for some more examples.
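The arithmetic behind the -s option can be checked with a short sketch. The number of distinct sequence pairs among n sequences is n(n-1)/2, so 100 sequences give 4950 pairs and 101 sequences give 5050, which is why a bound of 5000 starts to matter just above 100 sequences. How modelmatcher selects the pairs it keeps is not described here; the function names below are assumptions made for this illustration.

```python
def n_sequence_pairs(n_seqs):
    """Number of distinct unordered sequence pairs among n_seqs sequences."""
    return n_seqs * (n_seqs - 1) // 2


def pairs_to_use(n_seqs, sample_size):
    """Pairs considered when an upper bound of sample_size is applied."""
    return min(n_sequence_pairs(n_seqs), sample_size)


for n in (66, 100, 101, 500):
    print(n, n_sequence_pairs(n), pairs_to_use(n, 5000))
# 66 2145 2145
# 100 4950 4950
# 101 5050 5000
# 500 124750 5000
```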

Input formats

Input format is detected automatically from the following list, but can also be requested specifically.

FASTA
Phylip
Nexus
Clustal
Stockholm
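To give an idea of what automatic detection can involve, the sketch below guesses a format from the first non-empty line of a file, using the conventional header of each format. It is an illustration only and not modelmatcher's actual detection logic; the function name `guess_alignment_format` is an assumption.

```python
def guess_alignment_format(text):
    """Guess the alignment format from the first non-empty line.

    Returns one of "fasta", "nexus", "stockholm", "clustal", "phylip",
    or None if the line matches none of the conventional headers.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"
        if stripped.upper().startswith("#NEXUS"):
            return "nexus"
        if stripped.startswith("# STOCKHOLM"):
            return "stockholm"
        if stripped.upper().startswith("CLUSTAL"):
            return "clustal"
        # PHYLIP starts with two integers: sequence count and alignment length.
        fields = stripped.split()
        if len(fields) == 2 and all(f.isdigit() for f in fields):
            return "phylip"
        return None
    return None


print(guess_alignment_format(">seq1\nMKVLA\n"))
print(guess_alignment_format(" 3 6\nseq1  MKV-LA\n"))
```

A heuristic like this can fail on unusual files, which is when specifying the format explicitly with -f is useful.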

Output

The default output is a simple text table that ranks the possible models in preference order. JSON output is also available for easy parsing by other scripts. For example, running modelmatcher on inputfile.fasta as in the Usage section may yield a table like:

WAG             7.972
VT              8.238
BLOSUM62        8.478
JTT             8.864
JTT-DCMUT       8.917
LG              9.984
DCMUT          10.467
Dayhoff        10.495
FLU            11.211
HIVb           12.853
RtREV          14.048
cpREV          14.186
HIVw           17.338
MtZoa          18.476
MtMAM          21.453
mtArt          21.741
MtREV          22.059

Each model is given with its modelmatcher score.

In JSON format, the same analysis looks like:

$ modelmatcher  -of json  inputfile.fasta
{"n_observations": 863692, "infile": "inputfile.fasta", "n_seqs": 66, "model_ranking": [["WAG", 7.972410383355675], ["VT", 8.238362164888876], ["BLOSUM62", 8.478000205922985], ["JTT", 8.863578165338444], ["JTT-DCMUT", 8.917496451351846], ["LG", 9.983874357603963], ["DCMUT", 10.466872509785343], ["Dayhoff", 10.49522598111376], ["FLU", 11.21137482805874], ["HIVb", 12.852877789672046], ["RtREV", 14.047539707772572], ["cpREV", 14.18648653904322], ["HIVw", 17.338193829402], ["MtZoa", 18.475515151949153], ["MtMAM", 21.452528293860837], ["mtArt", 21.740741039472418], ["MtREV", 22.058622800684176]]}
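As a sketch of how another script might consume the JSON output, the example below parses a shortened copy of the output shown above and picks the top-ranked model. The keys `model_ranking`, `n_seqs`, and `infile` are taken from that sample output; the variable names are otherwise arbitrary.

```python
import json

# Shortened copy of the sample output above (first three models only).
output = (
    '{"n_observations": 863692, "infile": "inputfile.fasta", "n_seqs": 66, '
    '"model_ranking": [["WAG", 7.972410383355675], '
    '["VT", 8.238362164888876], ["BLOSUM62", 8.478000205922985]]}'
)

report = json.loads(output)

# The ranking is a list of [model name, score] pairs, best model first.
best_model, best_score = report["model_ranking"][0]
print(f"{report['infile']}: {report['n_seqs']} sequences, best model {best_model}")

# Turn the ranking into a dictionary for lookup by model name.
scores = dict(report["model_ranking"])
print(round(scores["VT"], 3))
```

In practice the string would come from running modelmatcher, for example by capturing its standard output, rather than being embedded in the script.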

Install

Recommended installation is:

pip install --upgrade pip
pip install modelmatcher
