Mark Wilkinson,
of the Department of Zoology, The Natural History Museum, London, U.K.
(marw (at) nhm.ac.uk) has
produced REDCON, version 3.0, a program to implement his
method of
reduced consensus trees. These find a tree with possibly fewer species that
satisfies a strict or a majority rule consensus criterion. REDCON reads
trees in PAUP* format. It is a DOS
executable, and is available at
his software Web site
at http://www.nhm.ac.uk/zoology/external/mwphylogeny.htm.
Mark Wilkinson,
of the Department of Zoology, The Natural History Museum, London, U.K.
(marw (at) nhm.ac.uk) has
produced TAXEQ3, a program to carry out Safe Taxonomic
Reduction, which means dropping some species to get a set whose phylogenetic
relationships are less ambiguous. The method is described in a paper:
Wilkinson, M. 1995. Coping with abundant missing entries in phylogenetic
inference using parsimony. Systematic Biology 44: 501-514.
TAXEQ3 is distributed
as a DOS executable with documentation and a sample data set from
his software Web site
at http://www.nhm.ac.uk/zoology/external/mwphylogeny.htm.
Lars Jermiin
of the School of Biological Sciences
of the University of Sydney, Australia (lars.jermiin (at) usyd.edu.au)
and Olena Anpilogova
have produced TreeCons version 1.0. It
generates a weighted consensus tree from trees obtained by maximum likelihood analysis,
generates relative likelihood support on edges in this and other user-specified trees,
and does the Kishino-Hasegawa test with any level of significance. It
reads output files and tree files produced by some of the programs in
PHYLIP,
MOLPHY,
fastDNAml
and TrExMl.
The output file from TreeCons is in a format that can then be fed back into
PHYLIP's program Consense. A number of weighting schemes to compute tree
weights from their likelihoods are allowed.
The weighting schemes and the underlying theory are described in a paper:
Jermiin L. S., G. J. Olsen, K. L. Mengersen, and S. Easteal. 1997.
Majority-rule consensus of
phylogenetic trees obtained by maximum likelihood analysis.
Molecular Biology and Evolution 14: 1296-1302.
TreeCons is distributed as C source code. It is available, with documentation
and sample input and output, from
its web site at
http://jcsmr.anu.edu.au/dmm/humgen/lars/treeconssub.htm.
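The idea of a likelihood-weighted consensus can be sketched as follows. This is a minimal illustration, not TreeCons's own code: the weighting scheme (weights proportional to relative likelihood) is only one plausible choice and is an assumption here, as are the function names, the representation of each tree as a set of clades, and the log-likelihood values.

```python
import math
from collections import defaultdict


def likelihood_weights(log_likelihoods):
    """Convert log-likelihoods to relative-likelihood weights that sum to 1."""
    best = max(log_likelihoods)
    relative = [math.exp(lnl - best) for lnl in log_likelihoods]
    total = sum(relative)
    return [r / total for r in relative]


def weighted_majority_clades(trees, weights, threshold=0.5):
    """Return the clades whose summed tree weight exceeds the threshold."""
    support = defaultdict(float)
    for clades, weight in zip(trees, weights):
        for clade in clades:
            support[clade] += weight
    return {clade: s for clade, s in support.items() if s > threshold}


# Hypothetical input: three trees, each given as a set of clades.
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AC"), frozenset("ACD")},
]
weights = likelihood_weights([-1000.0, -1000.5, -1003.0])
consensus = weighted_majority_clades(trees, weights)
```

With these made-up likelihoods the best tree carries most of the weight, so both of its clades pass the 50% threshold even though one of them appears in only one of the three trees.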
Joseph Thorley
of Treleaver, Mithian Downs, St. Agnes,
Cornwall TR5 0PY, U.K. (joethorley (at) bigfoot.com)
and Rod Page of the University of Glasgow have
written RadCon, version 1.1.6, a program to compute consensus
trees, supertrees, measures of the shape of trees, and to rearrange trees.
It can compute strict, semi-strict, Adams, and
majority-rule consensus trees, Reduced consensus trees, MRP supertrees,
Cladistic Information Content, Leaf Stability, and Double Decay
Analysis. It can also measure the shape and
resolution of trees, prune and regraft leaves, add outgroups, and reroot
trees. It is described in a paper:
Thorley, J. L. and R. D. M. Page. 2000. RadCon: Phylogenetic comparison and
consensus. Bioinformatics 16: 486-487.
RadCon is a MacOS executable for MacOS 7.5 or later. RadCon is available at
its web site at
http://web.onetel.com/~joethorley/radcon/radcon.html
and its manual can also be downloaded or viewed from there.
Jeet Sukumaran
of the Division of Herpetology of the University of Kansas Natural History Museum and Biodiversity
Research Center at the University of Kansas, Lawrence, Kansas
(jeetsukumaran (at) frogweb.org)
distributes bootscore
version 3.11, a program to compute bootstrap support values from bootstrap replicate
trees and place them on a consensus tree. A platform-independent Python
script maps non-parametric bootstrap support for clades onto a
phylogenetic tree. It outputs a NEXUS/Newick treefile with the topology of the
given tree and with clade support indicated by node labels or branch lengths.
In its default bipartition-counting mode, it identifies all distinct
bipartitions in the tree to be evaluated, and then scans through a file of
bootstrap replicates to identify the percentage or proportional frequency of
occurrence of each of those bipartitions among the bootstrap trees. It can
also operate in a clade-counting mode, in which it identifies all distinct
monophyletic groups in the tree being assessed, and then counts the number of
bootstrap trees in which that particular monophyletic group is recovered.
It is available as a Python script. It can be downloaded from
its web site
at http://bootscore.sourceforge.net/
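The bipartition-counting mode described above can be sketched in a few lines. This is an illustration only and is not taken from bootscore: the function names are hypothetical, and trees are assumed to be already parsed into lists of bipartitions, each given by the taxa on one side.

```python
def canonical_split(side, all_taxa):
    """Represent a bipartition by the side that lacks the smallest taxon name."""
    side = frozenset(side)
    reference = min(all_taxa)
    return frozenset(all_taxa) - side if reference in side else side


def bipartition_support(target_splits, replicate_splits, all_taxa):
    """Percentage of replicate trees that contain each target bipartition."""
    targets = [canonical_split(s, all_taxa) for s in target_splits]
    replicates = [
        {canonical_split(s, all_taxa) for s in replicate}
        for replicate in replicate_splits
    ]
    return {
        t: 100.0 * sum(t in rep for rep in replicates) / len(replicates)
        for t in targets
    }


taxa = "ABCDE"
target = ["AB", "DE"]                      # tree being evaluated
replicates = [["AB", "DE"], ["CDE", "CD"],  # "CDE" is the same split as "AB"
              ["AC", "DE"], ["AB", "CD"]]
support = bipartition_support(target, replicates, taxa)
```

Canonicalizing each split first means that a bipartition is recognized regardless of which of its two sides a tree file happens to list.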
Vladimir Makarenkov
(makarenv (at) magellan.umontreal.ca) of the
Département d'Informatique of the
Université du Québec à Montréal and
the Département de Sciences Biologiques of the
Université de Montréal, Québec has written
a program to compute the
Robinson and Foulds topological distance. This is
a distance measure between trees, counting the numbers of branches on the
two trees that have no counterpart on the other tree. The algorithm used
is described in a paper:
Makarenkov, V., and Leclerc, B. 2000. An optimal way to compare additive trees
using circular orders. Journal of Computational Biology 7:
731-744. The program is available as C source code, as a DOS executable,
and as a Mac OS executable from
its web site
at http://www.bio.umontreal.ca/Casgrain/en/labo/robinson_foulds.html.
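The definition of the distance given above (branches in either tree with no counterpart in the other) can be illustrated with a short sketch. It is a direct set comparison, not the circular-order algorithm of Makarenkov and Leclerc, and it assumes both trees are on the same taxa and are supplied as lists of bipartitions; the function names are hypothetical.

```python
def normalize(splits, taxa):
    """Canonicalize bipartitions and keep only those from internal branches."""
    taxa = frozenset(taxa)
    reference = min(taxa)
    result = set()
    for side in splits:
        side = frozenset(side)
        if reference in side:
            side = taxa - side          # always store the side without the reference taxon
        if 1 < len(side) < len(taxa) - 1:
            result.add(side)
    return result


def robinson_foulds(splits_a, splits_b, taxa):
    """Number of internal branches found in exactly one of the two trees."""
    return len(normalize(splits_a, taxa) ^ normalize(splits_b, taxa))


taxa = "ABCDE"
tree_1 = ["AB", "DE"]   # ((A,B),C,(D,E))
tree_2 = ["AC", "DE"]   # ((A,C),B,(D,E))
distance = robinson_foulds(tree_1, tree_2, taxa)
```

Here the two trees share the branch separating D and E from the rest, and each has one branch the other lacks, so the distance is 2.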
Pere Puigbò Avalos
of the Evolutionary Genomics Group, Biochemistry and Biotechnology Department
of Rovira i Virgili University, Tarragona, Spain
(ppuigbo (at) urv.cat)
has produced TopD/fMts
(TOPological Distance / From Multiple To Single),
version 1.0, a program to calculate distances between trees. TopD calculates distances between trees by the following methods: split distance, nodal distance, disagreement, taxa in common, and quartets distance. fMts is used with it to subsample gene families to get trees with one copy per species.
It is described in the paper:
Puigbò P., S. Garcia-Vallvé and J. O. McInerney. 2007. TOPD/FMTS: a new software to compare phylogenetic trees. Bioinformatics, published online on April 25, 2007.
It is available as a Perl script. It can be downloaded from
its web site
at http://genomes.urv.es/topd/
Chris Creevey
(chris.creevey (at) may.ie) of the Bioinformatics
and Pharmacogenomics Laboratory at National University of Ireland Maynooth
has written Clann (the Irish word for "family") version
2.0.3, a program to compute supertrees. It implements most of the major
supertree methods, including matrix representation by parsimony (MRP),
methods involving distance matrices, quartets methods, and splits
methods, and includes a bootstrap method that samples from among the
input trees. Clann is described in a paper: Creevey, C. J. and J. O. McInerney.
2005. Clann: investigating phylogenetic information through supertree
analyses. Bioinformatics 21: 390-392. It is available
as Windows, Mac OS X, and Linux executables from its web site at http://bioinf.may.ie/software/clann/.
Nicolas Salamin
at the Department of Ecology and
Evolution of the University of Lausanne, Switzerland has written SuperTree, a program to take a set of input trees and build a matrix
representation of them that can be used to compute a supertree using the
MRP (matrix representation by parsimony) method. The program does not do
the parsimony reconstruction from this matrix itself. The options available
are described in a paper: Salamin, N., T. R. Hodkinson, and V. Savolainen.
2002. Building supertrees: An empirical assessment using the grass family
(Poaceae). Systematic Biology 51: 136-150.
SuperTree is a Java runtime executable which will work on
Linux, Windows, and Mac OS X systems, as long as they have the Sun Java
Runtime JRT system version 1.3 or greater. The Java source code can be
obtained from Salamin, if desired, by contacting him by email.
The program and its distribution are briefly described at
Salamin's software web site at
http://www2.unil.ch/phylo/software.html
The program can be obtained from Salamin by emailing him (wwwphylo (at) unil.ch).
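A matrix representation of a set of input trees can be sketched as follows, using the standard coding in which each clade of each source tree becomes one binary character. This is an illustration under stated assumptions, not SuperTree's code: the function name and input layout are hypothetical, and none of the program's options are modelled.

```python
def mrp_matrix(source_trees):
    """Build an MRP matrix from (taxa_in_tree, clades) pairs.

    Each clade gives one column: '1' for taxa inside the clade, '0' for
    taxa in that source tree but outside the clade, and '?' for taxa
    that the source tree does not contain.
    """
    all_taxa = sorted(set().union(*(taxa for taxa, _ in source_trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for taxa, clades in source_trees:
        for clade in clades:
            for taxon in all_taxa:
                if taxon not in taxa:
                    rows[taxon].append("?")
                elif taxon in clade:
                    rows[taxon].append("1")
                else:
                    rows[taxon].append("0")
    return {taxon: "".join(states) for taxon, states in rows.items()}


# Two small source trees with partially overlapping taxa.
source_trees = [
    ({"A", "B", "C"}, [{"A", "B"}]),
    ({"B", "C", "D"}, [{"C", "D"}]),
]
matrix = mrp_matrix(source_trees)
```

The resulting matrix would then be passed to a separate parsimony program, since the supertree search itself is a parsimony analysis of these characters.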
Olaf Bininda-Emonds at
Technische Universität München, Germany (Olaf.Bininda (at) tierzucht.tum.de) has produced a set of Perl scripts which, taken together,
are a Supertree package which can make and examine supertrees
using the program PAUP*. Although they
are each downloadable
separately, I will consider them here to be a single entity. They include,
in rough order of the steps carried out:
- synonoTree.pl, which standardizes the taxon labels in a set of
source trees according to a reference taxonomy.
- treePruner.pl, which takes a set of NEXUS-formatted trees and
prunes them to the set of tips that they all have in common.
- SuperMRP.pl, which converts a NEXUS-formatted treefile into a NEXUS-
formatted MRP (Matrix Representation by Parsimony) data set ready for
analysis by PAUP*.
- QualiTree.pl, which calculates support for clades in a supertree
relative to the clades in the source trees that made it up.
- ReverseMRP.pl, which does the reverse, converting a NEXUS MRP data set into a treefile of trees.
Additional Perl scripts are also present to help with labeling the
resultant supertrees.
They are described in the papers:
- Bininda-Emonds, O. R. P. 2003. Novel versus unsupported clades:
assessing the qualitative support for clades in MRP supertrees.
Systematic Biology 52: 839-848.
- Bininda-Emonds, O.R.P., K.E. Jones, S.A. Price, M. Cardillo, R.
Grenyer, and A. Purvis. 2004. Garbage in, garbage out: data issues in
supertree construction. Pp. 267-280 in Phylogenetic Supertrees:
Combining Information to Reveal the Tree of Life, ed. O. R. P.
Bininda-Emonds. Computational Biology, volume 4. Kluwer Academic
Publishers, Dordrecht, the Netherlands.
- Bininda-Emonds, O.R.P., R.M.D. Beck, and A. Purvis. 2005. Getting to
the roots of matrix representation. Systematic Biology 54:
668-672.
As they are Perl scripts, they can be run on any system on which Perl has been
made available. They can be downloaded (individually) from
Bininda-Emonds's software web page
at http://141.40.125.5:8080/WWW/Homepages/Bininda-Emonds/ProgramsMain.html#Supertrees
Rod Page
of the Environmental and Evolutionary Biology
of the Institute of Biomedical and Life Sciences, University of Glasgow, U.K.
(r.page (at) bio.gla.ac.uk)
has written Supertree
version 0.3, an experimental command line program for constructing supertrees. It implements Semple and Steel's algorithm MinCutSupertree (original and a modified version due to Page), as well as MRP coding. It can also compute cluster graphs.
It is described in the paper:
Page, R. D. M. 2002. Modified MinCut supertrees. pp. 537-551 in Algorithms in Bioinformatics: Proceedings of the Second International Workshop, WABI 2002, Rome, Italy, September 17-21, 2002. Springer-Verlag, Heidelberg.
It is available as C source code, Mac OS X PowerMac executables and Mac OS 9 executables. To compile the source code, one needs a copy of the GTL
Graph Template Library, which must be installed before compiling.
Supertree can be downloaded from
its web site
at http://darwin.zoology.gla.ac.uk/%7Erpage/supertree/
Duhong Chen
of the Computational Biology Laboratory of the Department of Computer Science
of Iowa State University
(duhong (at) iastate.edu)
has written HeuristicMRF2 (Matrix Representation with
Flipping version 2),
a program to construct supertrees by the MRF method.
Matrix representation with flipping (MRF) starts with a Matrix Representation
by Parsimony matrix. The binary MRP matrix from a rooted input tree may be
transformed into a subset of the columns of a rooted supertree by changing some
of the 0's to 1's and 1's to 0's. Each change in the character state is a
flip, and the minimum number of flips needed to transform the input tree into a
supertree is the flip distance. The MRF heuristic seeks the rooted supertree(s)
that minimizes the total flip distance from all input trees. It searches for
rooted trees, not unrooted trees. It is described in the paper:
Chen, D., L. Diao, O. Eulenstein, D. Fernández-Baca, and M. J.
Sanderson. 2003. Flipping: A supertree construction method. Pp. 135-160 in
Bioconsensus, ed. M. F. Janowitz, F.-J. Lapointe, F. R. McMorris,
B. Mirkin, and F. S. Roberts. DIMACS series in discrete mathematics and
theoretical computer science, Volume 61. American Mathematical Society,
Providence, Rhode Island.
It is available as C++ source code. It can be downloaded from
its web site
at http://genome.cs.iastate.edu/CBL/download/
Raul Piaggio
of the Computational Biology Laboratory, Department of Computer Science
at Iowa State University, Ames, Iowa
(rpiaggio (at) iastate.edu)
has written Quartet Suite
version 1.0, a set of programs for computing supertrees and also distances
between trees from quartets. Quartet Suite is a set of four programs that
take input trees, break them down into the set of quartets implied by each of
them, and construct a supertree based on these quartets. They can also
compute a distance between trees from the sets of quartets they imply.
The methods are described in the paper:
Willson, S.J. 1999. Building phylogenetic trees from quartets by using local inconsistency measures. Molecular Biology and Evolution 16: 685-693.
It is available as C++ source code, Windows executables and Powermac Mac OS X executables. It can be downloaded from
its web site
at http://genome.cs.iastate.edu/CBL/download/
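Breaking trees down into quartets and comparing them can be sketched as below. This is a brute-force illustration, not Quartet Suite's implementation, and its distance may be defined differently from the one the programs report. The function names are hypothetical, and each unrooted tree is assumed to be given as a list of bipartitions on a shared set of taxa.

```python
from itertools import combinations


def quartet_topology(quartet, splits):
    """Return the pairing of four taxa implied by a tree, or None if unresolved."""
    for side in splits:
        inside = [t for t in quartet if t in side]
        if len(inside) == 2:
            outside = [t for t in quartet if t not in side]
            return frozenset([frozenset(inside), frozenset(outside)])
    return None


def quartet_distance(splits_a, splits_b, taxa):
    """Number of four-taxon subsets on which the two trees disagree."""
    return sum(
        quartet_topology(q, splits_a) != quartet_topology(q, splits_b)
        for q in combinations(sorted(taxa), 4)
    )


taxa = "ABCDE"
tree_1 = ["AB", "DE"]   # ((A,B),C,(D,E))
tree_2 = ["AC", "DE"]   # ((A,C),B,(D,E))
distance = quartet_distance(tree_1, tree_2, taxa)
```

In this example the trees disagree on the quartets ABCD and ABCE and agree on the other three, so the distance is 2. Enumerating all quartets grows as the fourth power of the number of taxa, so this approach is only practical for small trees.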
Duhong Chen, Oliver Eulenstein, and David Fernández-Baca
of the Computational Biology Laboratory of the Department of Computer Science
at Iowa State University, Ames, Iowa
(duhong (at) iastate.edu)
have released Rainbow
version beta 1.2, a toolbox for phylogenetic supertree construction and
analysis. Rainbow provides a user-friendly environment in which scientists
can utilize tools for building and analyzing supertrees. Rainbow provides a
graphic user interface (GUI) to construct supertrees using several different
methods. Currently these include matrix representation with flipping (MRF),
matrix representation with parsimony (MRP), and the modified Mincut algorithm
(MMC). Rainbow also provides tools to analyze the quality of the inferred
supertrees. The methods are described in the paper:
Chen, D., L. Diao, O. Eulenstein, D. Fernández-Baca, and M. J. Sanderson. 2003. Flipping: A supertree construction method. Pages 135-160 in Bioconsensus, ed. M. F. Janowitz, F.-J. Lapointe, F. R. McMorris, B. Mirkin, and F. S. Roberts. Volume 61 in DIMACS series in discrete mathematics and theoretical computer science, American Mathematical Society, Providence, Rhode Island.
It is available as C++ source code, Windows executables, Linux executables
and Powermac Mac OS X executables. It can be downloaded from
its web site
at http://genome.cs.iastate.edu/CBL/download/
Vincent Ranwez, Vincent Berry, Alexis Criscuolo, Pierre-Henri Fabre,
Sylvain Guillemot, Celine Scornavacca and Emmanuel Douzery
of the Institut des Sciences de l'Évolution and the LIRMM
at the Université Montpellier II, France
(vberry (at) lirmm.fr)
have produced PhySIC
(Phylogenetic Supertrees with Induction and non-Contradiction),
a supertree program that uses an algorithm they developed. It carries out construction of supertrees using the non-contradiction property (PC) and the induction property (PI). The former requires that the supertree does not contain relationships that contradict one or a combination of the source topologies, while the latter requires that all topological information contained in the supertree is present in a source tree or collectively induced by several source trees. The program also can collapse branches in the input trees that have bootstrap values that are smaller than a threshold set by the user.
It is described in the paper:
Ranwez, V., V. Berry, A. Criscuolo, P.H. Fabre, S. Guillemot, C. Scornavacca
and E.J.P. Douzery. 2007. PhySIC: a veto supertree method with desirable
properties. Systematic Biology 56 (5): 798-817.
It is available as Linux executables and Intel Mac OS X executables. It can be downloaded from
its web site
at http://atgc.lirmm.fr/SuperTree/PhySIC/
Ahmed Moustafa
of the Computational Genetics program
at the University of Iowa, Iowa City, Iowa
(ahmed (at) users.sourceforge.net)
has written PhyloSort
version 1.3, a Java tool for sorting phylogenies by searching for user-specified subtrees that contain a particular monophyletic group. PhyloSort is for
- Searching for monophyletic relationship among groups of taxa.
- Filtering by bootstrap support values associated with the monophyletic clades.
- Filtering by tree complexity (number of taxa in a tree).
- Filtering by family complexity (number of genes per taxon in a tree).
- Clustering trees (genes) into tree clusters (gene families).
- Using a reference tree to taxonomically group (tree-format).
It is available as Java source code and Java executables. It can be downloaded from
its web site
at http://phylosort.sourceforge.net/
Ross Crozier
(Ross.Crozier (at) jcu.edu.au)
of the School of Tropical Biology, James Cook University, Townsville, Australia
and Paul-Michael Agapow
of the
Department of Biology of Imperial College at Silwood Park, U.K.
(p.agapow (at) ucl.ac.uk)
have written CONSERVE, version 3.1.2, a 68k Macintosh program
to use phylogenetic information to calculate biodiversity and test the
feasibility of conservation schemes. It measures the distinctiveness of
species using genetic distances and can also test whether particular assemblages
of populations preserve statistically significantly more biodiversity
than other assemblages. Biodiversity is determined using GD (probability
of more than one allele) or PD (length of evolutionary history) methods,
from data in the form of unrooted trees produced in standard treefile format.
It is available as a Macintosh executable for the 680x0 processor
(and thus can run in emulation mode in Mac OS) in a self-extracting archive
from
its web site at http://www.bio.ic.ac.uk/evolve/software/conserve/index.html.
Georg Weiller, of the Genomic Interactions Group at the Research
School of Biological Sciences of the Australian National University, Canberra
(georg.weiller (at) rsbs.anu.edu.au) has released TreeDis
version 2.0. TreeDis finds the patristic distances (total length of branches)
between all pairs of taxa in a phylogeny. It takes as input the tree file in
Newick standard form or in the format for NJTREE.
It is distributed as a DOS executable (a C++ source code version can also
be obtained from Weiller). It is available from
its web site
at http://www.rsbs.anu.edu.au/Products&Services/BiotechnologyTransferUnit/tredis.asp.
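Patristic distance can be illustrated with a small sketch that sums branch lengths along the path between two tips. It is not TreeDis's code: the function names are hypothetical, and the tree is assumed to be already parsed into a map from each node to its parent and the length of the branch leading to that parent.

```python
from itertools import combinations


def path_to_root(node, parent):
    """Map each ancestor of a node (and the node itself) to its distance from the node."""
    distances = {node: 0.0}
    total = 0.0
    while node in parent:
        ancestor, length = parent[node]
        total += length
        distances[ancestor] = total
        node = ancestor
    return distances


def patristic_distances(tips, parent):
    """Sum of branch lengths on the path between every pair of tips."""
    paths = {tip: path_to_root(tip, parent) for tip in tips}
    result = {}
    for a, b in combinations(tips, 2):
        # With non-negative branch lengths, the minimum over shared
        # ancestors is reached at the most recent common ancestor.
        result[(a, b)] = min(
            paths[a][n] + paths[b][n] for n in paths[a] if n in paths[b]
        )
    return result


# The tree ((A:1,B:2)X:0.5,C:3)R; as a child -> (parent, branch length) map.
parent = {"A": ("X", 1.0), "B": ("X", 2.0), "X": ("R", 0.5), "C": ("R", 3.0)}
distances = patristic_distances(["A", "B", "C"], parent)
```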
Nicolas Bortolussi, Eric Durand, Michael Blum, and Olivier François
of the Institut d'Informatique et Mathématiques Appliquées
in Grenoble, France
(nicolas.bortolussi (at) imag.fr)
have written apTreeshape
(analyses of phylogenetic Treeshape),
version 1.3.1, an R package for simulation and analysis of phylogenetic tree
topologies using statistical indices. apTreeshape computes a variety of
statistics and tests on tree shape. It is a companion library of the
APE
package. It provides additional functions for reading, plotting, and manipulating
phylogenetic trees. It also offers convenient web-access to public databases,
and enables testing null models of macroevolution using corrected test
statistics. Trees of class "phylo" (from the APE package) can be converted easily.
It is available as an R package, Windows executables and Powermac Mac OS X executables. It can be downloaded from
its web site
at http://cran.r-project.org/src/contrib/Descriptions/apTreeshape.html
Andrew Rambaut
of the Institute of Evolutionary Biology
of the University of Edinburgh, Edinburgh, Scotland
(a.rambaut (at) ed.ac.uk)
and Alexei Drummond of
the University of Auckland, New Zealand
have written TreeStat
(Tree Statistics),
version 1.1, a program to compute summary statistics for sets of trees. TreeStat is an application that can process a set of trees in a PHYLIP or NEXUS format tree file and calculate a number of summary statistics for each. These are saved in a tab-delimited file for analysis in Tracer or statistics packages. A range of summary statistics is included:
- Tree-Balance Statistics. Colless's, B1, N_bar, cherry count.
- Tree shape. Tree height, node heights, treeness, Pybus' gamma.
- Population genetic. Fu and Li's D, external/internal ratio.
- Other. Tree length, root-to-tip lengths.
It is available as Java executables, Windows executables and Mac OS X universal executables. It can be downloaded from
its web site
at http://tree.bio.ed.ac.uk/software/treestat/
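Two of the tree-balance statistics listed above, Colless's index and the cherry count, can be sketched for a rooted binary tree as follows. This is an illustration rather than TreeStat's implementation: the function name and the nested-tuple tree representation are assumptions, and the Colless value is left unnormalized, which may differ from what TreeStat reports.

```python
def tree_shape(tree):
    """Return (tip count, Colless index, cherry count) for a rooted binary tree.

    A tree is either a tip name (a string) or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        return 1, 0, 0
    left, right = tree
    n_left, colless_left, cherries_left = tree_shape(left)
    n_right, colless_right, cherries_right = tree_shape(right)
    is_cherry = int(n_left == 1 and n_right == 1)   # a node whose two children are tips
    return (
        n_left + n_right,
        colless_left + colless_right + abs(n_left - n_right),
        cherries_left + cherries_right + is_cherry,
    )


balanced = (("A", "B"), ("C", "D"))
caterpillar = ((("A", "B"), "C"), "D")
balanced_stats = tree_shape(balanced)
caterpillar_stats = tree_shape(caterpillar)
```

The balanced four-tip tree has a Colless index of 0 and two cherries, while the fully unbalanced one has an index of 3 and a single cherry.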
Andrew Purvis
of the Department of Biology, Imperial College, Silwood Park, U.K.
(a.purvis (at) ic.ac.uk)
and Andrew Rambaut
(andrew.rambaut (at) zoo.ox.ac.uk) of the Department of Zoology,
University of Oxford, England, have written CAIC (Comparative Analysis of
Independent Contrasts), version 2.6.9. It is a Macintosh Mac OS program that
carries out the
contrasts method but with some modifications by others to
cope with lack of resolution of the phylogeny. It will run on
Macintosh Mac OS, and is available free from
CAIC's Web page
http://www.bio.ic.ac.uk/evolve/software/caic/index.html.
It is described in the paper by A. Purvis and A. Rambaut (1995) Comparative analysis by independent contrasts (CAIC): an
Apple Macintosh application for analysing comparative data. Computer Applications in the Biosciences (CABIOS) 11: 247-251.
Emília Martins
(emartins (at) indiana.edu), of the
Department of Biology of the University of Indiana, Bloomington, Indiana,
has released
version 4.6 of COMPARE, a package of programs for comparative
methods analysis. COMPARE includes various programs for conducting
statistical analyses of comparative data in a phylogenetic
context. At the moment, it includes programs to compute
independent contrasts, do spatial autocorrelation analyses, sum of
squares parsimony, generate
random data, trees and/or branch lengths, and various other
things.
COMPARE is written in Java and is available both as standalone Java
(including source code)
and also as a Compare server. It requires a Java runtime
environment. COMPARE is available from
its web site at
http://compare.bio.indiana.edu/. Earlier Windows and
Sun Solaris executables and C source code of COMPARE 3.1 are available from the
COMPARE 3.1 web site
at http://compare.bio.indiana.edu/indexV3.html.
Emília Martins
(emartins (at) indiana.edu), of the
Department of Biology of the University of Indiana, Bloomington, Indiana,
has written CMAP, the
Comparative Method Analysis Package, for comparative methods analysis.
This package was developed when she and Ted Garland were conducting the
simulation study described in the paper:
Martins, E. P. and T. Garland, Jr. 1991. Phylogenetic analyses of the correlated evolution of continuous characters: a simulation study. Evolution
45: 534-557.
It can be used to estimate the correlation between two continuous characters
measured in different species while taking phylogenetic information into account. Methods for doing so include several versions of Felsenstein's (1985) independent contrasts, and the sum-of-squared-changes parsimony algorithm.
The programs in CMAP are described by Martins as "slow" and "unfriendly".
The executables are available only for DOS machines. She is no longer
developing this package, and is now concentrating her efforts on
her other package COMPARE, which will be able to
do everything that CMAP can. CMAP is available
from its download area at http://compare.bio.indiana.edu/ftp/.
Patrik Lindenfors (Patrik.Lindenfors (at) zoologi.su.se), of the Department of Zoology, Stockholm University, Sweden, and
the Department of Biology, University of Virginia, Charlottesville (Patrik.Lindenfors (at) virginia.edu), has written
CoSta version 1.03, a DOS program which carries out the Contingent
States Test for the correlation of changes in two characters along a tree, which is described in the paper: Sillén-Tullberg, B. 1993.
The effect of biased inclusion of taxa on the correlation between discrete
characters in phylogenetic trees. Evolution 47: 1182-1191.
The program reads MacClade data files, and also text files saved from
MacClade. The program can be fetched at
its
Web site at http://www.zoologi.su.se/research/Lindenfors/CoSta.html.
Theodore Garland, Jr., of the Department of Biology of the University
of California, Riverside (tgarland (at) ucr.edu)
and his colleagues (Jason A. Jones, Allan W. Dickerman, Peter E.
Midford, and Ramon Diaz-Uriarte)
have developed PDAP version 6.0,
Phenotypic Diversity Analysis Programs, a series of DOS programs to
perform various comparative analyses.
At present, the following phylogenetically based statistical methods are
included: independent contrasts, squared-change parsimony reconstructions of
ancestral states and estimation of evolutionary correlations, and phylogenetic
analysis of covariance via computer-simulated (Monte Carlo) null distributions.
PDTREE can also read, write, and edit trees.
PDAP distribution is described in
a web page
at http://www.biology.ucr.edu/people/faculty/Garland/PDAP.html.
The original published description of PDAP is the paper:
Garland, T., Jr., A. W. Dickerman, C. M. Janis, and J. A. Jones. 1993.
Phylogenetic analysis of covariance by computer simulation. Systematic
Biology 42:265-292.
The methods used are described in a number of recent papers by these
authors, including:
- Purvis, A., and T. Garland, Jr.
1993. Polytomies in comparative analyses of continuous characters.
Systematic Biology 42: 569-575.
- Diaz-Uriarte, R., and T. Garland, Jr. 1996. Testing
hypotheses of correlated evolution using phylogenetically independent
contrasts: sensitivity to deviations from Brownian motion.
Systematic Biology 45: 27-47.
- Garland, T., Jr, P. E. Midford, and A. R. Ives. 1999. An introduction to
phylogenetically based statistical methods, with a new method for confidence
intervals on ancestral states. American Zoologist 39:
374-388.
- Garland, T., Jr., and A. R. Ives. 2000. Using the past to predict the present:
Confidence intervals for regression equations in phylogenetic comparative
methods. American Naturalist 155:346-364.
PDAP is distributed by email of a self-extracting
executable file, obtainable for free (contact Garland by e-mail at the above
address). Alternatively, a DOS disk can be mailed.
Liam Revell
of the Department of Organismic and Evolutionary Biology
of Harvard University
(lrevell (at) fas.harvard.edu)
has released IDC (IndepenDent Contrasts program),
version 1, a program for the calculation of
phylogenetically independent contrasts. This program calculates
contrasts for multiple trees, multiple datasets, or both.
It also returns a VCV matrix and the correlation matrix of the
independent contrasts. It is available as C source code and Windows
executables. It can be downloaded from
its web site
at http://anolis.oeb.harvard.edu/~liam/programs/
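The calculation of standardized independent contrasts for one tree and one continuous character can be sketched as below, following the usual pruning procedure of Felsenstein (1985). It is not IDC's code and omits the VCV and correlation matrices: the function name, the tuple layout `(left, left_length, right, right_length)` and the trait values are all assumptions made for illustration.

```python
import math


def contrasts(node, traits):
    """Return (node value, added branch length, standardized contrasts).

    A node is a tip name (string) or a tuple
    (left, left_branch_length, right, right_branch_length).
    """
    if isinstance(node, str):
        return traits[node], 0.0, []
    left, v1, right, v2 = node
    x1, extra1, found1 = contrasts(left, traits)
    x2, extra2, found2 = contrasts(right, traits)
    v1 += extra1                      # lengthen branches below estimated nodes
    v2 += extra2
    contrast = (x1 - x2) / math.sqrt(v1 + v2)
    value = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)   # weighted average of the two children
    extra = v1 * v2 / (v1 + v2)
    return value, extra, found1 + found2 + [contrast]


# Hypothetical tree ((A:1,B:1):1,C:2) with made-up trait values.
tree = (("A", 1.0, "B", 1.0), 1.0, "C", 2.0)
traits = {"A": 1.0, "B": 3.0, "C": 5.0}
root_value, _, found = contrasts(tree, traits)
```

A tree with n tips yields n - 1 contrasts; here the first comes from the A-B pair and the second compares their estimated ancestor with C.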
David Ackerly
(dackerly (at) stanford.edu)
of the Department of Biological Sciences, Stanford University, Stanford,
California has released ACAP 2 (Another Comparative Analysis
Program) to carry out independent contrasts methods for comparative
analysis. It also incorporates linear parsimony methods into
the program, in order to calculate consistency indices for continuous
characters. The program is written in Think Pascal for Macintosh Mac OS
systems, and is available from
its web site
at http://www.stanford.edu/~dackerly/ACAP.html
as a Macintosh executable which will run on 68k Macintosh or PowerMacintosh
Mac OS computers.
Ehab Abouheif
(ehab.abouheif (at) staff.mcgill.ca) of
the Department of Biology, McGill University, Montréal, Québec
has written (together with J. Reeve)
Phylogenetic Independence version 2.0. It carries out
Abouheif's Test For Serial Independence (TFSI) on continuously valued
characters and his Runs Test on discretely valued characters. These are
described in his paper: Abouheif, E. 1999. A method to test the assumption of
phylogenetic independence in comparative data. Evolutionary Ecology
Research 1: 895-909. The program is available as Windows
and Linux executables at its web site
at http://ww2.biology.mcgill.ca/biology/faculty/abouheif/programs.html.
Mark Pagel and Andrew Meade
of the School of Biological Sciences,
of the University of Reading, Reading, U.K.
(m.pagel (at) reading.ac.uk)
have released BayesTraits, a Bayesian package of programs
to analyze state evolution of discrete and continuous traits. It
performs analyses of trait evolution among groups of species for which a
phylogeny or sample of phylogenies is available. It incorporates their earlier
and separate programs Multistate, Discrete and Continuous. BayesTraits can be
applied to the analysis of traits that adopt a finite number of discrete
states, or to the analysis of continuously varying traits. Hypotheses can be
tested about models of evolution, about ancestral states and about
correlations among pairs of traits. Parts of the package are described in
these papers:
- Pagel, M., A. Meade and D. Barker. 2004.
Bayesian estimation of ancestral character states on phylogenies. Systematic Biology 53: 673-684.
- Pagel, M., and A.Meade. 2006. Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. American Naturalist 167: 808-825.
- Barker, D. and Pagel, M. 2005. Predicting functional gene links using phylogenetic-statistical analysis of whole genomes. PLoS Computational Biology 1: 24-31.
BayesTraits is available as Windows executables, Linux executables,
Powermac Mac OS X executables, and Intel Mac OS X executables. It can be
downloaded from
its web site
at http://www.evolution.rdg.ac.uk/BayesTraits.html
Aaron King and Marguerite Butler
of the Department of Ecology and Evolutionary Biology
of the University of Michigan, Ann Arbor, Michigan and the Department of Zoology of the University of Hawaii at Manoa
(aaron.king (at) umich.edu and mbutler (at) hawaii.edu)
have written OUCH
(Ornstein-Uhlenbeck models for phylogenetic Comparative Hypotheses),
version 1.1-2, an R package using the Ornstein-Uhlenbeck model for comparative methods tests.
The package fits different versions of Ornstein-Uhlenbeck models to comparative
data, as described by Thomas Hansen. It is described in the paper:
Butler, M. A. and A. A. King, 2004. Phylogenetic comparative analysis: a modeling approach for adaptive evolution. American Naturalist 164: 683-695.
It is available as an R package, and requires the R statistical computing
environment to be available. It can be downloaded from
its web site
at http://tsuga.biology.lsa.umich.edu/king/ouch/
Chunghau Lee and Todd Oakley
of the Ecology, Evolution and Marine Biology Department
of the University of California at Santa Barbara, Santa Barbara, California
(chunghaulee (at) gmail.com) and (oakley (at) lifesci.ucsb.edu)
have released CoMET
(COntinuous-character Model Evaluation and Testing),
version of 081306, a Mesquite module for computing likelihoods for a given tree with Brownian motion models. CoMET is a module for the Mesquite Project. Give it a tree topology as a starting point and some phenotypic data, and CoMET tells you the likelihoods of the data evolving through nine different evolutionary models, including both gradualist and punctuational models.
It is described in the paper:
Lee, C., S. Blay, A. Ø. Mooers, A. Singh, and T. H. Oakley. 2006. CoMET: A Mesquite package for comparing models of continuous character evolution on phylogenies. Evolutionary Bioinformatics Online 2: 193-196.
CoMET requires the Mesquite Java
framework for phylogenies.
It is available as Java source code and Java executables. It can be downloaded from
its web site
at http://www.lifesci.ucsb.edu/eemb/labs/oakley/software/comet.htm
Brian O'Meara
of the Center for Population Biology
of the University of California, Davis
(bcomeara (at) ucdavis.edu)
has released Brownie
version 2.0, a program for analyzing rates of continuous character evolution.
Brownie looks for substantial rate differences in different parts of a tree
using likelihood ratio tests and Akaike Information Criterion (AIC) statistics.
Brownie 2.0 can read Nexus tree and data files and can perform analyses across
a set of weighted input trees (such as trees weighted by posterior
probabilities or bootstrap values) in order to deal with tree uncertainty.
Brownie deals with subtrees (using the censored test of O'Meara et al) and asks
whether the subtrees differ in rates of evolution.
An earlier version, Brownie 1.0, is written as a series of MATLAB routines and can be used on systems that have MATLAB installed.
It is described in the paper:
O'Meara, B.C., C. Ané, M.J. Sanderson, and P.C. Wainwright. 2006.
Testing for different rates of continuous trait evolution using likelihood. Evolution 60(5): 922-933.
It is available as C++ source code, Windows executables and Mac OS X universal executables. It can be downloaded from
its web site
at http://www.brianomeara.info/brownie/
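The two kinds of comparison mentioned above can be sketched in a few lines of Python. This is only an illustration of the arithmetic of a likelihood ratio statistic and of AIC scores, not Brownie's own code; the log-likelihoods and parameter counts are made-up placeholder values for a hypothetical one-rate and two-rate model.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def likelihood_ratio_statistic(lnl_simple, lnl_complex):
    """Twice the log-likelihood gain of the more complex (nested) model."""
    return 2 * (lnl_complex - lnl_simple)


# Hypothetical fitted values, for illustration only.
lnl_one_rate, k_one_rate = -42.0, 2   # one rate plus the root state
lnl_two_rate, k_two_rate = -38.5, 3   # two rates plus the root state

lr = likelihood_ratio_statistic(lnl_one_rate, lnl_two_rate)
delta_aic = aic(lnl_one_rate, k_one_rate) - aic(lnl_two_rate, k_two_rate)
print("LR statistic:", lr)
print("AIC difference (one-rate minus two-rate):", delta_aic)
```

The likelihood ratio statistic would ordinarily be compared with a chi-square distribution whose degrees of freedom equal the difference in the number of parameters (here 1), while a positive AIC difference favors the two-rate model.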
Emmanuel Paradis
(paradis (at) isem.univ-montp2.fr) of the Institut
des Sciences de l'Évolution
of the Université Montpellier II and the CNRS, Montpellier, France
has written APE, a package in the R statistical
and graphical language which carries out a variety of phylogeny analyses,
including computation of distances from sequences and gene frequencies,
comparative methods, analyses of diversification, computation of minimum
spanning trees, and estimation of rates of evolution and smoothing of rates
in neighboring lineages. Some code was contributed by a number of other
people, including Korbinian Strimmer, Julien Claude, Gangolf Jobb, Rainer Opgen-Rhein, Julien Dutheil, Yvonnick Noel, and Ben Bolker. APE is described in
a paper: Paradis E., J. Claude and K. Strimmer. 2004. APE: analyses of
phylogenetics and evolution in R language. Bioinformatics 20:
289-290. R, a clone of the more commercialized S language,
is available in Windows, Mac OS, and Linux versions. APE can be downloaded
from
its page at the CRAN-R archive site of R programs at
http://cran.r-project.org/src/contrib/Descriptions/ape.html; it is described and some additional documentation links given at
its web site
at http://pbil.univ-lyon1.fr/R/ape/
Ramón Díaz-Uriarte, of the Bioinformatics Unit of the Spanish National Cancer Center (CNIO), Madrid, Spain (rdiaz (at) ligarto.org) and Theodore
Garland, Jr, of the Department of Biology of the University of California,
Riverside (garland (at) ucr.edu)
have released PHYLOGR, version 1.0.4. This is a package of
programs in the R statistical language (and also available as a Windows
executable) to carry out comparative methods analyses, particularly ones
using the Generalized Least Squares method. The package is distributed
through the
web site of the Cran-R (Comprehensive R Archive Network) project
at http://cran.r-project.org/src/contrib/Descriptions/PHYLOGR.html
Ted Garland
of the Department of Biology
of the University of California at Riverside
(tgarland (at) citrus.ucr.edu)
distributes PHYSIG
(PHYlogenetic SIGnal),
a MATLAB package of modules for comparative methods analysis. PHYSIG is a
package of modules in MATLAB that includes tests for phylogenetic signal in
continuous-character data, tests for the attraction to the mean in an
Ornstein-Uhlenbeck model, and tests of evolutionary covariation between
characters by a Generalized Least Squares method. It can be obtained by
emailing Garland at the above address (it is not available by web download).
The methods are described in the paper:
Blomberg, S. P., T. Garland, Jr., and A. R. Ives. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution 57: 717-745.
It is available as a MATLAB package, which can be downloaded from
its web site
at http://biology.ucr.edu/people/faculty/Garland/PHYSIG.html
Dylan Schwilk,
an ecologist with the Sequoia - Kings Canyon Research Station
of the U. S. Geological Survey
(dylan (at) schwilk.org)
has written Cactus-Pie
(Python version of CACTUS: Comparative Analysis of Continuous Traits Using Statistics),
version 0.3.1, a program for comparative analysis of continuous traits.
Cactus-Pie is a version of CACTUS written in Python. It consists of a front end program and a number of back-end modules. They include programs that calculate independent contrasts, provide the Divergence Order Test (DOT) and
the Synchronized Changes Test (SvS), assign branch lengths to a phylogeny according to several different possible algorithms, and label clades in a phylogeny as defined by a taxa list in an auxiliary file.
It is available as a Python script which can run on computers that have the
Python language, which is widely available on Linux and Unix systems and
is standard on Mac OS X. Cactus-Pie can be downloaded from
its web site
at http://www.pricklysoft.org/software/cactus-pie.html
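For readers unfamiliar with independent contrasts, here is a minimal Python sketch of the standard calculation under Brownian motion. It is not taken from Cactus-Pie; the nested-tuple tree representation, the function name and the example values are assumptions made purely for illustration.

```python
import math


def contrasts(node):
    """Return (value, adjusted_branch_length, list_of_contrasts).

    A tip is (trait_value, branch_length); an internal node is
    (left_subtree, right_subtree, branch_length).
    """
    if len(node) == 2:                       # tip
        value, length = node
        return value, length, []
    left, right, length = node
    x1, v1, c1 = contrasts(left)
    x2, v2, c2 = contrasts(right)
    # Contrast between the two descendants, standardized by its variance.
    standardized = (x1 - x2) / math.sqrt(v1 + v2)
    # Node value: average of the descendants weighted by 1/v1 and 1/v2.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # The branch below the node is lengthened to allow for estimation error.
    adjusted = length + v1 * v2 / (v1 + v2)
    return value, adjusted, c1 + c2 + [standardized]


# Hypothetical tree ((A:1, B:1):1, C:2) with trait values A=1, B=3, C=6.
tree = (((1.0, 1.0), (3.0, 1.0), 1.0), (6.0, 2.0), 0.0)
root_value, _, result = contrasts(tree)
print(result, root_value)
```

A tree with n tips yields n - 1 contrasts, which can then be used in ordinary regression or correlation analyses.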
Liam Revell
of the Department of Organismic and Evolutionary Biology
at Harvard University
(lrevell (at) fas.harvard.edu)
has released pcca
(Phylogenetic Canonical Correlation Analysis),
version Beta, a program for phylogenetic canonical correlation analysis. pcca
uses a phylogenetic tree and measurements for an arbitrary number of
continuous characters to perform a PGLS transformation of the data and then
calculate canonical scores, weights, and correlations, and conduct hypothesis
tests about the canonical correlations. Canonical correlation analysis (CCA)
is a procedure using two sets of variables and calculating the linear
combinations of each that have highest correlation with each other.
The program can also estimate a multivariate version of Pagel's lambda
transformation of time, and perform the analyses under that transformation.
It is available as Windows executables, Linux executables and Mac OS X universal executables. It can be downloaded from
its web site
at http://anolis.oeb.harvard.edu/~liam/programs/
David Posada (dposada (at) uvigo.es), of the
Universidad Vigo, Spain, and Taylor Maxwell and
Alan Templeton, of the Department of Biology of Washington University, Saint
Louis, Missouri (temple_a (at) biology.wustl.edu)
have written TreeScan, version 0.9. This is a program to
test the association of continuous quantitative characters with a
tree of haplotypes. It is described in a paper: Templeton, A. R.,
T. Maxwell, D. Posada, J. H. Stengård, E. Boerwinkle, and C. F. Sing. 2005.
Tree scanning: a method for using haplotype trees in phenotype/genotype
association studies. Genetics 169: 441-453. TreeScan is
provided in C source code, and also as a DOS and Windows executable or
a Mac OS X executable. It is available from
its web site
at http://darwin.uvigo.es/software/treescan.html
Dolph Schluter
(schluter (at) zoology.ubc.ca)
of the Department of Zoology and Biodiversity Research Centre
University of British Columbia, in Vancouver, Canada, has released
ANCML, a program which
estimates ancestor states for a continuous trait, and provides a
"standard error" for the marginal distribution of each estimate. The method is described in
Schluter, D., T. Price, A. Ø. Mooers and D. Ludwig. 1997. Likelihood of ancestor states in
adaptive radiation. Evolution 51: 1699-1711. The method assumes a Brownian motion model
for the evolution of the trait. ANCML was written by modifying the program
CONTRAST in PHYLIP version
3.5, and it uses similar input conventions.
ANCML is available from
its web page
at http://www.zoology.ubc.ca/~schluter/ancml.html. It
is available as generic C source code and as a DOS executable.
Bill Bruno
(billb (at) lanl.gov) of the
Theoretical Biology and Biophysics Group T10, Los Alamos Scientific
Laboratory, has produced RIND, (Reconstructed INDependence),
a program which takes a tree supplied by the user, or uses a distance
method of the user's choosing (one which can be found in PHYLIP), and computes a maximum likelihood estimate of the
number of times each residue in aligned protein sequences was replaced
in each position. The method is described in: Bruno, W. J. 1996.
Modeling residue usage in aligned protein sequences via maximum likelihood.
Molecular Biology and Evolution 13: 1368-1374.
RIND is available from
its web site
at http://www.t10.lanl.gov/billb/rind/ as C source code
for a Unix environment, and assumes that
PHYLIP is also installed.
Steve Woolley, then of the Department of Computer Science, Brigham Young
University, Provo, Utah, together with Justin Johnson, Matthew Smith,
Keith Crandall, and David McClellan of the Department of Integrative
Biology (whose email address is David_McClellan (at) byu.edu)
of that university have released TreeSAAP version
3.2, (Selection on Amino Acid Properties),
a program that analyzes the frequency of change in various
properties of amino acids in sequences evolving on well-corroborated
phylogenies supplied by the user. TreeSAAP is given nucleotide sequences
and a user-defined tree, and optimizes the placement of changes
on the tree. It then calculates measures of departure of these changes from
models of uniform neutral substitution, with respect to different
properties of the amino acids. The methods are described in a paper:
Woolley, S., J. Johnson, M. J. Smith, K. A. Crandall, and D. A. McClellan.
2003. TreeSAAP: Selection on Amino Acid Properties using phylogenetic trees.
Bioinformatics 19: 671-672. The program is
a Java application. It is distributed as Windows executables,
Mac OS X executables, and as Java source code from
the McClellan lab software
web site at http://inbio.byu.edu/faculty/dam83/cdm/
Marcin Joachimiak
(marcin (at) compbio.berkeley.edu) of the
Computational Genomics Research Group at University of California, Berkeley
has released Jevtrace version 3.01. This is a Java
package to analyze the distribution of amino acid changes across a
phylogeny, and use protein structures to identify residues
based on their distribution in the tree and in the protein structure.
The program reads a multiple sequence alignment, a tree, and can also
read a PDB protein structure. It displays the tree on either a branch
length scale or a sequence similarity scale and allows the user to select
clades of interest. The residues conserved in these clades and differing
between them are then found and can then be viewed on the three-dimensional
structure of the protein. Jevtrace is described in
an electronic publication:
Joachimiak, M. P. and F. E. Cohen. 2003.
JEvTrace: refinement and variations of the evolutionary trace in JAVA.
Genome Biology 3 (12)
http://genomebiology.com/2002/3/12/research/0077. The Jevtrace
home page is available at http://www.cmpharm.ucsf.edu/~marcinj/JEvTrace/
at Fred Cohen's lab at University of California, San Francisco, where the
program was written. The software cannot be directly downloaded from there:
the user should fill out the Academic Software License form and submit it.
Distribution is free for academic institutions (prices for others
are not stated). The user will be emailed a link to download the software.
Xun Gu, of the Department of Genetics, Development and Cell Biology
and the Center for Bioinformatics and Biological Statistics at
Iowa State University, Ames, Iowa (xgu (at) iastate.edu) and Kent Vander Velden, then at that
University, have released DIVERGE version 1.04. DIVERGE
reads protein sequences and either infers a tree by Neighbor-Joining or lets
you read in a tree that you supply. It then allows you to define two
clades in the tree, and tests whether the pattern of rates of change at
different sites differs in these two clades. The statistical method used,
which is a likelihood ratio test based on a probabilistic model, is given in
the paper: Gu, X. 1999. Statistical methods for testing functional divergence
after gene duplication. Molecular Biology and Evolution 16:
1664-1674. The program is downloadable as either Windows or Linux executables
from
the Gu laboratory software web site
at http://xgu.zool.iastate.edu/software.html
Kent Fiala
(fiala (at) ipass.net) (most recently of SAS Institute) produced CLINCH
(CLadistic INference by Compatibility of Hypothesized characters) version 6.2.
It is a general-purpose compatibility program capable of handling multiple
unordered states. It is available as a DOS executable, including FORTRAN
source code, from the
Digital Taxonomy web page at
http://www.geocities.com/RainForest/Vines/8695/software.html#Cladistics.
Mark Wilkinson,
of the Department of Zoology, The Natural History Museum, London, U.K.
(marw (at) nhm.ac.uk) has
written PICA, version 4.0, a package of programs for character weighting
and randomization tests for compatibility analysis for
0/1 or multistate characters. These carry out a variety of tests for
nonrandomly compatible characters and include methods developed by Sharkey,
Le Quesne, Meacham and Alroy. They include the ability to process data that
reflect the splits method of Bandelt and Dress.
The programs are available as a package of DOS executables, from
his software
web site at
http://www.nhm.ac.uk/research-curation/projects/software/mwphylogeny.html .
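The basic test underlying compatibility analysis of 0/1 characters can be sketched briefly in Python. This is the standard pairwise test (two binary characters conflict only if all four state combinations occur among the taxa), not code from PICA; the function names and the small data set are hypothetical, and missing data and the randomization tests are not handled.

```python
from itertools import combinations


def compatible(char_a, char_b):
    """Two 0/1 characters are compatible unless all four state pairs occur."""
    observed = set(zip(char_a, char_b))
    return len(observed) < 4


def compatibility_counts(matrix):
    """For each character, count the other characters it is compatible with.

    matrix maps a character name to its list of 0/1 states, one per taxon.
    """
    counts = {name: 0 for name in matrix}
    for a, b in combinations(matrix, 2):
        if compatible(matrix[a], matrix[b]):
            counts[a] += 1
            counts[b] += 1
    return counts


# Hypothetical data: three characters scored for five taxa.
characters = {
    "c1": [0, 0, 1, 1, 1],
    "c2": [0, 0, 0, 1, 1],
    "c3": [0, 1, 0, 1, 0],
}
counts = compatibility_counts(characters)
print(counts)
```

A randomization test would compare counts such as these with the counts obtained after repeatedly permuting the states of each character among the taxa.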
Christopher Meacham, affiliated with the University
Herbarium, University of California, Berkeley
produces COMPROB, a Pascal
program to compute probabilities that characters would be compatible at random,
thus telling us which clique is "most surprising". He can be contacted as
meacham (at) socrates.berkeley.edu about receiving a copy. The program is free.
John Armstrong, Adrian Gibbs, R. Peakall and George Weiller,
(John.Armstrong (at) anu.edu.au) of Mark Gibbs's group
at the School of Botany and Zoology of the Australian
National University, Canberra, have produced RAPDistance
version 2.00, a package for DOS for computing distance matrices for RAPD
analyses. Version 1.04 is also available; it has slightly more
functionality but cannot handle data sets as large as version 2.00.
RAPDistance has a comprehensive range of options for creating data
files, editing them and using application programs to analyse them.
It can export data sets in the formats of several other packages.
RAPDistance is available free
on the
web at http://www.anu.edu.au/BoZo/software/index.html.
Peter Reeves and colleagues
of the School of
Molecular and Microbial Biosciences at Sydney University, Australia, have
produced MULTICOMP, a program for computing various distances from sequence
data. It can also do sequence format conversions, compute various descriptive
statistics on the sequences, and can submit the sequences to two programs
from PHYLIP. It is described in a
paper: Reeves, P. R., L. Farnell and R. Lan.
1994. MULTICOMP: a program for preparing sequence data for phylogenetic
analysis. Computer Applications in the Biosciences (CABIOS) 10:
281-284.
I do not know what computer systems it runs on; perhaps it
is a DOS program. Reeves may be
contacted at reeves (at) angis.usyd.oz.au for distribution information.

Stuart Ray of the Division of Infectious Diseases at the Johns Hopkins University School of Medicine, Baltimore, Maryland (sray (at) jhmi.edu) has produced NimbleTree
version 2.6, a program that submits data sets to PAUP* or PHYLIP.
NimbleTree reads a variety of different sequence alignment formats, and allows you to more easily submit the resulting data sets to PHYLIP or to PAUP*. For PAUP* you need to have that program already installed on your computer. Some PHYLIP source code from its version 3.5p is included in NimbleTree (with my agreement). (I am not quite sure which methods from PHYLIP are available in this program).
It is available as Windows executables.
It can be downloaded from http://sray.med.som.jhmi.edu/SCRoftware/nimbletree/
Microsat,
by Eric Minch, then of
the Department of Human Genetics, Stanford University, Stanford, California
(his email address is now eric.minch (at) lionbioscience.com)
is a program for calculating distances from
microsatellite data. It uses the methods developed by David Goldstein et al.,
and presented in their papers of 1995 in Proceedings of the National Academy
of Sciences USA
92: 6720-6727 and Genetics 139: 463-471. The distance is based on
the mean microsatellite array size, implementing the "Δμ" distance that they
defined, which corrects for within-population variability and provides a
distance that is independent of population size. It is available for free
from a page in
Luca Cavalli-Sforza's lab web site at
http://hpgl.stanford.edu/projects/microsat/.
The program is written in ANSI C. Source code is distributed, and
so are executables for DOS and Mac OS.
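A distance of this kind can be illustrated with a short Python sketch. It assumes the usual form of the measure, the squared difference between the mean allele sizes (repeat counts) of two populations, averaged over loci; the exact estimator used by Microsat may differ, and the function names and data below are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def delta_mu_squared(pop_a, pop_b):
    """Squared difference in mean allele size, averaged over shared loci.

    pop_a and pop_b map a locus name to a list of allele sizes
    (repeat counts) sampled from that population.
    """
    loci = sorted(set(pop_a) & set(pop_b))
    per_locus = [(mean(pop_a[locus]) - mean(pop_b[locus])) ** 2
                 for locus in loci]
    return mean(per_locus)


# Hypothetical samples at two loci.
population_a = {"L1": [10, 12, 12, 14], "L2": [20, 20, 22, 22]}
population_b = {"L1": [14, 16, 16, 18], "L2": [21, 21, 21, 21]}
distance = delta_mu_squared(population_a, population_b)
print(distance)
```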
Daniel Dieringer (daniel.dieringer (at) i122server.vu-wien.ac.at)
and Christian Schlötterer of the
University of Vienna have produced MSA (MicroSatellite
Analyzer) version 3.15, a program for handling large microsatellite data
sets. It can calculate many descriptive statistics for these data sets,
can convert the data into a variety of file formats for other programs,
and can also calculate a variety of distance measures.
It is described in a paper: Dieringer, D., and C. Schlötterer. 2003.
Microsatellite analyser (MSA): a platform independent analysis tool for large
microsatellite data sets. Molecular Ecology Notes 3: 167-169.
MSA is available as source code in C plus executables for Linux,
Windows (and DOS), Mac OS, and Mac OS X from
its web site
at http://i122server.vu-wien.ac.at/MSA/MSA_download.html.
Georg Weiller, of the Genomic Interactions Group at the Research School of
Biological Sciences, Australian National University,
Canberra, Australia (georg.weiller (at) anu.edu.au) has produced
DIPLOMO (DIstance PLOt MOnitor) version 1.03. It compares
different distance measures with each other
by displaying them as a scatter plot. It then helps one
instantly identify all individual comparisons within the plot. Individual
taxa can be excluded or included in the plots. DIPLOMO enables you to see
whether different taxa have different mutational characteristics (such as
having relatively more transitions in some lineages), and whether
different distance measures correlate. The program takes as input a file
with several different distance matrices. This file is in a simple format
which can readily be produced by editing distance matrices produced by other
packages. A program to compute the distance matrices is currently under
development. Although DIPLOMO is intended to be ported to multiple platforms
the current version runs on DOS on PC-compatibles. DIPLOMO is free; it
can be obtained from
its web site
at http://life.anu.edu.au/molecular/software/diplomo/.
It is described in
a publication: Weiller, G. F. and A. Gibbs. 1995. DIPLOMO: The tool for a
new type of evolutionary analysis. CABIOS 11: 535-40.
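The idea of plotting one distance measure against another can be sketched in Python as follows. This is not DIPLOMO's input format or code; it only shows, for two hypothetical distance matrices on the same taxa, how each pair of taxa becomes one labelled point of the scatter plot.

```python
from itertools import combinations


def paired_distances(taxa, matrix_x, matrix_y):
    """Return one (taxon_i, taxon_j, x, y) point per pair of taxa."""
    points = []
    for i, j in combinations(range(len(taxa)), 2):
        points.append((taxa[i], taxa[j], matrix_x[i][j], matrix_y[i][j]))
    return points


# Hypothetical counts of transitions and transversions between three taxa.
taxa = ["A", "B", "C"]
transitions = [[0, 2, 5], [2, 0, 4], [5, 4, 0]]
transversions = [[0, 1, 3], [1, 0, 3], [3, 3, 0]]

points = paired_distances(taxa, transitions, transversions)
for point in points:
    print(point)
```

Keeping the taxon names with each point is what allows an individual comparison to be identified in the plot, and a taxon to be excluded simply by dropping the points that involve it.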
Oclair Prado (oclair (at) cpqd.com.br) and Fernando Von Zuben
(vonzuben (at) dca.fee.unicamp.br) of the Department of Computer Science of the University of Campinas,
Brazil released the Phylogenetic Tree Project (PTP) genetic
algorithms toolbox, version 1.0. It infers phylogenies by maximum likelihood
or a distance matrix method,
using an evolutionary computation strategy which represents a phylogeny by
a genotype, simulates natural selection to choose among phylogenies and
recombination to rearrange phylogenies. It is a Windows executable,
which can be downloaded by ftp from
file PTP.zip at ftp.dca.fee.unicamp.br in folder
/pub/docs/vonzuben/oclair.
Alan R. Lemmon
(alemmon (at) evotutor.org)
and Michel Milinkovitch (mcmilink (at) ulb.ac.be) of the
Unit of Evolutionary Genetics, Institute of Molecular Biology and Medicine,
Free University of Brussels, Belgium
have written MetaPIGA (Phylogeny Inference using the MetaGA),
a program searching for maximum likelihood phylogenies using a genetic
algorithm with metapopulations. This allows different populations in the
genetic algorithm to arrive at different solutions, which can then be
combined by migration and recombination. It is said to make possible
effective maximum likelihood inference of phylogenies for data sets with
hundreds of species.
MetaPIGA's approach is described in a paper: Lemmon, A. R. and M. C. Milinkovitch. 2002. The metapopulation genetic
algorithm: An efficient solution for the problem of large phylogeny estimation.
Proceedings of the National Academy of Sciences, USA 99:
10516-10521.
It is available as Java code from
its web site at http://www.ulb.ac.be/sciences/ueg/html_files/MetaPIGA.html.
Derrick Zwickl
of the National Evolutionary Synthesis Center
at Duke University, Durham, North Carolina
(zwickl (at) nescent.org)
released GARLI
(Genetic Algorithm for Rapid Likelihood Inference),
version 0.951, a program using a genetic algorithm to search for maximum likelihood phylogenies. GARLI uses a genetic algorithm to perform heuristic phylogenetic searches under the General Time Reversible (GTR) model of nucleotide substitution and its submodels, with or without gamma distributed rate heterogeneity and a proportion of invariant sites. Its likelihood computations are equivalent to those in PAUP*. It can read NEXUS and PHYLIP format sequence files. An MPI algorithm is included for use on parallel computing clusters (the parallel version seeks to perform a more thorough tree search and does not reduce runtimes).
The program was written when Zwickl was a graduate student at the
Department of Zoology of the University of Texas at Austin.
It is described in his thesis:
Zwickl, D. J. 2006. Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Ph.D. dissertation, The University of Texas at Austin.
It is available as C++ source code, Windows executables and Mac OS X universal executables. It can be downloaded from
its web site
at http://www.bio.utexas.edu/faculty/antisense/garli/Garli.html
MUST 2000, a package of sequence management programs,
was developed by Hervé Philippe (herve.philippe (at)
umontreal.ca), of the Département de Biochimie,
Université de Montréal, Québec. It is
intended as complementary to existing phylogeny and alignment programs and can
produce output files in the formats of PHYLIP, PAUP*, Hennig86, and CLUSTAL. It
contains a variety of sequence input, editing, checking, and storage functions,
as well as a sequence editor and a phylogeny plotter. It also allows further
analyses of the results from these phylogeny programs.
The original version of MUST is described in a paper: Philippe, H. 1993.
MUST, a computer package of management utilities for sequences and trees.
Nucleic Acids Research 21: 5264-5272.
MUST 2000 is available as a Windows program from
a download page
at the University of Montpellier at
http://www.isem.univ-montp2.fr/PPP/PM/RES/Info/@Softwares.php#MUST2000.
Steve Smith, formerly of the Harvard Genome Laboratory, has written
an X-Windows interactive sequence editor, GDE (Genetic Data Environment), version 2.2, which
allows the user to edit sequences and align them by hand, and to select subsets
of sites and sequences and call a variety of analysis programs including
ClustalV and many of the PHYLIP 3.5 programs. The GDE 2.2 system will run on
Unix or Linux systems using the X windowing system. It also includes the
TreeTool tree-plotting program (see below). GDE has been described in two
papers:
- Smith, S. W., R. Overbeek, C. R. Woese, W. Gilbert, and P. M. Gillevet.
1994. The genetic data environment: an expandable GUI for multiple sequence
analysis. Computer Applications in the Biosciences 10:
671-675.
- De Oliveira, T., R. Miller, M. Tarin, and S. Cassol. 2003.
An integrated genetic data environment (GDE)-based LINUX interface for analysis
of HIV-1 and other microbial sequences. Bioinformatics 19:
153-154.
GDE 2.2 is free and is available
as source code in C and as binaries for SunOS, Sun Solaris, and Linux on
a web page
at the IUBIO molecular biology software server
http://iubio.bio.indiana.edu/soft/molbio/unix/GDE/
and it is also available by ftp at the Pasteur Institute software archive at
ftp://ftp.pasteur.fr/pub/GenSoft/unix/evolution/GDE/.
An earlier version of GDE is available at the EBI Software Archive at
ftp://ftp.ebi.ac.uk/pub/software/unix/
GDE is also available in some other forms:
- At a web page at
Michigan State University at
http://www.msu.edu/~lintone/macgde/
MacGDE is a Mac OS X version that runs under X windows, ported by Tim
Littlejohn (tim (at) biolateralgroup.com) at BioLateral Group
in Sydney, Australia.
- At the Bioafrica web site, a
download page
http://www.bioafrica.net/GDElinux/index.html
makes available GDElinux, the Linux port of GDE 2.2. It also has MacGDE
available for download.
GDE is the precursor of the SeqLab interface used with the (now unavailable) GCG Wisconsin Package of sequence search and analysis programs.
Brian Fristensky
of the Department of Plant Science
of the University of Manitoba, Winnipeg, Manitoba, Canada
(frist (at) cc.umanitoba.ca)
distributes BIRCH
(BIological Research Computer Hierarchy),
version 2.1, a sequence management and submission system with molecular
databases.
BIRCH consists of scripts and programs for a wide variety of analyses. It uses
the GDE system for sequence editing and
submission to programs. Some of the programs include many from
PHYLIP,
fastDNAml,
Phylo_win,
ATV,
and ClustalW. BIRCH when installed
also includes copies of molecular databases; the user is instructed on how
to download these from their distribution sites.
BIRCH is written in C and in Python and Java. It is distributed in source code and can be compiled on
Linux and Solaris. The programs are distributed as binaries. The
main BIRCH web site is at
http://home.cc.umanitoba.ca/~psgendb/.
It is described in the
book chapter: Fristensky, B. 1999. Building a multiuser sequence analysis facility using freeware. pp 131-145 In S. Misener and S. A. Krawetz. Methods in Molecular Biology, vol. 132: Bioinformatics Methods and Protocols. Humana Press, Totowa, NJ, USA.
BIRCH can be downloaded from
its web site
at http://www.umanitoba.ca/faculties/afs/plant_science/psgendb/FTP/BIRCH/download.html
Don Gilbert
(gilbertd (at) bio.indiana.edu) of the
Department of Biology of Indiana University, has written
SeqPup versions 0.9 and 0.6, a biological sequence editor and analysis program
usable on MacOS, Windows and Unix systems. It allows alignment of
sequences by hand and submission of selected parts of selected sequences to
phylogeny programs, as well as to network services such as BLAST. It is
available in a more
complete earlier version (0.7) written in C++, or a later (0.9) version
written in Java. The latter will work on all systems that have the
Java 1.1 runtime environment. It will not work on the Java 1.2 runtime
environment, so the earlier Java 1.1 environment needs to be installed as
well if the later version of Java is present. (I do not know whether
SeqPup works with later versions of Java).
The two versions of SeqPup can be obtained from
a web page
at iubio.bio.indiana.edu, if one follows the Local Reference
link,
or by World Wide Web
at http://iubio.bio.indiana.edu/soft/molbio/seqpup. Versions
0.6 and 0.8 are also available
by ftp
from the software server at the Institut Pasteur at ftp.pasteur.fr
in directory pub/GenSoft/Macintosh/sequence_tools/.
GHE includes multiple sequence alignment with
ClustalW, and
phylogenetic analysis of alignments with the
fastDNAml and LSADT programs.
The earlier version (0.6) is available in C++, and
executables of it are available for MacOS (the PowerMac and 68K platforms),
Windows, and Unix X windows systems
including Sun Solaris, SGI Irix, Dec (HP/Compaq) Unix, Linux.
It is currently
the more complete version, since it can also run some
PHYLIP programs.
The C++ source code
for the earlier version is
available by anonymous ftp at: iubio.bio.indiana.edu in
directory util/dclap/source/.
Matthias Wolf, Joachim Friedrich, Thomas Dandekar, and Tobias Müller
of the Department of Bioinformatics
at the Biocenter, University of Würzburg, Germany
(joeMatthias.Wolf (at) biozentrum.uni-wuerzburg.de)
have written CBCAnalyzer
version 1.0.3, programs for inferring phylogenies based on compensatory base changes (CBCs).
The CBCAnalyzer (CBC = compensatory base change) is a custom written software toolbox consisting of three parts, CTTransform, CBCDetect, and CBCTree. CTTransform reads several CT-file formats (ct, RNAviz ct or Mac ct), and generates a so called "bracket-dot-bracket" format that specifies which sites are paired in an RNA structure. This typically is used as input for other tools such as RNAforester, RNAmovie or MARNA. The latter one creates a multiple alignment based on primary sequences and secondary structures that now can be used as input for CBCDetect. The count (distance) matrix of compensating changes obtained by CBCDetect is used as input for CBCTree that reconstructs a phylogram by using the BIONJ algorithm.
It is described in the paper:
Wolf, M., J. Friedrich, T. Dandekar and T. Müller. 2005. CBCAnalyzer: inferring phylogenies based on compensatory base changes in RNA secondary structures. In Silico Biology 5: 0027.
It is available as C++ source code which can be compiled on Linux, and as
Windows executables. It can be downloaded from
its web site
at http://cbcanalyzer.bioapps.biozentrum.uni-wuerzburg.de/
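The conversion from a CT file to bracket-dot-bracket notation can be illustrated with a small Python sketch. It is not CTTransform itself: it assumes the basic six-column CT layout (index, base, previous index, next index, pairing partner, numbering), a single header line, and a structure without pseudoknots; the function names and the example hairpin are hypothetical.

```python
def parse_ct(text):
    """Read the pairing-partner column (5th field) of a basic CT file.

    The first line is treated as the header and skipped.
    """
    rows = [line.split() for line in text.strip().splitlines()[1:]]
    return [int(fields[4]) for fields in rows if len(fields) >= 6]


def pairs_to_dot_bracket(partners):
    """partners[i] is the 1-based partner of position i + 1, or 0 if unpaired."""
    symbols = []
    for position, partner in enumerate(partners, start=1):
        if partner == 0:
            symbols.append(".")       # unpaired site
        elif partner > position:
            symbols.append("(")       # pairs with a later site
        else:
            symbols.append(")")       # pairs with an earlier site
    return "".join(symbols)


# Hypothetical eight-base hairpin in CT format.
example_ct = """8 example hairpin
1 G 0 2 8 1
2 C 1 3 7 2
3 A 2 4 0 3
4 A 3 5 0 4
5 A 4 6 0 5
6 A 5 7 0 6
7 G 6 8 2 7
8 C 7 0 1 8
"""
structure = pairs_to_dot_bracket(parse_ct(example_ct))
print(structure)
```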
Wolfgang Ludwig and Oliver Strunk
of the Lehrstuhl für
Mikrobiologie of the Technische Universität München
(wolfgang.ludwig (at) biol.chemie.tu-muenchen.de) distribute
ARB, an environment for 16S/18S/23S ribosomal RNA sequence
data. It provides a windowing environment for building up databases of
RNA sequences, aligning them, and searching, editing, modifying, aligning,
profiling, and constructing trees. ARB uses its own RNA sequence databases
which are made available to ARB over the Web. For phylogenies it uses
programs from
PHYLIP and
fastDNAml, as well as its own
ARB Neighbor-Joining program. ARB also incorporates a variety of other
sequence analysis software. It can handle large numbers of sequences and has
sophisticated tree drawing and manipulation. ARB is distributed as executables for
a variety of versions of Unix, requiring that the Motif library
be available. At the
moment these are: Solaris 5.x, SUN OS 4.1.x, Silicon Graphics 5.0,
Linux for PC, and Digital OSF.
ARB is available from
its web site
at http://www.arb-home.de/ or
by ftp from ftp://ftp.mpi-bremen.de/molecol_p/arb/.
A Mac OS X port of ARB is available at the University of Melbourne at http://www.microbiol.unimelb.edu.au/micro/staff/mds/ARB_OSX/ARB_to_MacOSX.html.
Ian Holmes
of the Department of Bioengineering
of the University of California at Berkeley, Berkeley, California
(ihh (at) berkeley.edu)
has released DART
(Dna, Amino and Rna Tests),
a package of programs to do a variety of genomic, alignment, and RNA structure
inferences. Various programs in the DART package do phylogenetic alignment;
RNA structure prediction and multiple RNA alignment, stochastic grammars,
inference of evolutionary models, phylo-HMMs and phylo grammars,
reconstruction of ancestral sequences; and phylogenomics. It contains a number
of programs including
- stemloc for RNA alignment
- xrate for phylo-grammars
- phylocomposer
- other statistical alignment programs in the Handel package
The programs use statistical algorithms (MCMC, EM) to impute multiple
alignments, annotations and other unseen evolutionary parameters from sequence
data. All are based on stochastic grammars or state-machine models of sequence
mutation and natural selection. Many other people (listed at the DART web
site) have contributed to the package.
A large number of papers on the methods are listed at the DART web site.
It is available as C source code. It can be downloaded from
its web site
at http://evolution.gs.washington.edu/phylip.html
Andrew Smith, Thomas W. H. Lui and Elisabeth Tillier
of the Ontario Cancer Institute and the Department of Medical Biophysics
at the University of Toronto, Canada
(e.tillier (at) utoronto.ca)
have released rRNA phylogeny, a program package to infer
phylogenies from ribosomal RNA using a model of substitution that allows for
compensating substitutions at paired sites. The program makes use of a model
(the OTRNA model) of ribosomal RNA substitution that has different rates for
paired and unpaired sites, reflecting the lower probability of a compensated
substitution that maintains the pairing. The model is empirically tabulated
from rRNA sequences and used in a modified version of programs from
PHYLIP to infer phylogenies. The
package also allows distances to be computed from the OTRNA model for use in
distance matrix programs.
The methods are described in the paper:
Smith, A., T. W. H. Lui and E. R. M. Tillier. 2004. Empirical substitution
models for Ribosomal RNA. Molecular Biology and Evolution 21:
419-427.
It is available as C source code, Windows executables and Linux executables.
It can be downloaded from
its web site
at http://www.uhnresearch.ca/labs/tillier/software.htm
Dick Hwang and Phil Green
of the Department of Genome Sciences
of the University of Washington, Seattle, Washington
(dhwang (at) u.washington.edu)
have released AMBIORE
(Applications for Mcmc Bayesian Inference Of Rates in Evolution),
version 1.00, which estimates rates of change in nucleotide sequences
depending on the two neighboring bases. It implements a flexible and
computationally efficient Bayesian Markov chain Monte Carlo approach to
estimating rates in evolution given a sequence alignment and tree topology
relating the species. The evolutionary model allows substitution rates at a
site to depend on the two flanking nucleotides, the branch of the phylogenetic
tree, and position within a sequence.
The methods are described in the paper:
Hwang D. G. and P. Green. 2004. Bayesian Markov chain Monte Carlo sequence
analysis reveals varying neutral substitution patterns in mammalian evolution.
Proceedings of the National Academy of Sciences U.S.A. 101:
13994-14001.
It is available as C source code. It can be downloaded from
its web site
at http://www.phrap.org/othersoftware.html
Henrik Nilsson and Bjørn Ursing
of the Botanical Institute
at the Göteborg University and the Center for Genomics and Bioinformatics at the Karolinska Institute, Stockholm, Sweden
(henrik.nilsson (at) botany.gu.se and bjorn.ursing (at) cgb.ki.se)
have produced galaxie, a package of CGI scripts for sequence
identification through automated phylogenetic analysis. galaxie is a server,
but also makes its scripts available for download. It is intended for identification of fungal EST sequences. It uses
BLAST, ClustalW and
PHYLIP to find a set of best matches to the EST sequence, then makes a phylogeny of these matches and the original sequence to help identify the sequence. The CGI scripts require that the user who installs them be familiar with such scripts and have a web server, and also have BLAST, ClustalW and PHYLIP installed on their computer.
It is described in the paper:
Nilsson, R. H., K.-B. Larsson, B. M. Ursing. 2004. galaxie - a CGI script package for sequence identification through automated phylogenetic analysis. Bioinformatics 20: 1447-1452.
It is available as a package of Perl scripts. It can be downloaded from
its web site at http://galaxie.cgb.ki.se/
Tom Hall of Ibis Therapeutics, Carlsbad, California
(thall (at) isisph.com) has produced
BioEdit, version 7.0.9. This is a sequence editor with
many kinds of general molecular biology functions available (alignment,
BLAST searches, plasmid drawing, restriction mapping, sequence machine trace
viewing, etc.). For our purposes the feature worth mentioning is that
it comes with a number of existing phylogeny programs which can be
automatically run from within BioEdit. These are: TreeView,
fastDNAml, and six DNA and protein
programs from PHYLIP. BioEdit
is available as Windows95/98/NT executables from
its web site at
http://www.mbio.ncsu.edu/BioEdit/bioedit.html.
GeneStudio, Inc.
of Suwanee, Georgia (info (at) genestudio.com)
has released GeneStudio Pro, a commercial package for
sequence analysis and sequence format conversion. (Incidentally, although the
Suwanee River is partly located in Georgia, the town of Suwanee is not
located "way down upon the Suwanee River" but is quite far from it). Included
is an Alignment Editor that can invoke a variety of phylogeny programs,
including fastDNAml,
some programs from PHYLIP,
Tree-puzzle
and TreeView. It can also
do many other functions such as sequence format conversion, BLAST searches,
and contig editing. GeneStudio Pro is for the Windows platform. A free
trial version is available. Prices are not given on their web site but are
available on request. For further information see the
GeneStudio Pro web site
at http://www.genestudio.com/genestudio.htm.

Stuart Ray, of the Division of Infectious
Diseases of the Department of Medicine at the Johns Hopkins University
School of Medicine, Baltimore, Maryland (sray (at) jhmi.edu) has produced Simplot, version 3.5.1. It is a Windows program that serves as a front end to either code
from PHYLIP or to PAUP*
and enables you to easily submit jobs to them.
Simplot enables you to select
regions and do other forms of data selection. It can also carry out the
"bootscanning" method of detecting inconsistencies in trees in different
regions of a sequence, which can be a signal for recombination.
It is distributed as a Windows executable through
its web site
at http://sray.med.som.jhmi.edu/SCRoftware/simplot/
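As a rough illustration of the sliding-window idea behind scans of this kind, here is a minimal Python sketch. It is not code from Simplot and it does not do bootscanning itself (which builds bootstrapped trees in each window); it only shows how an alignment can be cut into overlapping windows and a simple statistic (here the fraction of identical sites) computed in each. The function name and the default window and step sizes are hypothetical.

```python
def window_identity(query, reference, window=200, step=20):
    """Fraction of identical sites between two aligned sequences, per window.

    Returns a list of (window_start, identity) pairs. Names and defaults
    are illustrative assumptions, not taken from any real program.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    results = []
    # Slide the window along the alignment in increments of `step`.
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        same = sum(a == b for a, b in zip(q, r))
        results.append((start, same / window))
    return results
```

A sharp change in the statistic between neighboring windows, when the query is compared against several references, is the kind of signal that prompts a closer look for recombination.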
James J. Cai
of the Department of Biology
of Stanford University, Stanford, California
(jamescai (at) stanford.edu)
has written MBEToolbox
(Molecular Biology and Evolution Toolbox),
version 2.20, a MATLAB package to enable evolutionary biologists to analyze
and view DNA and protein sequences. MBEToolbox includes sequence manipulation
and statistics, evolutionary distance calculations, tree creation, a novel
window analysis method and a graphical user interface. Version 2.0 added new
functions for phylogenetic analyses by the maximum likelihood method, analysis
of site-specific evolutionary rates, and algorithms to detect recombination.
It is described in the papers:
- Cai, J. J., D. K. Smith, X. Xia, and K.-Y. Yuen. 2005. MBEToolbox: a
Matlab toolbox for sequence data analysis in molecular biology and evolution.
BMC Bioinformatics 6: 64 (22Mar2005)
- Cai, J. J., D. K. Smith, X. Xia, and K.-Y. Yuen. 2006. MBEToolbox 2.0: An
enhanced version of a MATLAB toolbox for Molecular Biology and Evolution.
Evolutionary Bioinformatics Online 2: 189-192.
It is available as a MATLAB package, and has versions for download to
Windows, Linux, and Sun Solaris systems. It requires MATLAB 6.2 or higher; the
parsimony and likelihood analyses also require programs from
PHYLIP. MBEToolbox can be downloaded
from its web site
at http://bioinformatics.org/mbetoolbox/
Salvador Ramirez
of the Departamento de Oceanografia
of the Universidad de Concepcion, Chile
(sram (at) profc.udec.cl)
has released Bosque Phylogenetic Analysis Software,
version 1.7.42, which integrates a set of phylogenetic analysis programs into a
complete graphical user interface. Bosque is a graphical program that
integrates console programs from the
PHYLIP package, from
TREE-PUZZLE, and from
MUSCLE.
As a graphical program it also includes sequence, alignment and tree editors
to ease the manipulation of the data, the execution of the programs and the
administration of the output result from the phylogenetic programs. It also
has the option to function as a client-server program where a Bosque server is
installed, thus allowing the remote execution of the mentioned phylogenetic
programs.
It is available as Windows executables and Linux executables. It can be
downloaded from
its web site
at http://bosque.udec.cl
John Archibald and Andrew Roger, of the
Department of Biochemistry and Molecular Biology, Dalhousie University,
Halifax, Nova Scotia, Canada (johna (at) hades.biochem.dal.ca
and aroger (at) is.dal.ca)
have released Likewind version 1.0. This is a set of
three Perl scripts that allow the user to create the necessary
PAUP* blocks to drive a likelihood
analysis of differences in phylogenies between different parts of a
molecule. The object is to detect recombination, hybridization, or
other sources of conflicting phylogenetic signal. The utilities also
harvest the output from PAUP* and present it to the user as a log-likelihood
difference plot. This shows the log-likelihood difference in the window
of adjacent sites between a predefined tree and the best estimate from the
local information. The scripts need
Seq-Gen and PAUP* present in a
Linux environment. The method has been described in a paper:
Archibald, J. M. and A. J. Roger. 2002. Gene conversion and the evolution of
euryarchaeal chaperonins: a maximum likelihood-based method for detecting
conflicting phylogenetic signals. Journal of Molecular Evolution
55: 232-245. Likewind is available from
its web site at
http://rogerlab.biochemistryandmolecularbiology.dal.ca/Software/Software.htm#likewind
Christian Zmasek, currently of the Genomics Institute
of the Novartis Research Foundation in San Diego, California
(czmasek (at) gnf.org) and Sean Eddy (eddy (at) genetics.wustl.edu)
of the Department of Genetics, Washington University, St. Louis, Missouri
have released FORESTER, version 1.92, a package of Java routines for
inferring gene function using the output of high-throughput genome sequencing.
It includes the RIO (Resampled Inference of Orthologs) method, which searches
for orthologs in the PFAM database, the SDI
(Speciation-Duplication Inference) method, and ATV, a tree viewer. ATV can take
a tree in the standard Newick format (and also in an extension that has
additional information), and display it in various forms. These are
described in papers:
- Zmasek, C. M. and S. R. Eddy. 2001. ATV: display and
manipulation of annotated phylogenetic trees. Bioinformatics 17:
383-384.
- Zmasek, C. M. and S. R. Eddy. 2001. A simple algorithm to infer gene duplication and speciation events on a gene tree. Bioinformatics 17: 821-828.
- Zmasek, C. M. and S. R. Eddy. 2002. RIO: Analyzing proteomes
by automated phylogenomics using resampled inference of orthologs. BMC Bioinformatics 3: 14.
The programs (in Java) are available from
the Forester web site
at http://www.genetics.wustl.edu/eddy/forester/.
A RIO web server is available from this group; see
the list of servers in these pages.
Louxin Zhang
(lxzhang (at) krdl.org.sg) of the
Internet Bioinformatics Group of the Internet Research and Development
Unit of the National University of Singapore has produced
WebPHYLIP, a web interface
for the PHYLIP package, which
can submit jobs to it. It can be obtained by e-mailing him at the
above address.
Biomatters, Ltd.
in Auckland, New Zealand
(sales (at) geneious.com)
has released Geneious
version 2.53, a Java package for sequence management, searching, editing and
phylogenies. Geneious is an integrated bioinformatics tool suite for
manipulating, finding, sharing, and exploring biological data such as DNA
sequences or proteins, phylogenies, 3D structure information, publications,
etc. It features tree-based progressive sequence alignment and phylogenetic
analysis by Neighbor-Joining and UPGMA, bootstrapping, drawing
the phylogeny, access to
biological databases, BLAST, protein structure viewing, NCBI, EMBL, PubMed
auto-find, and more. It includes an API for creating your own plugins.
It is available as Java executables. Geneious is available for free;
that version cannot edit sequences or annotations, cannot run the multiple
alignment program ClustalW from within Geneious, and cannot do some other
tasks. A full version, Geneious Pro, is available commercially.
Free downloads of Geneious or purchases of Geneious Pro can be done from
the Biomatters web site
at http://www.biomatters.com.
Geneious Pro is available at a full price of $395 (and a student price
of $175) from Biomatters, Ltd. Discount prices for multiuser licenses
are also available. An additional plugin, MrBayesPlugin, enabling it to run
MrBayes is available.
Catherine Letondal
of the Institut Pasteur, Paris, France
(letondal (at) pasteur.fr)
distributes PISE
(Pasteur Institute Software Environment),
a tool to generate Web interfaces for Molecular Biology programs. PISE allows
the user to generate web page interfaces to many programs, including PHYLIP and the EMBOSS package. Web
interfaces for these are available in PISE. The web interfaces allow you to
rapidly create a web server for running these programs. PISE is used to
create the Pasteur Institute web server. PISE web servers allow the results
of one program to be "piped" to another program. The web server generated is
designed to be run on a Unix machine. It is described in the paper:
Letondal C. 2001. A Web interface generator for molecular biology programs in
Unix. Bioinformatics, 17(1): 73-82.
It is available as Perl scripts. It can be downloaded from
its web site
at http://www.pasteur.fr/recherche/unites/sis/Pise/
The MathWorks
of Natick, Massachusetts
have produced Bioinformatics Toolbox
version 3.1, a MATLAB toolbox for bioinformatics. It has many functions for
sequence analysis and microarray data, including multiple sequence alignment
and consensus sequences. For this listing, the relevant feature is that it
enables you to create and edit phylogenetic trees. You can calculate pairwise
distances between aligned or unaligned nucleotide or amino acid sequences
using a broad range of similarity metrics, such as Jukes-Cantor, p-distance,
alignment-score, or a user-defined distance method. Phylogenetic trees are
constructed using hierarchical linkage with a variety of techniques, including
neighbor joining, single and complete linkage, and UPGMA. Bioinformatics
Toolbox includes tools for weighting and rerooting trees, calculating
subtrees, and calculating canonical forms of trees. Through the graphical user
interface, you can prune, reorder, and rename branches; explore
distances; and read or write Newick-formatted files. You can also use the
annotation tools in MATLAB to create presentation-quality trees.
It is available as a MATLAB package. It is available from
MathWorks, Inc. The price is not available on the website without a login,
but there are Commercial, Academic, and Student prices. It is believed that
commercial licenses are about $1,000, academic licenses about $90. More
information and sales contact can be obtained from the product
web site
at http://www.mathworks.com/products/bioinfo/
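To make two of the distance measures named above concrete, here is a minimal Python sketch of the p-distance and the Jukes-Cantor correction. It is not code from Bioinformatics Toolbox and does not reproduce its interface; the function names and the choice to skip gapped columns are assumptions made for illustration.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap ("-") in either sequence are skipped.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance for nucleotides: d = -(3/4) ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The correction matters more as sequences diverge: a p-distance of 0.25 corresponds to a Jukes-Cantor distance of about 0.304 expected substitutions per site.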
Andrew Rambaut
of the Department of Zoology,
University of Oxford, (andrew.rambaut (at) zoo.ox.ac.uk)
has written Bi-De version 0.1,
to simulate the evolution of trees using various models of lineage
birth and death, and sampling lineages from among those extant.
It can simulate branching with or without regulation of the number of
lineages. It also allows the user to specify the relationship between
the number of lineages and the birth rate of lineages.
The program is available free for MacOS system 7.0 or later,
from
its web page at
the University of Oxford Zoology software site at
http://evolve.zoo.ox.ac.uk/software.html?name=Bi-De. Its
manual can also be viewed on-line at that site. Rambaut says that Bi-De
is considered obsolete software, having been almost completely superseded
by their later program Phyl-O-Gen.
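For readers unfamiliar with birth-death simulation, the following Python sketch shows one common way to simulate the number of lineages through time, with a per-lineage birth rate that may depend on the current number of lineages. It is an illustration under stated assumptions, not Bi-De's algorithm: it tracks only lineage counts (not the tree itself), uses a constant death rate, and all names are hypothetical.

```python
import random

def simulate_lineage_counts(birth_rate, death_rate, t_max, n0=1, seed=None):
    """Gillespie-style simulation of the number of lineages through time.

    birth_rate: function n -> per-lineage birth rate when n lineages exist
                (this is how density dependence enters; an assumption here).
    death_rate: constant per-lineage death rate.
    Returns a list of (time, n_lineages) pairs, one per event.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    history = [(t, n)]
    while n > 0:
        b = birth_rate(n)
        total = n * (b + death_rate)      # total event rate
        if total <= 0.0:
            break                         # nothing can happen any more
        t += rng.expovariate(total)       # waiting time to the next event
        if t > t_max:
            break
        if rng.random() < b / (b + death_rate):
            n += 1                        # a lineage splits
        else:
            n -= 1                        # a lineage goes extinct
        history.append((t, n))
    return history
```

For example, `simulate_lineage_counts(lambda n: max(0.0, 1.0 - n / 10.0), 0.0, 1000.0)` gives logistic-like growth that cannot exceed 10 lineages, while `lambda n: 1.0` with a zero death rate gives an unregulated pure-birth process.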

Andrew Rambaut, of the
Department of Zoology, University of Oxford (andrew.rambaut (at) zoo.ox.ac.uk)
has written Phyl-O-Gen version 1.1, a tree simulation
program. It simulates phylogenies produced by a birth-death process.
It also has a mode that allows for multiple episodes of evolution and
multiple mass extinctions. It is available as generic source code for Unix
and as Mac OS, Mac OS X, and Windows executables. It is distributed from
its web page
at http://evolve.zoo.ox.ac.uk/software.html?id=phylogen.
Although Rambaut considers it to supersede his earlier program Bi-De,
my reading is that some features of the older program are not
yet available in the newer one.
Oliver Pybus and Andrew Rambaut of the Department of Zoology,
University of Oxford, (oliver.pybus (at) zoo.ox.ac.uk
and andrew.rambaut (at) zoo.ox.ac.uk)
have written Genie (GENealogical Interval Explorer)
version 3.0, a program for
the inference of demographic history from reconstructed phylogenies.
The methods it implements are described in the papers:
- Pybus, O., A. Rambaut
and P. Harvey. 2000. An integrated framework for the inference of viral
population history from reconstructed genealogies. Genetics
155: 1429-1437.
- Pybus, O. G., M. A. Charleston, S. Gupta, A. Rambaut, E. C. Holmes, and
P. H. Harvey. 2001. The epidemic behavior of the Hepatitis C Virus. Science 292: 2323-2325.
- Strimmer, K. and O. G. Pybus. 2001. Exploring the demographic history of DNA sequences using the generalised skyline plot. Molecular Biology and Evolution 18: 2298-2305.
It is considered by its authors to supersede the methods used in
Rambaut's program End-Epi, which has been withdrawn
from distribution. Genie is available from
its web site
at http://evolve.zoo.ox.ac.uk/software.html?id=genie
as Mac OS, Linux or Windows executables, and as C source code for
Unix.
Paul-Michael Agapow
(p.agapow (at) ic.ac.uk)
and Nick Isaac, then of the Department of Biology,
Imperial College, Silwood Park, U.K.
have released MacroCAIC version 1.0.1, which was developed from
CAIC, by Andy Purvis and Andrew Rambaut. MacroCAIC
uses phylogenies and data sets of character values to examine correlates of
species richness, i.e. which traits are associated with abnormal speciation.
The phylogeny is used to make these correlations independent. Traits may be
continuous or discrete, and not every trait value needs to be known for every
clade in the phylogeny. MacroCAIC has been described in a paper: Agapow, P.-M.
and N. J. B. Isaac. 2002. MacroCAIC: correlates of species richness.
Diversity & Distributions 8: 41-43. MacroCAIC is a PowerMac
and 68k Mac binary executable for Mac OS. It is available from
its web site
at http://www.bio.ic.ac.uk/evolve/software/macrocaic/index.html.
Andrew Rambaut of the Department of Zoology,
University of Oxford, (andrew.rambaut (at) zoo.ox.ac.uk) and
Nick Grassly of the Department of Infectious Disease Epidemiology of
Imperial College School of Medicine, St. Mary's Campus, London
(n.grassly (at) ic.ac.uk)
have written Seq-Gen (Sequence Generator), version 1.3.2,
a program that will simulate the evolution of nucleotide sequences
and protein sequences
along a phylogeny or multiple phylogenies, using common models of the
nucleotide or protein substitution process. A range
of models of molecular evolution are implemented.
Nucleotide frequencies and other parameters of the model may be given
and site-specific rate heterogeneity may also be incorporated in a number
of ways. The models available are the Hasegawa, Kishino and Yano (HKY) model,
the Felsenstein F84 model, the general reversible model, the Kimura 2-parameter model, the Jukes-Cantor model and the Dayhoff PAM, JTT, Blosum62, mtREV,
and WAG amino acid substitution models. Rate heterogeneity among sites or
among the different positions within a codon can be specified.
Seq-Gen is described in a paper: Rambaut, A. and N. C. Grassly. 1997.
Seq-Gen: an application for the Monte Carlo simulation of DNA sequence
evolution along phylogenetic trees. Computer Applications in the
Biosciences (CABIOS) 13: 235-238.
A Mac OS executable is available, as well as source code
files for Unix systems. A Windows executable is available for an earlier
version, 1.3.1. These are available from
its Web page at
http://evolve.zoo.ox.ac.uk/software.html?name=Seq-Gen.
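To show what simulating sequences along a phylogeny involves, here is a minimal Python sketch under the Jukes-Cantor model, the simplest of the models listed above. It is not Seq-Gen's code or interface. The tree representation (a tip is a name, an internal node is a list of (child, branch_length) pairs), the function names, and the example tree are all assumptions made for illustration.

```python
import math
import random

BASES = "ACGT"

def evolve_jc(seq, branch_length, rng):
    """Evolve a sequence along one branch under the Jukes-Cantor model.

    branch_length is in expected substitutions per site. With probability
    1 - exp(-4t/3) a site is redrawn uniformly from the four bases (it may
    redraw the same base); otherwise it is left unchanged.
    """
    p_redraw = 1.0 - math.exp(-4.0 * branch_length / 3.0)
    return "".join(rng.choice(BASES) if rng.random() < p_redraw else base
                   for base in seq)

def simulate_on_tree(node, seq, rng, out):
    """Recursively evolve seq down a tree, storing tip sequences in out."""
    if isinstance(node, str):          # a tip: record its sequence
        out[node] = seq
        return
    for child, length in node:         # an internal node: evolve to each child
        simulate_on_tree(child, evolve_jc(seq, length, rng), rng, out)

rng = random.Random(1)
root = "".join(rng.choice(BASES) for _ in range(60))    # random root sequence
tree = [("A", 0.1), ([("B", 0.05), ("C", 0.05)], 0.1)]  # hypothetical tree
tips = {}
simulate_on_tree(tree, root, rng, tips)
```

After this runs, `tips` maps each tip name to a simulated sequence. Richer models such as HKY or F84 differ in the transition probabilities along each branch, not in this overall recursion.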
Tom Wilcox
has written SG Runner
version 2.0.1, a graphical user interface for running Seq-Gen. It is available as Powermac Mac OS X executables.
It can be downloaded from his software web site
at http://homepage.mac.com/tpwilcox/FileSharing15.html
Nick Grassly
of the Department of Infectious Disease Epidemiology of
Imperial College School of Medicine, St. Mary's Campus, London
(n.grassly (at) ic.ac.uk)
and Andrew Rambaut of the Department of Zoology,
University of Oxford, (andrew.rambaut (at) zoo.ox.ac.uk)
have written PSeq-Gen (Protein-Sequence Generator), version
1.0,
which will simulate the evolution of protein sequences along evolutionary trees.
Three common models of amino acid substitution are implemented (PAM, JTT,
and mtREV), allowing for user-defined amino acid frequencies. Site-specific rate
heterogeneity following a gamma distribution is allowed. The program can
handle multiple trees and produce multiple data sets.
PSeq-Gen is available from
its Web site
at http://evolve.zoo.ox.ac.uk/software.html?name=PSeq-Gen as
Unix source code and also as Mac OS executables. An online manual can also
be viewed at that site. PSeq-Gen is now largely superseded by Seq-Gen,
and although its site is still available it is not being actively supported
at the Oxford site.
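Site-specific rate heterogeneity following a gamma distribution is usually modeled by giving each site a relative rate drawn from a gamma distribution with mean 1, and scaling that site's branch lengths by it. The Python sketch below illustrates only the rate-drawing step. It is not PSeq-Gen's code; the function name and parameters are hypothetical.

```python
import random

def gamma_site_rates(n_sites, alpha, seed=None):
    """Draw one relative rate per site from a gamma distribution with mean 1.

    alpha is the shape parameter; the scale is set to 1/alpha so that the
    mean rate is 1. Small alpha gives strong heterogeneity (most sites slow,
    a few fast); large alpha gives nearly equal rates.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]
```

A site with rate r then evolves as though every branch of the tree were r times as long.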
Nick Grassly (most recently of the Department of
Biological Sciences at Imperial College, Silwood Park, U.K.)
and Andrew Rambaut,
of the Department of Zoology, University of Oxford (andrew.rambaut (at)
zoo.ox.ac.uk) have released
Treevolve, version 1.32 and also Ptreevolve,
programs that simulate the evolution of DNA and protein sequences
respectively. The molecular sequences are simulated under coalescent models
with constant population size, or with exponential population size growth.
In addition different levels of recombination can be specified.
In Treevolve, it is also possible to have an island model of population
subdivision.
Treevolve and Ptreevolve are written in ANSI C and should compile on most Unix
systems. They are also available as Mac OS executables, and
a project file for the MetroWerks Codewarrior compiler is included
in the Macintosh archive. They can be obtained, and the manual of the
programs viewed, from
their web site
at http://evolve.zoo.ox.ac.uk/software.html?name=Treevolve.
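As background to coalescent simulation with constant population size, the Python sketch below draws the waiting times between successive coalescent events. It is not Treevolve's code and omits recombination, growth, and population subdivision. Time is measured in standard coalescent units, and the function name is hypothetical.

```python
import random

def coalescent_times(n_samples, seed=None):
    """Waiting times between coalescences in a constant-size population.

    While k lineages remain, the time to the next coalescence is
    exponentially distributed with rate k(k-1)/2 (in coalescent units).
    Returns the waiting times for k = n, n-1, ..., 2.
    """
    if n_samples < 2:
        raise ValueError("need at least two sampled lineages")
    rng = random.Random(seed)
    return [rng.expovariate(k * (k - 1) / 2.0)
            for k in range(n_samples, 1, -1)]
```

The sum of the returned times is the time to the most recent common ancestor, whose expectation is 2(1 - 1/n). A full simulator would also choose which pair of lineages joins at each event, and would then evolve sequences along the resulting tree.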
Jens Stoye, Dirk Evers and Folker Meyer of the
Research Center for Interdisciplinary Studies on Structure Formation (FSPM) and
the Technische Fakultät of the University of Bielefeld, Germany
(j.stoye (at) dkfz-heidelberg.de, dirk (at) TechFak.Uni-Bielefeld.de, and folker (at) TechFak.Uni-Bielefeld.de) have released
ROSE, the Random model Of Sequence Evolution, version 1.3.
It simulates the evolution of DNA, RNA, or protein sequences on a randomly
generated tree, allowing insertions and deletions and substitution at
different rates at different sites as
well. It can also use a predefined tree that is input in standard format.
It can report ancestral sequences or sequences at the tips of the tree, and
it also keeps a record of the true multiple sequence alignment for comparison
with the results of multiple sequence alignment programs.
ROSE is described in the paper:
Stoye, J., D. Evers and F. Meyer. 1998. Rose: generating sequence families.
Bioinformatics 14: 157-163.
ROSE is available in source code at
its web site
at http://bibiserv.techfak.uni-bielefeld.de/rose/.
Version 1.0 is also available as binary executables for SunOS and for SGI Unix
by anonymous ftp
from ftp.Uni-Bielefeld.de in directory
pub/projects/techfak/pi/rose/.
ROSE is also available as
a server.
Lars Jermiin
of the School of Biological Sciences
of the University of Sydney, Australia (lars.jermiin (at) usyd.edu.au)
has written Hetero, version 1.0, a program to
simulate evolution of DNA sequences on four-species trees.
The program allows many different kinds of heterogeneity of processes and
rates, including different models of change on different branches. It gives
a number of different kinds of summaries of the properties of the resulting
sequences, as well as writing them to files for use by other programs.
It is described in a paper: Jermiin, L. S., S. Y. W. Ho, F. Ababneh, J. Robinson, and A. W. D. Larkum. 2003. Hetero: a program to simulate the evolution of DNA
on a four-taxon tree. Applied Bioinformatics 2: 159-163.
It is distributed as executables for Sun Solaris, for Windows, and for
Mac OS X from its web site
at http://www.bio.usyd.edu.au/~jermiin/hetero.htm. Source code
is also offered to those users who obtain a license for use of source
code from the book Numerical Recipes.
Andy Pang, Andrew Smith, Paulo Nuin and Elisabeth Tillier
of the Cancer Genomics and Proteomics Division of the Ontario Cancer Institute and the Department of Medical Biophysics
of the University of Toronto, Canada
(e.tillier (at) utoronto.ca)
have released Simprot
(SIMulation of PROTeins),
version 1.01, a program to simulate protein evolution by substitution, insertion and deletion. It allows for several models of amino acid substitution (PAM, JTT and PMB), allows for gamma-distributed site rates according to Yang's model, and implements a parameterised Qian and Goldstein distribution model for insertion and deletion.
It is described in the paper:
Pang, A., A. Smith, P. Nuin, and E. R. M. Tillier. 2005. SIMPROT: Using an empirically determined indel distribution in simulations of protein evolution.
BMC Bioinformatics 6: 236.
It is available as C source code and Windows executables. It can be downloaded from
its web site
at http://www.uhnresearch.ca/labs/tillier/software.htm
Cory Strope, S. D. Scott and Etsuko Moriyama
of the Department of Computer Science and School of Biological Sciences
of the University of Nebraska
(cstrope (at) cse.unl.edu)
have released indel-Seq-Gen
version 1.0, a sequence family simulator incorporating domains, motifs, and
indels. It simulates a more realistic evolutionary
process of protein sequences including insertions and deletions (indels).
iSG allows the user to simulate multiple subsequences according to
different evolutionary parameters, which is necessary for generating realistic
protein families with multiple domains. It tracks all evolutionary events
including indels and
outputs the "true" multiple alignment of the simulated sequences. iSG can also
generate a larger sequence space by allowing the use of multiple related root
sequences. It is intended to be used to test the accuracy of multiple alignment
methods, phylogenetic methods, evolutionary hypotheses, ancestral protein
reconstruction methods, and protein family classification methods.
It is described in the paper:
Strope, C. L., S. D. Scott, and E. N. Moriyama. 2007. indel-Seq-Gen: A new
protein family simulator incorporating domains, motifs, and indels.
Molecular Biology and Evolution 24: 640-649.
It is available as C source code, Perl script, Linux executables and
Intel Mac OS X executables. It can be downloaded from
its web site
at http://bioinfolab.unl.edu/~cstrope/iSG
Dmitry Filatov, of the Department of Plant Sciences of
the University of Oxford, U.K.
(dmitry.filatov (at) plants.ox.ac.uk) has
released ProSeq (PROcessor of SEQuences) version 2.9.
ProSeq is a sequence-editing environment that can do sequence alignment
editing, translation, detection of polymorphic sites, and
a variety of tests, many of a population-genetic nature, for neutrality
and recombination. The part of its capabilities that are relevant to this
listing is that it can simulate the evolution of a set of DNA sequences
along a coalescent tree, with or without recombination. It is described
in a paper: Filatov, D.A. 2002. ProSeq: A software for preparation and
evolutionary analysis of DNA sequence data sets.
Molecular Ecology Notes 2: 621-624.
ProSeq is a Windows program available from
its web site at
http://dps.plants.ox.ac.uk/sequencing/proseq.htm.
Paul-Michael Agapow
(p.agapow (at) ic.ac.uk), of the
Department of Biology, Imperial College, Silwood Park, U.K. has released
MESA (MacroEvolutionary analysis and SimulAtion),
version 1.9.23, a program to simulate evolution of a
group, allowing for a variety of kinds of extinction mechanisms.
It can simulate evolution and describe the diversity of the resulting
groups. MESA is available as an executable for Windows and as an executable
for Mac OS X from
its web site
at http://www.agapow.net/software/mesa
Reed Cartwright
of the Bioinformatics Research Center
at North Carolina State University, Raleigh, North Carolina
(racartwr (at) ncsu.edu)
has written DAWG (DNA Assembly With Gaps),
version 1.1, a program to simulate evolution of DNA sequences with
recombination and gaps. It is designed to simulate the evolution of
recombinant DNA sequences in continuous time based on the general time
reversible model with gamma and invariant rate heterogeneity and a novel
length-dependent model of gap formation. It accepts phylogenies in Newick
format and can return the sequence of any node, allowing for the exact
evolutionary history to be recorded at the discretion of users. Dawg records
the gap history of every lineage to produce the true alignment in the output.
Many options are available to allow users to customize their simulations and
results. It is described in the paper:
Cartwright, R. 2005. DNA Assembly with gaps (DAWG): simulating sequence evolution. Bioinformatics 21 (Supplement 3): iii31-iii38.
It is available as C source code. It can be downloaded from
its web site
at http://scit.us/projects/dawg/wiki
Barry Hall
of the Bellingham Research Institute in Bellingham, Washington
(barryhall (at) zeninternet.com)
has written EvolveAGene3, a program that simulates evolution
of a protein sequence along a tree. It generates a bifurcating tree, and
assigns branch lengths from a distribution whose mean is specified by the
user. A protein sequence is evolved along this tree, with deletions and
insertions of codons and with base substitutions. Substitutions that change
the amino acid are accepted with a specified probability.
This includes having variable regions of selection
intensity, positive or purifying, within the sequence and variation
in intensity of selection over branches.
Output includes the true tree and unaligned coding sequences and protein
sequences as well as the true DNA and the true protein alignments.
It is available as Perl script, Windows executables and Mac OS X universal
executables. See
the web site http://homepage.mac.com/barryghall/Software.html
Robert Beiko and Robert Charlebois
of the Faculty of Computer Science at Dalhousie University
(beiko (at) cs.dal.ca)
have released EvolSimulator
version 2.10, a program to simulate the evolution of genes and genomes.
EvolSimulator was created to allow the simulation of complex evolutionary
regimes, potentially comprising nonstationary and nonuniform processes at the
sequence level, alongside genome-level processes such as gene duplication,
gene loss, and lateral gene transfer. Several models of LGT are implemented,
including random transfers, transfers that favour close relatives in the
organismal tree, and preferential transfer among members of the same habitat.
It is described in the paper:
Beiko, R. G. and R. L. Charlebois. 2007. A simulation test bed for hypotheses
of genome evolution. Bioinformatics 23 (7): 825-31.
It is available as C++ source code. It can be downloaded from
its web site
at http://bioinformatics.org.au/evolsim
Martin Senger
(senger (at) ebi.ac.uk) of the European Bioinformatics Institute, Hinxton, U.K. and Peter
Ernst (P.Ernst (at) dkfz-heidelberg.de) of
the Deutsches Krebsforschungszentrum, Heidelberg, Germany distribute
W2H, version 4.1.1, a web interface for running molecular
sequence analysis programs. It can invoke a variety of sequence analysis
programs, including the EMBOSS Embassy versions of a number of
PHYLIP 3.5 programs. W2H is
a set of web pages that has been developed primarily with Unix platforms in
mind. They are available from
the W2H web page at http://www.w2h.dkfz-heidelberg.de/.
Juan José de Haro
(jjdeharo (at) terra.es) of "una pequeña
ciudad" of the province of Barcelona, Spain, has released
Phyledit version 2.0. This is an interactive data
editing and analysis program
that uses PHYLIP and
Treeview for analysis and
display of trees. It uses eight programs from PHYLIP, all concerned with
discrete 0/1 characters. Phyledit's menus, responses and documentation are
all in Spanish. It is a Windows executable available for download from
its web site
at http://www.terra.es/personal7/jjdeharo/phyledit/. It
is available there either by itself or with the PHYLIP and TREEVIEW
executables as well.
Øyvind Hammer, of the
Paleontological Museum of the University of Oslo, Norway
(ohammer (at) nhm.uio.no) has written PAST
(PAleontological STatistics), version 1.31, a package which carries out many
kinds of paleontological
data analyses, including stratigraphic and morphometric statistics. It
also does parsimony analysis, including exhaustive, branch-and-bound and
heuristic algorithms for Wagner, Fitch and Dollo parsimony. It does bootstrap
methods, strict and majority rule consensus trees, and consistency and
retention indices. It calculates three stratigraphic congruency indices with
permutation tests. PAST is described in a paper: Hammer, Ø.,
D.A.T. Harper, and P. D. Ryan. 2001. PAST: Paleontological statistics
software package for
education and data analysis. Palaeontologia Electronica 4, issue 1:
http://palaeo-electronica.org/2001_1/past/issue1_01.htm.
PAST is available from
its web site at http://folk.uio.no/ohammer/past/index.html
as a Windows executable. Manuals can be read online or downloaded from the
web site.
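As an illustration of the parsimony quantities mentioned above, here is a minimal Python sketch (not PAST's own code) that counts Fitch parsimony steps for a single character on a fixed binary tree and derives the consistency index (minimum steps / observed steps) and retention index from that count. The nested-tuple tree encoding and the function names are assumptions made for this example.

```python
from collections import Counter

def fitch_steps(tree, states):
    """Return (possible state set at this node, number of changes below it)."""
    if isinstance(tree, str):                 # leaf: its observed state
        return {states[tree]}, 0
    left, right = tree                        # a strictly binary tree is assumed
    lset, lsteps = fitch_steps(left, states)
    rset, rsteps = fitch_steps(right, states)
    common = lset & rset
    if common:                                # children agree: no extra change
        return common, lsteps + rsteps
    return lset | rset, lsteps + rsteps + 1   # children disagree: one change

def ci_ri(tree, states):
    """Return (steps, consistency index, retention index) for one character."""
    counts = Counter(states.values())
    n_taxa = sum(counts.values())
    m = len(counts) - 1                       # minimum conceivable steps
    g = n_taxa - max(counts.values())         # maximum steps (unresolved tree)
    _, s = fitch_steps(tree, states)          # observed steps on this tree
    ci = m / s if s else None                 # undefined for constant characters
    ri = (g - s) / (g - m) if g != m else None  # undefined if uninformative
    return s, ci, ri

tree = (("A", "B"), ("C", "D"))
print(ci_ri(tree, {"A": 0, "B": 0, "C": 1, "D": 1}))
print(ci_ri(tree, {"A": 0, "B": 1, "C": 0, "D": 1}))
```

The first character fits the tree with a single change, so both indices are 1; the second needs two changes on the same tree, which lowers both indices.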
Matthew Wills
of the Department of Biology
at the University of Bath, U.K.
(bssmaw (at) bath.ac.uk)
has produced GHOSTS
version 2.4, a program for significance tests for RCI, SCI and GER values by randomization.
It calculates the Relative Completeness Index (RCI), the Stratigraphic
Consistency Index (SCI), and the Gap Excess Ratio (GER) measures of
consistency of phylogenies with the stratigraphic record if given one or more
tree topologies and stratigraphic range data for up to 74 terminal taxa. It
can be used with NEXUS tree files and can be used interactively with MacClade. It can also randomly permute
the assignment of stratigraphic ranges among taxa, while holding tree
topologies constant, to yield a distribution of values. It tests whether the
RCI and SCI values for the original data differ significantly from the random
distributions. Its methods are described in the paper:
Wills, M. A. 1999. Congruence between phylogeny and stratigraphy: randomization tests. Systematic Biology 48: 559-580.
It is available as source code for use with Chipmunk Basic. It can be
downloaded from
its web site
at http://palaeo.gly.bris.ac.uk/cladestrat/Gho2.html
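The randomization idea described above can be sketched in Python as follows. This is not the GHOSTS code (which is written for Chipmunk Basic) but a simplified illustration that handles only the Gap Excess Ratio and uses only first appearance dates. It assumes the usual definition GER = 1 - (MIG - Gmin) / (Gmax - Gmin), where MIG is the summed ghost range implied by the tree, Gmin is the span between the oldest and youngest first appearances, and Gmax is the sum of the gaps between the oldest first appearance and every other one. The nested-tuple tree encoding, the function names, and the one-sided p-value are assumptions of this example.

```python
import random

def node_age_and_mig(tree, fad):
    """Return (oldest first appearance below this node, summed ghost ranges)."""
    if isinstance(tree, str):                 # leaf: its own first appearance
        return fad[tree], 0.0
    results = [node_age_and_mig(child, fad) for child in tree]
    age = max(a for a, _ in results)          # dates in Myr ago: larger = older
    # each child lineage needs a ghost range back to the age of this node
    mig = sum(m + (age - a) for a, m in results)
    return age, mig

def ger(tree, fad):
    """Gap Excess Ratio for one tree and one set of first appearance dates."""
    ages = list(fad.values())
    g_min = max(ages) - min(ages)
    g_max = sum(max(ages) - a for a in ages)
    if g_max == g_min:
        raise ValueError("GER is undefined for these dates")
    _, mig = node_age_and_mig(tree, fad)
    return 1.0 - (mig - g_min) / (g_max - g_min)

def permutation_p_value(tree, fad, n_perm=999, seed=0):
    """Shuffle dates among taxa, keeping the tree fixed, and compare GER values."""
    rng = random.Random(seed)
    observed = ger(tree, fad)
    taxa = list(fad)
    dates = [fad[t] for t in taxa]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(dates)
        if ger(tree, dict(zip(taxa, dates))) >= observed:
            hits += 1
    # proportion of randomized values at least as large as the observed one
    return observed, (hits + 1) / (n_perm + 1)

fad = {"A": 100.0, "B": 90.0, "C": 80.0, "D": 70.0}
print(permutation_p_value((("A", "B"), ("C", "D")), fad))
```

With only four taxa there are few distinct permutations, so the p-value here is purely illustrative; real data sets would have many more taxa.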
Andrew Rambaut
(andrew.rambaut (at) zoo.ox.ac.uk)
of the Department of Zoology, University of Oxford,
has released QDate version 1.1.1. QDate
estimates the date of divergence between two pairs of sequences, given
that the date of divergence of the members of each pair is known.
It analyzes the data under three models: (1) a perfectly clocklike model,
(2) a model in which one pair has a different rate of divergence than the
other, and (3) a model in which all branches have different rates.
The method is described in the paper: Rambaut, A., and L. Bromham. 1998.
Estimating divergence dates from molecular sequences. Molecular
Biology and Evolution 15: 442-448. QDate is available from
its web site at
http://evolve.zoo.ox.ac.uk/software.html?id=qdate. It is available
as C source code for Unix or as a Macintosh or Windows executable.
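To make the logic of model (1) concrete, here is a rough Python sketch. It is not QDate's method, which is likelihood-based; it is a simple distance-based shortcut that assumes a perfect molecular clock, so that a pairwise distance equals twice the rate times the divergence date. The function name and argument layout are hypothetical.

```python
def clock_date_between_pairs(d_within, t_within, d_between):
    """
    Estimate the rate and the date of the split between two pairs of sequences.

    d_within  : distances within each pair, (d_AB, d_CD), in substitutions/site
    t_within  : known divergence dates of each pair, (t_AB, t_CD)
    d_between : the between-pair distances (A-C, A-D, B-C, B-D)
    """
    # Under a strict clock d = 2 * rate * t, so pool both calibrated pairs.
    rate = sum(d_within) / (2.0 * sum(t_within))
    mean_between = sum(d_between) / len(d_between)
    # The same relation, solved for the unknown date.
    return rate, mean_between / (2.0 * rate)

rate, date = clock_date_between_pairs(
    d_within=(0.02, 0.04),
    t_within=(10.0, 20.0),
    d_between=(0.10, 0.10, 0.10, 0.10),
)
print(rate, date)
```

Models (2) and (3) relax the single-rate assumption and cannot be reduced to this kind of one-line calculation.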
Andrew Rambaut,
of the Department of Zoology, University of Oxford
(andrew.rambaut (at) zoo.ox.ac.uk)
has written TipDate version 1.2.
TipDate is an application for estimating the rate of molecular evolution
(and hence a time-scale) for a
phylogeny consisting of dated tips. These will most frequently be from viruses or other
fast-evolving pathogens that have been isolated over a range of dates. The program can also return
the likelihood for the simple molecular clock model (i.e., assuming that all sequences are
contemporary), for a model in which rates of change at different times are
drawn from a distribution, or the non-clock model. These are useful for
likelihood ratio tests of the fit of the model to the data.
TipDate is described in a paper: Rambaut, A. 2000. Estimating the rate of
molecular evolution: incorporating non-contemporaneous sequences into maximum
likelihood phylogenies. Bioinformatics 16: 395-399.
TipDate is available as Mac OS executables and as source code for
Linux or Unix from
its web site
at http://evolve.zoo.ox.ac.uk/software.html?name=TipDate.
It is also available in a web-based server version from the
Pasteur Institute server.
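A simple way to see why dated tips carry information about the rate is root-to-tip regression. The Python sketch below is that simpler technique, not TipDate's maximum likelihood method. It assumes that a rooted tree is already available and that the distance from the root to each tip has been measured in substitutions per site; the function name is hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """
    Least-squares fit of root-to-tip distance against sampling date.

    Returns (rate, root_date): the slope estimates the substitution rate
    per site per unit time, and the date at which the fitted line crosses
    zero distance estimates the date of the root.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = sxy / sxx
    intercept = mean_d - rate * mean_t
    return rate, -intercept / rate

# Made-up isolation dates and root-to-tip distances, for illustration only.
dates = [1990.0, 1995.0, 2000.0, 2005.0]
distances = [0.010, 0.015, 0.020, 0.025]
print(root_to_tip_regression(dates, distances))
```

The regression treats the tips as independent points, which they are not, so it is best regarded as a quick check rather than a substitute for the likelihood analysis.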
Emmanuel Paradis
(paradis (at) isem.univ-montp2.fr), of the
Institut des Sciences de
l'Évolution of the Université Montpellier II and the CNRS,
Montpellier, France has released Diversi version
0.20.0, a program for
the analysis of diversification using phylogenetic data. It uses
several methods to estimate and test for variations in
diversification rates using phylogenetic data, including
tests for temporal or among-clade variations in diversification rates
using a maximum likelihood method. The program takes divergence times as
its input. It can also simulate the branching of trees.
The tests are described in a paper:
Paradis, E. 1997. Assessing temporal variations in diversification rates from
phylogenies: estimation and hypothesis testing. Proceedings of the Royal
Society of London B 264: 1141-1147.
It is available as FORTRAN source code and also as a Windows executable, from
Paradis's software web page
at http://www.isem.univ-montp2.fr/PPP/PPPphylogenie/ParadisHome.php#softwares or
by ftp from evol.isem.univ-montp2.fr in
directory /pub/pc/Log-manu. Paradis describes DIVERSI as
no longer developed or maintained, and soon to be superseded by his later
program APE.
Marc Robinson-Rechavi, of the Laboratoire de Biologie
Moléculaire et Cellulaire of the École Normale Supérieure de Lyon, France
(marc.robinson (at) ens-lyon.fr) has written RRTree,
(Relative Rate tests within a Tree), version 1.1. It carries out relative rate tests
for equality of evolutionary rates in DNA or protein sequences between lineages,
taking into account the structure of the tree, which can be input in a number of
common formats. The sequences themselves are also read in.
The methods are described in two papers:
- Robinson, M., M. Gouy, C. Gautier, and D. Mouchiroud. 1998. Sensitivity of
the relative-rate test to taxonomic sampling. Molecular Biology and
Evolution 15: 1091-1098.
- Robinson-Rechavi, M. and D. Huchon. 2000. RRTree: Relative-rate tests
between groups of sequences on a phylogenetic tree. Bioinformatics
16: 296-297.
RRTree is available as C source code and as executables for
Windows, Mac OS, and for SGI or Sun Solaris Unix systems from
its web page at
http://pbil.univ-lyon1.fr/software/rrtree.html or
by
anonymous ftp from ftp://pbil.univ-lyon1.fr in