Jun Adachi and Masami Hasegawa have written a package MOLPHY, version 2.3b3, carrying out maximum likelihood inference of phylogenies for either nucleotide sequences or protein sequences. Their protein sequence maximum likelihood program, ProtML, is a successor to the one they made available to me and which I formerly distributed on a nonsupported basis in PHYLIP. The package is distributed free in C source code, with documentation. MOLPHY is available from its web site at http://www.ism.ac.jp/ismlib/softother.e.html. A monograph describing MOLPHY (number 28 in the Computer Science Monographs of the Institute of Statistical Mathematics) is available from the same source (see folder csm96 on the distribution web page), as TeX source and as a .dvi file. The monograph can also be ordered from the Institute. An executable version of MOLPHY 2.2 for Windows95 or Windows NT on Intel processors, and also one that works on Windows NT on DEC Alpha processors, is available from Russell Malmberg at the Botany Department of the University of Georgia (russell  (at) plantbio.uga.edu) at his software web site at http://www.plantbio.uga.edu/~russell/software.html


Gary Olsen, of the Department of Microbiology, University of Illinois, Urbana, Illinois (gary  (at) phylo.life.uiuc.edu) has developed a speeded-up replacement for my program DNAML coded in C, called fastDNAml version 1.2.2. It achieves a number of economies and also is organized so that it can be run on parallel processors -- he and his co-workers have constructed trees of very large size on a high-speed parallel processor. The program can be compiled using the "p4" portable parallel processing toolkit. It can also be run in ordinary serial mode on workstations where it is faster than DNAML. fastDNAml is described in a paper: Olsen, G. J., H. Matsuda, R. Hagstrom, and R. Overbeek. 1994. fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Computer Applications in the Biosciences (CABIOS) 10: 41-48. It is available in the following places:


Denis Beaumont (beaumont  (at) transpac.atlas.fr) has made a parallelized version of fastDNAml called VeryfastDNAml. It is parallelized with the TreadMarks distributed shared memory system, which is a not-quite-free environment for parallelization that runs on many workstation-class machines. The C source code of VeryfastDNAml is available by ftp from the Institut Pasteur server ftp.pasteur.fr in directory /pub/GenSoft/unix/evolution/FastDNAml as file fastDNAml-tmk.tar.gz. There is a web page access to this ftp distribution at http://bioweb.pasteur.fr/seqanal/soft-pasteur.html#veryfastdnaml, which includes a link to the TreadMarks project.


Bette Korber of the Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico (btk  (at) t10.lanl.gov) and her colleagues have released a version of fastDNAml which uses the REV (general reversible) model of DNA evolution. They used it for the results in the paper: B. Korber, M. Muldoon, J. Theiler, F. Gao, R. Gupta, A. Lapedes, B. H. Hahn, S. Wolinsky and T. Bhattacharya. 2000. Timing the ancestor of the HIV-1 pandemic strains. Science 288: 1789-1796. The program is available both in a version using the MPI Message-Passing Interface for parallel computers and in a non-parallel version. It is available as C source code for Unix from the web site for the programs from that paper at http://www.santafe.edu/~btk/science-paper/bette.html.


Alexandros Stamatakis (Alexandros.Stamatakis  (at) epfl.ch) of the Laboratory for Computational Biology and Bioinformatics (LCBB) of the École Polytechnique Fédérale de Lausanne, Switzerland and colleagues have released several programs for faster reconstruction of phylogenies by maximum likelihood. These provide faster heuristic search, use of parallel processing, and a simulated annealing algorithm, in various combinations. They include:

There are a number of papers describing these programs:

The programs are available as C source code, Windows executables, and Mac OS X executables at Stamatakis's software web page at http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm.


Thomas Keane (thomas.m.keane (at) nuim.ie) and Thomas Naughton (tom.naughton (at) nuim.ie), both of the Department of Computer Science of the National University of Ireland, Maynooth, have released DPRML, a distributed cross-platform tree-building program that can use the idle clock cycles of machines, allowing idle time on hundreds of machines to be harnessed for tree-building. It uses the PAL Java framework. It is described in a paper: Keane, T.M., T. J. Naughton, S. A. A. Travers, J. O. McInerney, and G. P. McCormack. 2005. DPRml: Distributed Phylogeny Reconstruction by Maximum Likelihood. Bioinformatics 21: 969-974. DPRML can be downloaded from its web page at http://www.cs.nuim.ie/distributed/dprml.php. Its authors note that it is slower than their more recent distributed phylogeny platform MULTIPHYL, and they urge use of that instead of DPRML.


T. M. Keane, T.J. Naughton, S.A.A. Travers, J.O. McInerney, and G.P. McCormack, of the Department of Computer Science at the National University of Ireland, Maynooth, Ireland (tkeane  (at) cs.nuim.ie) have produced MultiPhyl, version 1.06, a distributed phylogeny platform enabling maximum likelihood runs across a large number of heterogeneous machines. MultiPhyl is a high-throughput implementation of a distributed phylogenetics platform that is capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. It allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching, and bootstrapping of each of the alignments. The program implements a set of 80 amino acid models and 56 nucleotide ML models and a variety of statistical methods for choosing between alternative models. It is described in the paper: Keane, T.M., T.J. Naughton, S.A.A. Travers, J.O. McInerney, and G.P. McCormack. 2005. DPRml: Distributed Phylogeny Reconstruction by Maximum Likelihood. Bioinformatics 21(7): 969-974. It is available as Java code. It can be downloaded from http://www.cs.may.ie/distributed/multiphyl.php. It runs on the distributed Java-based platform produced by this group, which allows jobs to be run across multiple machines. That platform is available from them at its web site at http://www.cs.may.ie/distributed/downloads.php. MultiPhyl can also be tested by using their web server version.


Ziheng Yang of the Department of Genetics and Biometry, University College London, (z.yang  (at) ucl.ac.uk) has released PAML, version 3.14, a package of programs for the maximum likelihood analysis of nucleotide or protein sequences, including codon-based methods that take into account both amino acids and nucleotides. The programs can estimate branch lengths in a phylogenetic tree and parameters in the evolutionary model such as the transition/transversion rate ratio, the gamma parameter for variable substitution rates among sites, rate parameters for different genes, and synonymous and nonsynonymous substitution rates. They can also test evolutionary models, calculate substitution rates at particular sites, reconstruct ancestral nucleotide or amino acid sequences, simulate DNA and protein sequence evolution, compute distances based on the synonymous and nonsynonymous changes, and of course do phylogenetic tree reconstruction by maximum likelihood and Bayesian Markov Chain Monte Carlo methods. The strength of the package lies in its rich implementation of evolutionary models, though Yang comments that tree-making is not a strong point of the current version. Another notable point is the availability of codon models, which Yang pioneered. The package is available as Windows executables and as C source code for Unix and Mac OS X systems. An Old Versions folder in the ftp site that distributes these also contains Mac OS executables for the earlier versions 3.0a and 3.0c. See the PAML web page at http://abacus.gene.ucl.ac.uk/software/paml.html where it is available. The Bioinformatics and Expression Analysis Core Facility at the Karolinska Institute in Stockholm has also made available a Red Hat Linux RPM package of PAML 3.14 at their Biorpms web pages at http://apt.bea.ki.se/packages.html.


Amy Egan and Joana Silva of The Institute for Genomic Research (TIGR) in Rockville, Maryland (aegan (at) jcvi.org) have produced IDEA (Interactive Display for Evolutionary Analysis), version 2.4, a graphical interface for PAML. IDEA allows you to run either of the PAML programs codeml or baseml on a single dataset or on multiple datasets simultaneously. These programs allow you to obtain maximum likelihood estimates of numbers of substitutions per branch and per site and to compare multiple models of molecular evolution given the data and a phylogenetic tree for the sequences. You can optionally generate phylogenies with PHYLIP, using maximum parsimony (on small datasets) or neighbor-joining. IDEA can perform multiple runs of codeml with different starting (dN/dS) values and merge their results for increased accuracy. It can also analyze multiple datasets in parallel, save parameter values for future use, and monitor progress step by step. For codeml analyses of sites-based evolutionary models, IDEA features an interactive tabular summary of results, visualizations of selective pressure along genes, interactive histograms and depictions of phylogenetic trees with branch lengths proportional to the estimated number of nucleotide substitutions. It is available as a combination of Perl script, Java executables and Linux or Solaris executables. It can be run on systems that have Perl, Java, PAML 3.14 or 3.15, and PHYLIP. If parallel execution is desired you need to have SGE or Condor; otherwise it will just run on the machine on which it is launched. It can be downloaded from its web site at http://ideanalyses.sourceforge.net/main.html


Federico Hoffmann and Juan Opazo of the School of Biological Sciences of the University of Nebraska, Lincoln, Nebraska (federico (at) biokubuntu.com and jopazo (at) biokubuntu.com) have written Codeml3X, a script that runs CODEML from PAML three times in a row, with three different starting omega values. The script will create a directory and three subdirectories where the results of each run will be saved, and it will also create a text file with the likelihood scores of each tree for each run. It is available as a Perl script. It can be downloaded from its web site at http://www.biokubuntu.com/enlaces.html
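The directory layout described for Codeml3X can be sketched as follows. This is only an illustration in Python, not the Perl script itself: the three starting values, the directory names run1 to run3, and the helper names are assumptions, and the sketch relies only on the fact that a codeml control file carries an "omega = ..." line for the initial omega. It prepares the run directories and leaves launching codeml to the user.

```python
import re
from pathlib import Path

# Hypothetical starting values; the source does not say which three Codeml3X uses.
STARTING_OMEGAS = (0.5, 1.0, 2.0)

# Matches the "omega = <value>" line but not "fix_omega = ...".
OMEGA_LINE = re.compile(r"^([ \t]*omega[ \t]*=[ \t]*)\S+", flags=re.MULTILINE)


def with_omega(control_text: str, omega: float) -> str:
    """Return the control file text with its starting omega replaced."""
    new_text, count = OMEGA_LINE.subn(
        lambda m: f"{m.group(1)}{omega}", control_text
    )
    if count == 0:
        raise ValueError("control file has no 'omega =' line")
    return new_text


def prepare_runs(control_text: str, root: Path) -> list[Path]:
    """Create root/run1..run3, each holding a codeml.ctl with its own omega."""
    run_dirs = []
    for i, omega in enumerate(STARTING_OMEGAS, start=1):
        run_dir = root / f"run{i}"
        run_dir.mkdir(parents=True, exist_ok=True)
        (run_dir / "codeml.ctl").write_text(with_omega(control_text, omega))
        run_dirs.append(run_dir)
    return run_dirs
```

Note that sequence and tree file paths inside the control file are usually relative, so in practice they would need to be copied or rewritten for each subdirectory before codeml is run there.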


Tim Massingham and Nick Goldman of the European Bioinformatics Institute in Hinxton, U. K. (timm (at) ebi.ac.uk and goldman (at) ebi.ac.uk) have produced SLR (Sitewise Likelihood Ratio), a program to compute and test the nonsynonymous/synonymous ratio of substitutions at each site. For coding sequences it makes a maximum likelihood estimate for each amino acid position of the ratio of nonsynonymous substitutions to synonymous substitutions, and does a likelihood ratio test for that site. The many sitewise tests are then corrected for multiple comparisons to indicate which sites have strong evidence of purifying or positive selection and so whether there is any reliable evidence for the presence of selection in the alignment. Alternatively SLR can be restricted to only detect unusually variable sites, indicating such sites and providing evidence for the presence of positive selection in the alignment. It is described in the paper: Massingham, T. and N. Goldman. 2005. Detecting amino acid sites under positive selection and purifying selection. Genetics 169: 1753-1762. It is available as C source code, Windows executables, Linux executables and Powermac Mac OS X executables. It can be downloaded from its web site at http://www.ebi.ac.uk/goldman/SLR/
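The general idea of a per-site likelihood ratio test followed by a multiple-comparisons correction can be sketched as below. This is not SLR's code: the chi-square reference distribution with one degree of freedom and the Bonferroni correction are assumptions made for illustration, and the function names are hypothetical. SLR's own choices are described in the Massingham and Goldman paper.

```python
import math


def lrt_p_value(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood ratio test, assuming one degree of freedom."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For a chi-square variable with 1 d.f., P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


def bonferroni(p_values, alpha=0.05):
    """Return indices of sites still significant after Bonferroni correction."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [i for i, p in enumerate(p_values) if p < threshold]
```

Here lnl_null would be the site's log-likelihood with the ratio fixed at 1 and lnl_alt the log-likelihood at its maximum likelihood estimate. Bonferroni is the simplest correction; less conservative procedures exist.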


Gangolf Jobb (gangolf (at) treefinder.de), formerly of the Institut für Statistik of the University of München, Germany, has produced Treefinder, a maximum likelihood program for nucleotide sequence data. It makes available a variety of models of base change, including codon-position-specific models. It carries out search for best trees by its own method of tree rearrangement, and can assess statistical support for groups by either bootstrap or a local paired-sites method. All parameters of the models can be optimized by searching for the values that maximize the likelihood. The program is fast, and has both a graphical user interface and a general language in which its operation can be programmed. Trees can be interactively manipulated and constrained in various ways. Treefinder is described in a paper: Jobb, G., A. von Haeseler, and K. Strimmer. 2004. TREEFINDER: A powerful graphical analysis environment for molecular phylogenetics. BMC Evolutionary Biology 4: 18. It can be downloaded from its web site at http://www.treefinder.de as executables for Windows, Mac OS X, Linux, or Sun Solaris. It requires the Java runtime environment to be present.


Stéphane Guindon (currently at the University of Auckland, New Zealand, s.guindon  (at) auckland.ac.nz) and Olivier Gascuel (gascuel (at) lirmm.fr) at the LIRMM, of the CNRS and the University of Montpellier II, France, have released PHYML version 2.4.3, a fast maximum likelihood program for nucleotide or protein sequence data. It has 6 possible DNA substitution models and 5 amino acid substitution models, allows estimation of many of the model parameters, and can allow for a gamma distribution of rates among sites and a proportion of invariable sites. It can also do bootstrapping of the trees. PHYML is described in a paper: Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology 52: 696-704. It is available as Linux, SunOS, Windows, and Mac OS X executables from its web site in Montpellier at http://atgc.lirmm.fr/phyml/, where it is also available as a web server. The source code is available from Guindon by email.


Johan Nylander (Johan.Nylander  (at) abc.se) has written BootPHYML version 3.4. This is a Perl script that performs bootstrapping using programs from PHYLIP, substituting PHYML for the PHYLIP program DNAML. It works with Mac OS X and Linux or Unix. It is available through Nylander's software download site at http://www.abc.se/~nylander/ in Sweden.


Laura Salter Kubatko of the Departments of Statistics and Evolution, Ecology, and Organismal Biology at the Ohio State University, Columbus, Ohio (lkubatko (at) stat.ohio-state.edu) has written SSA (inference of maximum likelihood phylogenetic trees using a Stochastic Search Algorithm), version 1.0, a program that uses a stochastic search to find maximum likelihood phylogenies from DNA sequences. Two versions of the program are available: one which assumes a molecular clock and one which does not make this assumption. The method for searching the space of trees for the ML tree is based on a simulated-annealing type algorithm. The program implements the F84 model of nucleotide substitution and associated sub-models. It estimates the ML tree and branch lengths, and can optionally estimate the transition/transversion ratio. Upon termination, the program returns the k trees of highest likelihood found during the search, where k can be set by the user. It is described in the paper: Salter, L. A. and D. K. Pearl. 2001. Stochastic search strategy for estimation of maximum likelihood phylogenetic trees. Systematic Biology 50(1): 7-17. It is available as executables for Windows, Linux, AIX, and SPARC systems. Laura is also willing to send the source code to users who own the book Numerical Recipes in C by Press, Teukolsky, Vetterling and Flannery, and who thus have permission to use routines from that book. The documentation and executables can be downloaded from its web site at http://www.stat.ohio-state.edu/~lkubatko/software/ssa/ssa.html
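A generic simulated-annealing search that keeps the k best states it has visited can be sketched as below. This is not SSA's algorithm: the acceptance rule exp(delta/T), the geometric cooling schedule, and all names and default values are assumptions for illustration. The caller supplies the scoring function (which would be a log-likelihood) and the proposal function, and states must be hashable.

```python
import math
import random


def accept(delta: float, temperature: float, u: float) -> bool:
    """Always accept uphill moves; accept downhill ones with prob exp(delta/T)."""
    if delta >= 0:
        return True
    return u < math.exp(delta / temperature)


def anneal(score, propose, start, steps=2000, t0=1.0, cooling=0.995,
           seed=1, keep=3):
    """Return the `keep` highest-scoring (state, score) pairs visited."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    visited = {start: current_score}
    temperature = t0
    for _ in range(steps):
        candidate = propose(current, rng)
        candidate_score = score(candidate)
        if accept(candidate_score - current_score, temperature, rng.random()):
            current, current_score = candidate, candidate_score
            visited[current] = current_score
        temperature *= cooling  # geometric cooling, an assumed schedule
    ranked = sorted(visited.items(), key=lambda item: item[1], reverse=True)
    return ranked[:keep]


# Toy usage: integer states, score peaks at 7, neighbours are x - 1 and x + 1.
best = anneal(lambda x: -(x - 7) ** 2,
              lambda x, rng: x + rng.choice((-1, 1)),
              start=0)
```

In a tree search the states would be tree topologies with branch lengths and the proposal would be a tree rearrangement; the toy integer example only exercises the control flow.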


Nir Friedman, Matan Ninio, Tal Pupko, Eyal Privman, and Itshak Pe'er of the Department of Computer Science and Engineering of Hebrew University, Jerusalem, Israel, and the Department of Cell Research and Immunology of Tel Aviv University, Israel (semphy (at) cs.huji.ac.il) have written SEMPHY (Structural EM PHYlogenetic reconstruction), version 2.0, which uses the structural EM algorithm to search for maximum likelihood phylogenies. The Structural EM algorithm is one proven to go uphill on the likelihood surface, and should gain in efficiency and adequacy of search of the likelihood surface compared to other likelihood algorithms. The program can use DNA or protein sequences, and can use a variety of DNA models and amino acid replacement models including the general reversible model and the HKY model (for DNA) and the JTT model (for protein sequences). It also allows for Gamma-distributed among-sites rate variation. SEMPHY also makes available an iterative distance matrix method which computes Bayesian posterior rates of change at individual sites, then uses these to compute distances and find a neighbor-joining tree. The program and methods are described in the papers:

It is available as C++ source code, Windows executables, Linux executables and Powermac Mac OS X executables. It can be downloaded from its web site at http://compbio.cs.huji.ac.il/semphy/


Simon Whelan of the Faculty of Life Sciences at the University of Manchester, U.K. (simon.whelan (at) manchester.ac.uk) has released Leaphy (Likelihood Estimation Algorithms for PHYlogenetics), version 1.0beta, a fast and accurate program for maximum likelihood phylogenetic inference. Leaphy uses maximum likelihood to estimate trees from aligned amino acid and nucleotide sequences under a variety of commonly used and popular models. The methods for searching for the best tree topology are described in the paper: Whelan, S. 2007. New approaches to phylogenetic tree search and their application to large numbers of protein alignments. Systematic Biology 56: 727-740. It is available as Windows executables and Linux executables. It can be downloaded from its web site at http://www.bioinf.manchester.ac.uk/leaphy/Leaphy.htm


Daniele Catanzaro of the Computer Science Department of the Université Libre de Bruxelles (U.L.B.) (dacatanz (at) ulb.ac.be) has released PhyloCoco version 1.0c, a molecular phylogeny package for Intel-iMac with OS 10.4.9 or higher and Java 1.4 or higher. PhyloCoco is a minimalist tool for rebuilding molecular phylogenies by means of the likelihood criterion. PhyloCoco selects the best substitution model of DNA evolution for the dataset of sequences to be analyzed and displays the best phylogeny found so far. It uses the GTR model of DNA evolution and uses different optimization methods including the Very Large Scale Neighborhood (VLSN) search for the topology and Iterated Local Search (ILS) to explore the solution space. PhyloCoco uses FigTree to display the resulting phylogeny. It is described in the paper: Catanzaro, D., R. Pesenti and M. C. Milinkovitch. 2007. Estimating phylogenies under maximum likelihood: a very large-scale neighborhood approach. Submitted to BMC Bioinformatics. It is available as Java source code and Intel Mac OS X executables. It can be downloaded from its web site at http://homepages.ulb.ac.be/~dacatanz/Site/PhyloCoco.html


Bret Larget, of the Departments of Statistics and Botany at the University of Wisconsin, Madison (larget  (at) stat.wisc.edu) and Donald Simon of the Department of Mathematics and Computer Science, Duquesne University, Pittsburgh, Pennsylvania (simon  (at) mathcs.duq.edu) have written BAMBE (Bayesian Analysis in Molecular Biology and Evolution) version 2.03beta, a program for Bayesian analysis of phylogenies with DNA sequence data. It uses a prior distribution of trees and a rearrangement mechanism introduced in the paper: Mau, B., M. A. Newton, and B. Larget. 1997. Bayesian phylogenetic inference via Markov chain Monte Carlo methods. Molecular Biology and Evolution 14: 717-724. The trees and parameter values are sampled by Markov Chain Monte Carlo using a Metropolis algorithm. The resulting posterior distribution can be used to characterize the uncertainty about not only the tree, but the parameters of the substitution model as well. The program is in C++ source code for Unix, and is distributed from its web page at http://www.mathcs.duq.edu/larget/bambe.html. It is also run as a web server at the Institut Pasteur in Paris.


Howsun Jow and Vivek Gowri-Shankar (vivek.gowri-shankar (at) s.man.ac.uk) of the Department of Computer Sciences of the University of Manchester, Manchester, U.K. have written PHASE, version 1.1, a software package for PHylogenetics And Sequence Evolution. It infers phylogenies with models for RNA evolution that include models for both paired sites and unpaired sites. The models for the unpaired sites have the usual 4 states, while the models for the paired sites have 6, 7, or 16 states, depending on the model chosen. The programs carry out a Bayesian Markov chain Monte Carlo (MCMC) analysis that samples trees from the posterior distribution given the data. PHASE is described in two papers:

It is available as C++ source code and Linux or Windows executables from its web page at http://www.bioinf.man.ac.uk/resources/phase/.


Mark Pagel and Andrew Meade of the School of Biological Sciences of the University of Reading, Reading, U.K. (m.pagel (at) reading.ac.uk) have written BayesPhylogenies, a program for estimating phylogenies by Bayesian inference. BayesPhylogenies uses Bayesian Markov Chain Monte Carlo (MCMC) or Metropolis-coupled Markov chain Monte Carlo (MCMCMC) methods. The program allows a range of models of gene sequence evolution, models for morphological traits, models for rooted trees, gamma and beta distributed rate-heterogeneity, and implements a mixture model that allows the user to fit more than one model of sequence evolution without partitioning the data. It is described in the paper: Pagel, M. and Meade, A. 2004. A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Systematic Biology 53: 571-581. It is available as Windows executables, Linux executables, and Powermac Mac OS X executables. It can be downloaded from its web site at http://www.evolution.rdg.ac.uk/BayesPhy.html


Nicolas Lartillot of the LIRMM (Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier) of the Université de Montpellier II, Montpellier, France (nicolas.lartillot (at) lirmm.fr) has written PhyloBayes version 2.1c, a Bayesian phylogeny package for protein sequences using a mixture model. PhyloBayes is a Bayesian Monte Carlo Markov Chain (MCMC) sampler for phylogenetic reconstruction using protein alignments. Compared to other phylogenetic MCMC samplers, the main distinguishing feature of PhyloBayes is the underlying probabilistic model, CAT. This is a mixture model especially devised to account for site-specific features of protein evolution. It is particularly well suited for large multigene alignments. PhyloBayes can also do divergence time estimation with a relaxed molecular clock, posterior predictive analyses, including a compositional homogeneity test, and data recoding (analogous to R/Y coding, but for amino acids). The CAT model is described in the paper: Lartillot, N. and H. Philippe. 2004. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Molecular Biology and Evolution 21(6): 1095-1109. PhyloBayes is a package of programs that operate together to do the steps of the analysis. It is distributed as C++ source code and Linux executables. The C++ source code can be compiled on Linux, Windows, or Mac OS X systems. It can be downloaded from its web site at http://www.lirmm.fr/mab/sommaire_english.php3


Le Sy Vinh (vinh  (at)  cs.uni-duesseldorf.de) and Heiko Schmidt (heiko  (at)  cs.uni-duesseldorf.de) of the Institut für Bioinformatik of the University of Düsseldorf, Germany and Arndt von Haeseler (arndt.von.haeseler  (at)  univie.ac.at) of the Center for Integrative Bioinformatics Vienna (CIBIV), Austria, have written Phylogenetic Navigator (PhyNav) version 1.0. This program finds subsets of species in a dataset that are "minimal k-distance subsets" and analyses these each by maximum likelihood. Then it stitches these groups together using likelihood. This makes it possible to analyze larger datasets. The program is described in a paper: Vinh, L. S., H. A. Schmidt, and A. von Haeseler. 2005. PhyNav: A novel approach to reconstruct large phylogenies. pp. 386-393 in Classification, the Ubiquitous Challenge (Proceedings of the 28th Annual Conference of the GfKl 2004), ed. C. Weihs and W. Gaul. Series Studies in Classification, Data Analysis, and Knowledge Organization. Springer-Verlag, Heidelberg/New York. It is available as Linux executables from its web site at http://www.bi.uni-duesseldorf.de/software/phynav/


Paul Michael Agapow of the Department of Biology of Imperial College, Silwood Park, U.K. (p.agapow (at) ic.ac.uk) has written Mac5, version 1.7.3, a program for phylogenetic reconstruction using gapped data. MAC5 implements MCMC sampling to estimate a phylogenetic tree from a DNA multiple alignment. What differentiates MAC5 from similar programs is its use of five-state sequence evolution models as a means to include the gap information. It is available as C source code, Windows executables and Powermac Mac OS X executables. Its author says that owing to other projects, Mac5 is not being further developed and is not being supported by him. It can be downloaded from its web site at http://www.agapow.net/software/mac5


David Posada (dposada  (at) uvigo.es) of the Department of Biochemistry, Genetics and Immunology of the University of Vigo, Spain and Keith Crandall of the Department of Biology, Brigham Young University released Modeltest version 3.6, a program to test a hierarchy of statistical models of DNA evolution using the Likelihood Ratio Test criterion and the AIC (Akaike Information Criterion). The likelihood values are obtained by running PAUP*. Modeltest accepts likelihood scores corresponding to 56 models of DNA substitution, which differ in features such as whether transition and transversion rates are equal, whether rates at different sites are equal, and whether there are invariant sites. Modeltest is described in the paper: Posada, D. and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14: 817-818. It is available as executables for Macintosh, for Windows, and as source code in C that can be compiled on many other systems. It is distributed from its web site at http://darwin.uvigo.es/software/modeltest.html. Modeltest was the basis for two further developments: the MrModeltest program which uses MrBayes and the FindModel server at Los Alamos National Laboratory which is a revised version of Modeltest that uses the weighbor program to infer the trees.


Paulo Nuin (nuinp  (at) mcmaster.ca) of the Department of Biology, McMaster University, Hamilton, Ontario, Canada has released MrMTgui version 1.01. This is a graphic user interface for running Modeltest and MrModeltest. It is available for Windows as executables from the MrMTgui web site at http://genedrift.org/mtgui.php. Source code of a Linux version is also available which can be compiled using the WxWindows windowing software. The Linux sources are available by accessing a svn (subversion) version-control code base, using instructions available at the above site. MrMTgui was formerly known as MTgui in the earlier version which could not access MrModeltest.


Johan Nylander (Johan.Nylander  (at) abc.se) has released MrModeltest version 2.2. This is a program which is a simplified version of Modeltest 3.6. It performs hierarchical likelihood ratio tests and calculates approximate AIC, AICc, and Akaike weights of the nucleotide substitution models currently implemented in both PAUP* and MrBayes. Version 2 has added use of four different hierarchies for the likelihood ratio tests and prints the selected model in a MrBayes block. MrModeltest is available as an executable and source code for Windows, for Mac OS, and for Mac OS X, and as source code for Linux and Unix. It is available from Nylander's software download site at http://www.abc.se/~nylander/ in Sweden.


Johan Nylander (Johan.Nylander  (at) abc.se) has written Modelfit version 1.2, and MrModelfit version 1.2. These are Perl scripts that can run (respectively) Modeltest and MrModeltest simply by typing a single command line. They are available from Nylander's software download site at http://www.abc.se/~nylander/ in Sweden.


Charles Bell of the Department of Biology of Xavier University of Louisiana, New Orleans (cbell3  (at) xula.edu) has written Porn* (Phylogenetics On Rick's Network, as it was originally hosted on Rick Ree's site) version 2.0, a Linux clone of Modeltest using the Python language. It enables command-line computations equivalent to Modeltest under the Linux operating system. It creates command blocks for PAUP* which can be used when running PAUP*. Porn* is written as a shell script invoking Python modules. It is available at its web site at http://www.phylodiversity.net/cbell/pornstar/


David Posada (dposada  (at) uvigo.es) of the Department of Biochemistry, Genetics and Immunology of the University of Vigo, Spain has released ProtTest, version 1.2.6, a Java program allowing testing of 64 different models of protein evolution, using the AIC, AICc, and BIC criteria for choosing among models that include different substitution models, invariant sites, rate heterogeneity, and empirical amino acid frequency variants of the models. ProtTest uses the PAL library of phylogenetic Java routines and also uses the PHYML program to compute likelihoods. It is described in the paper: Abascal, F., R. Zardoya and D. Posada. 2005. ProtTest: Selection of best-fit models of protein evolution. Bioinformatics 21: 2104-2105. It is available from its web site at http://darwin.uvigo.es/software/prottest.html


Thomas Keane, of the Bioinformatics and Pharmacogenomics Lab of the Department of Biology, National University of Ireland, Maynooth (thomas.m.keane (at) nuim.ie) has written ModelGenerator, version 0.84. It is a Java program for model selection that selects amino acid and nucleotide substitution models using Fasta or PHYLIP alignments. It supports 56 nucleotide and 80 amino acid substitution models. It is described in the paper: Keane, T. M., C. J. Creevey, M. M. Pentony, T. J. Naughton and J. O. McInerney. 2006. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evolutionary Biology 6: 29. It is available from its web site at http://bioinf.may.ie/software/modelgenerator/.


Johan Nylander (Johan.Nylander  (at) abc.se) has written MrAIC version 1.4. This is a Perl script that carries out AIC, AICc, BIC, and Akaike weights model comparison methods for nucleotide substitution models by invoking the PHYML program. It is distributed from Nylander's software download site at http://www.abc.se/~nylander/ in Sweden.
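The criteria named here have standard textbook definitions, sketched below in Python. This is not MrAIC's code. In the sketch lnl is a model's maximized log-likelihood, k its number of free parameters, and n the sample size. Taking n to be the alignment length is a common convention and is an assumption here.

```python
import math


def aic(lnl: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2 lnL."""
    return 2 * k - 2 * lnl


def aicc(lnl: float, k: int, n: int) -> float:
    """AIC with the small-sample correction; requires n > k + 1."""
    return aic(lnl, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(lnl: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k ln(n) - 2 lnL."""
    return k * math.log(n) - 2 * lnl


def akaike_weights(scores):
    """Turn a list of AIC (or AICc) scores into weights that sum to 1."""
    best = min(scores)
    relative = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(relative)
    return [r / total for r in relative]
```

For all three criteria a lower score indicates a preferred model, and the Akaike weight of a model can be read as its relative support within the set of models compared.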


Vladimir Minin, Zaid Abdo, Paul Joyce, and Jack Sullivan of the Department of Biological Sciences at the University of Idaho, Moscow, Idaho (jacks (at) uidaho.edu) or (vminin (at) ucla.edu) have released DT-ModSel (Decision Theory MODel SELection), a performance-based method for selecting a likelihood model for phylogenetic estimation. It implements a model selection method which is based on the Bayesian Information Criterion, but incorporates relative branch-length error as a performance measure in a decision theory (DT) framework. This DT method includes a penalty for overfitting, is applicable prior to running extensive analyses, and simultaneously compares all models being considered and thus does not rely on a series of pairwise comparisons of models to traverse model space. It can compare 56 different models of molecular sequence evolution on a given tree. It is described in the paper: Minin, V., Z. Abdo, P. Joyce, and J. Sullivan. 2003. Performance-based selection of likelihood models for phylogeny estimation. Systematic Biology 52: 674-683. It is available as a Perl script. It can be downloaded from its web site at http://www.webpages.uidaho.edu/~jacks/DTModSel.html


[HyPhy icon] Sergei Kosakovsky Pond and Simon Frost of the Antiviral Research Center, University of California, San Diego and Spencer Muse of the Department of Statistics, North Carolina State University, Raleigh, North Carolina (muse  (at) stat.ncsu.edu) have released HY-PHY (HYpothesis testing using PHYlogenies), version 0.99Beta. HY-PHY has general ways of enabling the user to perform a wide variety of statistical tests of different models of molecular sequence change. It is actually a higher-level programming language which enables the user to set up many different kinds of tests. The user can define their own alphabet of symbols and test any reversible substitution model. Examples of tests that can be performed include molecular clock tests, relative rate tests, relative ratio tests, and tests of positive selection. It is described in a paper: Kosakovsky Pond, S. L., S. D. Frost, and S. V. Muse. 2004. HyPhy: hypothesis testing using phylogenies. Bioinformatics 27 October (e-publication ahead of print).

Although not primarily intended as a phylogeny estimation package, it also can infer trees by Neighbor-Joining and UPGMA methods, and a number of search strategies are also available for likelihood inference. HY-PHY is freely available as executables for MacOS, for MacOS X, for Windows, and as source code for Unix and Linux. It is available at the HY-PHY web page at http://www.hyphy.org.


[Kakusan2 icon]  Akifumi S. Tanabe of the Division of Ecology and Evolutionary Biology of the Graduate School of Life Sciences of Tohoku University, Japan (astanabe (at) mail.tains.tohoku.ac.jp) has released Kakusan2 version 2.0.2007.07.05, a nucleotide substitution model selection script written in the Perl language for multi-partitioned, multilocus data sets. It can separate loci and codon positions into different data partitions, and use PAUP* to select nucleotide substitution models for these different partitions. It also contains some executables of PAUP*, PAML, and PHYLIP, used by permission of those programs' authors. It calculates the AIC, AICc, and BIC model selection criteria from the results. It is described in the paper: Tanabe, A. S. 2007. Kakusan: a computer program to automate the selection of a nucleotide substitution model and the configuration of a mixed model on multilocus data. Molecular Ecology Notes, early online publication doi:10.1111/j.1471-8286.2007.01807.x    It is available as C source code, Perl script, Windows executables and Mac OS X universal executables. It can be downloaded from its web site at http://www.fifthdimension.jp/products/kakusan/


[MAPPS icon]  Jonathan Bollback of the Bioinformatics Centre of the Institute of Molecular Biology and Physiology (IMBP) at the University of Copenhagen, Denmark (bollback  (at) binf.ku.dk) has written MAPPS (Model Adequacy in Phylogenetics by Predictive Simulation) version 1.1.6, a program to evaluate the fit of a group of phylogenetic models to DNA sequence data. The rationale behind this approach is that an adequate model should be able to predict future data (nucleotide site patterns). In the absence of future data the model's predictive ability is compared to the original data set. The model's predictive ability is evaluated through simulation under the model. Comparison of simulated (or predictive) data sets is evaluated using the multinomial test statistic. The program uses data and trees in a format compatible with the output from MrBayes. It is described in the paper: Bollback, J. P. 2002. Bayesian model adequacy and choice in phylogenetics. Molecular Biology and Evolution 19(7): 1171-1180. It is available as Powermac Mac OS X executables. It can be downloaded from its web site at http://www.binf.ku.dk/~bollback/software.html
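As an illustration of the multinomial test statistic mentioned above, here is a minimal Python sketch. It is not MAPPS code. It assumes the usual form of the statistic, the sum over distinct site patterns of N_p ln N_p minus N ln N, where N_p is the number of sites showing pattern p and N is the total number of sites. The function names are hypothetical, and the direction of the tail used for the predictive P value is an assumption.

```python
import math
from collections import Counter

def multinomial_statistic(alignment):
    """Multinomial test statistic of an alignment (list of equal-length strings)."""
    n_sites = len(alignment[0])
    # A site pattern is the column of states read down the sequences.
    patterns = Counter("".join(seq[i] for seq in alignment)
                       for i in range(n_sites))
    return (sum(c * math.log(c) for c in patterns.values())
            - n_sites * math.log(n_sites))

def predictive_p_value(observed_stat, simulated_stats):
    """Fraction of simulated data sets whose statistic is at least the observed one."""
    hits = sum(1 for s in simulated_stats if s >= observed_stat)
    return hits / len(simulated_stats)

if __name__ == "__main__":
    observed = multinomial_statistic(["ACGTAC", "ACGTAA", "ACGAAC"])
    print(observed)
```

In practice the simulated statistics would come from data sets generated under the model being evaluated, for example using trees and parameters sampled by a Bayesian program.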


Hidetoshi Shimodaira ("Shimo") of the Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Japan (shimo  (at) is.titech.ac.jp) has released CONSEL version 0.1h, a package of small programs to calculate P values for tests of phylogenies. It uses output from other phylogeny programs (in particular it can use output from PAUP, PAML, and MOLPHY) which makes available to it the sitewise log-likelihoods for some trees and the trees themselves. It uses these to carry out the Kishino-Hasegawa test, the Shimodaira-Hasegawa test, a weighted version of the SH test, and a new "approximately unbiased" test of Shimodaira's. CONSEL is available as C source code that will compile on Linux and Unix systems that have the gcc compiler, and it is also available as a DOS executable that will run on DOS or Windows systems. It can be downloaded from its web site at http://www.ism.ac.jp/~shimo/prog/consel/index.html. It is described in a paper: Shimodaira, H. and M. Hasegawa. 2001. CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics 17: 1246-1247 which cites the statistical papers describing the methods.
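To show how sitewise log-likelihoods feed into such tests, here is a minimal Python sketch of the Kishino-Hasegawa comparison of two trees using the normal approximation. It is not CONSEL's implementation, which relies on bootstrap resampling; the normal approximation is swapped in here for brevity, and the function name is hypothetical.

```python
import math

def kh_test(site_lnl_a, site_lnl_b):
    """Compare two trees from their sitewise log-likelihoods.

    Returns (delta, z, two_sided_p), where delta is the difference in total
    log-likelihood and the variance of delta is estimated from the sitewise
    differences, assuming sites are independent.
    """
    n = len(site_lnl_a)
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    delta = sum(diffs)
    mean = delta / n
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    z = delta / math.sqrt(variance)
    # Two-sided tail probability of a standard normal deviate.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return delta, z, p_value

if __name__ == "__main__":
    # Hypothetical sitewise log-likelihoods for two candidate trees.
    tree_a = [-1.0, -2.0, -3.0, -4.0]
    tree_b = [-1.5, -2.0, -3.5, -4.0]
    print(kh_test(tree_a, tree_b))
```

The test is only valid when the two trees are specified in advance rather than chosen because one of them is the maximum likelihood tree, which is the situation the Shimodaira-Hasegawa and approximately unbiased tests were designed to handle.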


Hidetoshi Shimodaira ("Shimo") of the Department of Mathematical and Computing Sciences of the Tokyo Institute of Technology, Ookayama, Meguroku, Tokyo, Japan (shimo (at) is.titech.ac.jp) has written scaleboot (approximately unbiased P-values via multiscale bootstrap), version 0.2-2, an R package for making approximately unbiased P values for tree topologies. scaleboot implements Shimodaira's Approximately Unbiased method of putting P values on regions of parameter space, including tree topologies. The P-values are computed from a set of multiscale bootstrap probabilities, computed by sampling different fractions of the characters. The multiscale bootstrap method has also been implemented in the program CONSEL. scaleboot has an interface for the pvclust clustering package in R. It also has a front end for phylogenetic inference, and it can replace the CONSEL program for testing phylogenies. Currently, scaleboot does not have a method for file conversion from other phylogeny packages, so we must use CONSEL for this purpose before applying scaleboot to calculate an improved version of AU p-values for trees and branches. The methods are described in the papers:

It is available as an R package. It can be downloaded from its web site at http://www.is.titech.ac.jp/~shimo/prog/scaleboot/index.html


Maria Anisimova, Olivier Gascuel, and Jean-François Dufayard of the Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM) of the Université de Montpellier II, Montpellier, France (manisimova (at) hotmail.com) have produced aLRT (approximate Likelihood Ratio Test), version 1.1, a program to carry out likelihood ratio tests of the presence of branches in a phylogeny. aLRT is a modification of the original PHYML program, and is designed to compute tests of the reality of branches in a known phylogeny. Five branch support tests are available: (1) the bootstrap, (2) aLRT statistics, (3) aLRT parametric (Chi2-based) branch support, (4) aLRT non-parametric branch support based on a Shimodaira-Hasegawa-like procedure, and (5) a combination of the latter two supports, that is, the minimum value of both. The methods are described in the paper: Anisimova, M., and O. Gascuel. 2006. Approximate likelihood ratio test for branches: A fast, accurate and powerful alternative. Systematic Biology 55(4): 539-552. It is available as Windows executables, Linux executables, Solaris executables, Powermac Mac OS X executables and Intel Mac OS X executables. It can be downloaded from its web site at http://atgc.lirmm.fr/alrt/ This program is temporary; the method will ultimately be available in PHYML.
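The parametric (Chi2-based) support can be illustrated with a short Python sketch. It is not the aLRT program's code. It assumes that the statistic is twice the log-likelihood difference between the best and second-best of the three arrangements around a branch, and that its null distribution is an equal mixture of a point mass at zero and a chi-square with one degree of freedom. Any multiple-testing correction the program applies is omitted, and the function names are hypothetical.

```python
import math

def alrt_statistic(arrangement_lnls):
    """2 * (lnL of best arrangement - lnL of second best) around one branch."""
    best, second = sorted(arrangement_lnls, reverse=True)[:2]
    return 2.0 * (best - second)

def chi2_mixture_p_value(stat):
    """Tail probability under a 50:50 mixture of chi2(0) and chi2(1)."""
    if stat <= 0.0:
        return 1.0
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))

if __name__ == "__main__":
    # Hypothetical log-likelihoods of the three arrangements around a branch.
    stat = alrt_statistic([-1000.0, -1002.0, -1005.0])
    print(stat, 1.0 - chi2_mixture_p_value(stat))
```

One minus the P value can be reported as the support for the branch, in the same spirit as a bootstrap proportion.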


[PLATO icon here] Nick Grassly, of the Department of Infectious Disease Epidemiology of Imperial College School of Medicine, St. Mary's Campus, London (n.grassly  (at) ic.ac.uk) has written PLATO, version 2.11, (Partial Likelihoods Assessed Through Optimisation), a program that takes sequential PHYLIP-style DNA sequences followed by their maximum likelihood phylogeny, and using a likelihood approach with sliding window analysis and Monte Carlo simulation of the null distribution detects anomalously evolving regions in the DNA sequences and assesses their significance. This may lead to the detection of, for example, recombination, gene conversion or convergence, or reveal variable selective pressures along the gene sequence. A general substitution model is used that can allow the test to reveal differences due to recombination while ignoring those due to varying rate of evolution. The method is described in the paper: Grassly, N. C., and E. C. Holmes. 1997. A likelihood method for the detection of selection and recombination using sequence data. Molecular Biology and Evolution 14: 239-247. It is available as a Mac OS Macintosh binary executable, or in source code for Unix systems. It is distributed free from the University of Oxford Zoology Web server at http://evolve.zoo.ox.ac.uk/software.html?id=plato.


[Topali icon]   Iain Milne, Dominik Lindner, and Frank Wright of Biomathematics and Statistics Scotland at the Scottish Crop Research Institute, Invergowrie, Dundee, Scotland (help (at) topali.org) have released TOPALi version 2, a program for statistical and evolutionary analysis of multiple sequence alignments. It checks for evidence of past recombination events by looking for changes in the inferred phylogenetic tree TOPology between adjacent regions of a multiple sequence ALignment. Their method detects recombinations by sliding a window along a sequence alignment, and measuring the discrepancy between the trees suggested by the first and second halves of the window, using distance matrix methods. Version 2 includes further statistical tests for recombination based on nonparametric bootstrapping and allowing for rate heterogeneity between sites. It can also launch a range of statistical and evolutionary analyses of multiple sequence alignments as web services running either locally on your PC or on the HPC cluster in Dundee. These include phylogenetic model selection (via ModelGenerator), Bayesian and maximum likelihood phylogenetic tree estimation (via PHYML and MrBayes), detection of sites under positive selection (using PAML), and the recombination breakpoint location analysis methods. Version 1 of TOPALi is described in the paper: Milne, I., F. Wright, G. Rowe, D. F. Marshall, D. Husmeier, and G. McGuire. 2004. TOPALi: Software for automatic identification of recombinant sequences within DNA multiple alignments. Bioinformatics 20 (11): 1806-1807. It is available as Java source code, Java executables, Windows executables and Linux executables. They can be downloaded from its web site at http://www.topali.org. Version 1 of TOPALi has been superseded by version 2 but is also available, at the version 1 web page at http://www.topali.org/topali-v1/


Ingrid Jakobsen (currently of the Advanced Computational Modelling Centre, University of Queensland, Australia, ibj  (at) maths.uq.edu.au) and Simon Easteal of Australian National University, Canberra, have released reticulate. It is a compatibility matrix program for DNA sequences that has features designed to test for evidence of reticulate evolution (such as recombination). The program computes and displays a pairwise compatibility matrix for all pairs of sites. It can randomize the order of sites and compute the fraction of compatible sites in a region for the randomizations, to test whether there is a pattern suggesting reticulation. The program is distributed as C source code for Unix and X Windows, though there are some limited ways of running it without X Windows. It is described in the paper: Jakobsen, I. B. and S. Easteal. 1996. A program for calculating and displaying compatibility matrices as an aid in determining reticulate evolution in molecular sequences. CABIOS 12: 291-295. It is available from Ingrid Jakobsen's software web site at http://acmc.uq.edu.au/DETYA/people/ibj/Retic/.
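A pairwise compatibility matrix of this kind can be sketched in Python as follows. This is not the reticulate program's code. It assumes unordered character states and uses the standard criterion that two sites are compatible when the graph linking the states that co-occur in some sequence contains no cycle (for two-state sites this reduces to the four-gamete test). The function names are hypothetical.

```python
def sites_compatible(column_i, column_j):
    """True if two alignment columns can both evolve without homoplasy on one tree."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each distinct co-occurring pair of states is one edge of a bipartite graph.
    for state_i, state_j in set(zip(column_i, column_j)):
        u, v = ("i", state_i), ("j", state_j)
        parent.setdefault(u, u)
        parent.setdefault(v, v)
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True

def compatibility_matrix(alignment):
    """Matrix of pairwise site compatibilities for a list of aligned sequences."""
    columns = list(zip(*alignment))
    n = len(columns)
    return [[sites_compatible(columns[i], columns[j]) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    for row in compatibility_matrix(["AAC", "ACC", "CAT", "CCT"]):
        print(["+" if ok else "-" for ok in row])
```

Blocks of mutually compatible sites separated by incompatible ones are the kind of pattern that would suggest reticulation.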


Kim Fisker, then of the Computer Science Department at Aarhus University, Denmark released RecPars, which does a parsimony analysis of DNA sequences. It was more recently maintained by Thomas Christensen of that department. It tries to find the best phylogenies for different regions of the sequences, thereby postulating recombination events between these segments. The method is described in a paper: Hein, J. 1993. A heuristic method to reconstruct the history of sequences subject to recombination. Journal of Molecular Evolution 36: 396-406. RecPars is available as C source code for Unix. It is distributed from its web site at http://www.daimi.au.dk/~compbio/recpars/recpars.html. A web server is available there as well.


[LARD icon here] Andrew Rambaut of the Department of Zoology, University of Oxford, England (andrew.rambaut  (at) zoo.ox.ac.uk) has produced LARD (Likelihood Analysis of Recombination in DNA) version 2.2, a program to detect the presence of recombination in a set of sequences. LARD looks at the set of sequences to discover which are the most plausible parents of a potentially recombinant sequence, and performs a likelihood ratio test for each possible breakpoint position of whether the three-species tree differs on the two sides of the breakpoint. LARD is described as an extension of a method suggested by John Maynard Smith: Maynard Smith, J. 1992. Analysing the mosaic structure of genes. Journal of Molecular Evolution 34: 126-129. It is described in a paper: Holmes, E. C., M. Worobey, and A. Rambaut. 1999. Phylogenetic evidence for recombination in dengue virus. Molecular Biology and Evolution 16: 405-409. LARD is available as C source code and as a Macintosh executable from its web site at http://evolve.zoo.ox.ac.uk/software.html?name=Lard.


Adrian Gibbs (Adrian.Gibbs  (at) anu.edu.au) of the Department of Botany and Zoology of the Australian National University, Canberra, has written SiScan, version 2.0, a program that scans 3 or 4 DNA sequences for evidence of recombination. Two of the sequences are the putative parent sequences, one the putative recombinant, and one an outgroup. The program uses a Monte Carlo randomization procedure to test for recombination signal. The program can be downloaded as an archived Windows executable (that's what I assume it is, the web site doesn't say) from the department software distribution web site at http://www.anu.edu.au/BoZo/software/index.html. SiScan is described in a paper: Gibbs, M. J., J. S. Armstrong, and A. J. Gibbs. 2000. Sister-Scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics 16: 573-582.


Jonathan Moore and Robin Allaby of the Warwick HRI at the University of Warwick, UK (jonathan.moore (at) warwick.ac.uk) have released TreeMos version 1.0, a package for search and visualisation of phylogenetic mosaicism, which identifies anomalous nearest-neighbour relationships of segments in multiple alignments. It allows the user to search for phylogenetic mosaicism in a group of DNA or protein sequence multiple alignments or genome sequences. TreeMos uses a sliding window and local alignment and tree-building algorithms (ClustalW) to identify sequence segments whose nearest neighbour is anomalous to that identified using the whole alignment, and visualizes that relationship where the anomalous neighbour may come from anywhere in the data set. TreeMos can import a group of alignments in FASTA format, identify instances of phylogenetic mosaicism within and between alignments, and display graphical representations of the results in a web browser. The methods are described in the paper: Allaby, R.G. and M. Woodwark. 2007. Phylogenetic analysis reveals extensive phylogenetic mosaicism in the Human GPCR superfamily. Evolutionary Bioinformatics 3: 155-168. It is available as a Perl script and Intel Mac OS X executables. It can be run as a command line program, but also requires a local Apache installation for its GUI functions. TreeMos can be downloaded from its web site at http://www2.warwick.ac.uk/fac/sci/whri/research/archaeobotany/downloads/


Dan Gusfield (gusfield  (at) cs.ucdavis.edu) and Ren-Hua Chung (rchung  (at) ucdavis.edu), both of the Department of Computer Science at the University of California, Davis, have released PPH (Perfect Phylogeny Haplotyper). PPH takes a set of diploid genotypes for SNP (single nucleotide polymorphism) markers, and infers haplotypes for them. It does this by seeing whether it can find a set of haplotypes that resolve all diploid genotypes and that fit onto a tree without requiring any extra changes of nucleotides (in other words, they are all compatible with the same tree). The result is not only the haplotype resolution but the resulting tree, if any. The method is described in a paper: Gusfield, D. 2002. Haplotyping as perfect phylogeny: conceptual framework and efficient solutions, pp. 165-175 in Proceedings of RECOMB 2002, edited by G. Myers, S. Hannenhalli, D. Sankoff, S. Istrail, P. Pevzner et al. ACM Press, New York. The program is available as C++ and Perl source code, and as executables for Windows, for SUN SPARC Solaris, for Intel/AMD-compatible Linux, and for Mac OS X from its web site at http://wwwcsif.cs.ucdavis.edu/~gusfield/pph.html.


[PIST icon] Michael Worobey, of the Department of Ecology and Evolution, University of Arizona, Tucson, Arizona (worobey  (at) email.arizona.edu) and Andrew Rambaut, of the Department of Zoology, University of Oxford (andrew.rambaut  (at) zoo.ox.ac.uk) have written PIST (Phylogenetic Informative Sites Test) version 1.0, a program to perform this test. The program simulates multiple data sets up a given tree, and then computes, for each of these and for an original data set, a statistic which is the proportion of sites that have two states and fit the tree perfectly. This statistic will be inflated in the original data if there are recombination events in its genealogy. The program is available as a Mac OS executable and as source code for Unix (which can also be compiled on Windows or on Mac OS X). It is distributed from its web page at http://evolve.zoo.ox.ac.uk/software.html?id=pist


Marc Suchard and Vladimir Minin of the Department of Biomathematics at the University of California, Los Angeles (msuchard (at) ucla.edu) have released DualBrothers version 1.1, recombination detection software based on the dual Multiple Change-Point (MCP) model. The model allows for changes in topology and evolutionary rates across sites in a multiple sequence alignment. It uses a Bayesian approach together with MCMC (Markov chain Monte Carlo) sampling to simulate from the posterior distribution of the dual MCP model parameters. It is described in the papers:

It is available as Java code which needs the user to also download the Colt scientific library for Java. It can be downloaded from its web site at http://www.biomath.ucla.edu/msuchard/DualBrothers/


Karin Dorman of the Department of Genetics, Development and Cell Biology of Iowa State University, Ames, Iowa (kdorman (at) iastate.edu) has written cBrother, a C version of the Java program DualBrothers, with extensions. DualBrothers was developed by Suchard et al. as a Bayesian multiple change point model to test for the presence of rare recombination events in the history of a set of sampled sequences. cBrother is available as C source code. It can be downloaded from its web site at http://rumi.zool.iastate.edu/software/index.xml


Simone Linz, Achim Radtke, and Arndt von Haeseler of the Center of Integrative BioInformatics Vienna of the University of Vienna, Austria (arndt.von.haeseler (at) univie.ac.at   and  linz (at) cs.uni-duesseldorf.de) have written HGT (Horizontal Gene Transfer), a program to test for the presence of horizontal gene transfer. HGT considers the distribution of trees obtained from a set of different genes, and then simulates the trees obtained with a single species tree and different rates of horizontal gene transfer. The estimation of the rate of horizontal gene transfer is made based on the extent of differences among individual gene trees in the simulation and in the observed set of loci. The methods are described in the paper: Linz, S., A. Radtke, and A. von Haeseler. 2007. A Likelihood framework to measure horizontal gene transfer. Molecular Biology and Evolution 24: 1312-1319. HGT is available as C source code. It can be downloaded from its web site at http://www.cibiv.at/software/hgt/


Robert Beiko and Nicholas Hamilton of the Institute for Molecular Bioscience at the University of Queensland, Australia (beiko (at) cs.dal.ca) have released EEEP (Efficient Evaluation of Edit Paths), version 1.0, a program for inference of lateral genetic transfer by comparison of phylogenetic trees. EEEP performs subtree prune-and-regraft (SPR) operations on a rooted reference tree to reconcile it with a user-supplied tree inferred from data. The rooting of the reference tree is used to constrain the SPR operations that are allowed. The test tree need not be rooted or binary, and may contain an incomplete subset of the taxa represented in the reference tree. EEEP has been successfully compiled under RedHat Linux and AIX, as well as in Mac OS X and Windows XP. It is described in the paper: Beiko, R.G., and N. Hamilton. 2006. Phylogenetic identification of lateral genetic transfer events. BMC Evolutionary Biology 6: 15, in which it was used to infer LGT events on 16,000 genes. It is available as C++ source code, Windows executables and Linux executables. It can be downloaded from its web site at http://bioinformatics.org.au/eeep


Gary Olsen of the Department of Microbiology, University of Illinois, Urbana, Illinois (gary  (at) phylo.life.uiuc.edu) has written dnarates version 1.1.0. It reads a set of DNA sequences and a tree, and for that tree makes a maximum likelihood estimate of the rate of evolution at each site. This is done by taking the rate at each site as a separate parameter and maximizing the likelihood with respect to all those parameters. The program is available as generic C source code. It is based in part (with my permission) on code from my PHYLIP program DNAML. dnarates is available from its web page at http://geta.life.uiuc.edu/~gary/programs/DNArates.html (which links to an ftp area).
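The idea of estimating one rate per site on a fixed tree can be sketched in Python. This is not dnarates code. As simplifying assumptions it uses the Jukes-Cantor model rather than the model used in DNAML, and a search over a small grid of candidate rates rather than a proper numerical optimizer. The tree encoding and the function names are hypothetical.

```python
import math

BASES = "ACGT"

def jc69_prob(same, t):
    """Jukes-Cantor probability of ending in the same or a different base after time t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def partial_likelihoods(node, site, rate):
    """Pruning algorithm: likelihood of the data below node for each base at node.

    A leaf is a name (str); an internal node is a list of (child, branch_length).
    """
    if isinstance(node, str):
        return [1.0 if b == site[node] else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, length in node:
        below = partial_likelihoods(child, site, rate)
        for i in range(4):
            # Branch length is scaled by the rate of this site.
            result[i] *= sum(jc69_prob(i == j, rate * length) * below[j]
                             for j in range(4))
    return result

def site_log_likelihood(tree, site, rate):
    return math.log(sum(0.25 * x for x in partial_likelihoods(tree, site, rate)))

def ml_rate(tree, site, grid):
    """Rate in the grid that maximizes the likelihood of this one site."""
    return max(grid, key=lambda r: site_log_likelihood(tree, site, r))

if __name__ == "__main__":
    tree = [([("a", 0.1), ("b", 0.1)], 0.1), ("c", 0.2)]
    grid = [0.01, 0.1, 0.5, 1.0, 2.0, 10.0]
    for site in ({"a": "A", "b": "A", "c": "A"}, {"a": "A", "b": "C", "c": "G"}):
        print(site, ml_rate(tree, site, grid))
```

A site at which all sequences agree is assigned the smallest rate in the grid, which illustrates why per-site estimates based on a single site are noisy and usually need to be bounded or smoothed.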


Bette Korber of the Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico (btk  (at) t10.lanl.gov) and her colleagues have released RevDNArates, which is a version of Gary Olsen's program dnarates which uses the REV (general reversible) model of DNA evolution and calculates the maximum likelihood estimate of rate of change at each site (one parameter per site). They used it for the results in the paper: B. Korber, M. Muldoon, J. Theiler, F. Gao, R. Gupta, A. Lapedes, B. H. Hahn, S. Wolinsky and T. Bhattacharya. 2000. Timing the ancestor of the HIV-1 pandemic strains. Science 288: 1789-1796. The program is available as C source code for Unix from the web site for the programs from that paper at http://www.santafe.edu/~btk/science-paper/bette.html.


Sonja Meyer and Arndt von Haeseler, then of the Institut für Bioinformatik, Heinrich Heine Universität, Düsseldorf, Germany (von Haeseler is now at the Center for Integrative Bioinformatics Vienna, and his email address is arndt.von.haeseler (at) univie.ac.at) have released PARAT, version 0.9.1. This program infers a phylogeny and also site-specific evolutionary rates (one for each site). It can do so for up to 100 sequences directly. Above 100 sequences, it samples sets of sequences and estimates the rates from each such set, and then averages the resulting rates. It is distributed as open source C source code, which can readily be compiled and installed. PARAT is described in a paper: Meyer, S. and A. von Haeseler. 2003. Identifying site specific substitution rates. Molecular Biology and Evolution 20: 182-189. It is available at its web site at http://www.cibiv.at/software/parat/


Itay Mayrose of the Department of Cell Research and Immunology of the George S. Wise Faculty of Life Sciences, Tel Aviv University, Israel (itaymay  (at) post.tau.ac.il) has written Rate4Site version 2.01, a program to estimate rates of evolution at different sites in protein sequences. Rate4Site uses aligned protein sequences, constructs a tree by neighbor-joining or uses a user-defined input tree, and then infers the branch lengths and the rates of evolution at the sites. These are assumed to be drawn from a Gamma distribution and can be estimated either by maximizing the likelihood of the tree with respect to each of the rates, or by using Bayesian inference with the Gamma distribution as the prior (the parameters of the Gamma distribution are estimated empirically so that this is an Empirical Bayes method). The methods are described in the paper: Mayrose, I., D. Graur, N. Ben-Tal and T. Pupko. 2004. Comparison of site-specific rate-inference methods: Bayesian methods are superior. Molecular Biology and Evolution 21: 1781-1791. It is available as C++ source code and Windows executables. It can be downloaded from its web site at http://www.tau.ac.il/~itaymay/cp/rate4site.html


Itay Mayrose and Tal Pupko of the Department of Cell Research and Immunology of Tel Aviv University, Tel Aviv, Israel (itaymay (at) post.tau.ac.il) have produced McRate (Markov Chain monte carlo RATE estimation), version 1.0, a program to estimate rates of evolution at different sites. McRate calculates the relative evolutionary rate at each site using a probabilistic-based evolutionary model. This allows taking into account the stochastic process underlying sequence evolution within protein families. Most importantly, McRate uses Bayesian Markov chain Monte Carlo (MCMC) methodology to integrate over the space of all possible trees. Hence, McRate does not assume a pre-existing phylogenetic tree under which the sequences relate. McRate is described as superior to methods that rely on a single tree only. Its methods and the program are described in the papers:

It is available as C++ source code and Windows executables. It can be downloaded from its web site at http://www.tau.ac.il/~talp/MCMC/McRate.html


Jessica Leigh, Ed Susko, Manuela Baumgartner, and Andrew Roger of the Department of Biochemistry and Molecular Biology and the Department of Mathematics and Statistics of Dalhousie University, Halifax, Nova Scotia, Canada (jleigh (at) dal.ca) have written Concaterpillar version 1.2, a program that carries out a hierarchical likelihood ratio test for phylogenetic congruence. It tests two kinds of hypotheses in supermatrix analysis. The first is the null hypothesis (H0) that the phylogenies of markers in the supermatrix are congruent. If we cannot reject congruence for a set of markers, the second hypothesis to test is whether or not the markers to be combined have significantly different evolutionary dynamics (branch lengths and rates-across-sites parameters); that is, whether they should be concatenated or subjected to separate analysis. The methods are described in the paper: Leigh, J. W., E. Susko, M. Baumgartner, and A. J. Roger. 2008. Assessing congruence in phylogenomic data. Systematic Biology 57: 104-115. It is available as a Python script. It uses the program RAxML to infer trees, and the SciPy Python library as well. It can be downloaded from its web site at http://rogerlab.biochemistryandmolecularbiology.dal.ca/Software/Software.htm#Concaterpillar


Haichun Wang, Matthew Spencer, Ed Susko, and Andrew Roger of the Department of Mathematics and Statistics and of the Department of Biochemistry and Molecular Biology of Dalhousie University, Halifax, Nova Scotia, Canada (hcwang (at) mathstat.dal.ca) have produced PROCOV (PROtein COVarion analysis), version 1.3.2, a program for maximum likelihood estimation of phylogeny under protein covarion models. PROCOV computes the likelihood of a given tree under the rates-across-sites model or under the covarion-like model of Tuffley and Steel, the model of Huelsenbeck, and the model of Galtier, as well as for a general model that combines features of both the Huelsenbeck and Galtier models. PROCOV can also optimize tree topologies with subtree pruning-regrafting to search tree space. PROCOV is computationally very slow, so this is most useful for small trees. It is described in the paper: Wang, H-C, M. Spencer, E. Susko, and A. J. Roger. 2007. Testing for covarion-like evolution in protein sequences. Molecular Biology and Evolution 24: 294-305. It is available as C source code. The authors suggest using the BLAS matrix library when compiling it. It can be downloaded from its web site at http://www.mathstat.dal.ca/~hcwang/procov.html


  Nick Goldman (goldman  (at) ebi.ac.uk) of the European Bioinformatics Institute, Hinxton, UK and his group have produced EDIBLE, a program for Experimental Design and Information By Likelihood Exploration, version 1.00. It allows the user to read in a phylogeny, explore the effect on the likelihood and on the information matrix (the second derivatives of the likelihood with respect to the parameters) and measures of overall information of changing branch lengths in the tree and moving branch lengths around. It also can carry out simulations, producing multiple data sets on the tree in question. The program is described in two papers:

The program is available as C source code and as Windows and Digital Unix executables. It can be downloaded from its web site at http://www.ebi.ac.uk/goldman/info/edible.html at the EBI site.


  John Huelsenbeck (johnh  (at) berkeley.edu) of the Department of Integrative Biology of the University of California, Berkeley, and Fredrik Ronquist (Fredrik.Ronquist  (at) nrm.se) of the Naturhistoriska riksmuseet, Stockholm, Sweden have written MrBayes, version 3.1.2, a program for Bayesian inference of phylogenies from nucleic acid sequences, protein sequences, and morphological characters. It assumes a prior distribution of tree topologies and uses Markov Chain Monte Carlo (MCMC) methods to search tree space and infer the posterior distribution of topologies. It reads sequence data in the NEXUS file format, and outputs posterior distribution estimates of trees and parameters. It can also use a hierarchical Bayesian framework to infer sites that are under natural selection. It allows for rate variation among sites and a variety of models of sequence evolution. MrBayes is available as a Macintosh (PowerMac) executable, a Windows executable, or as source code in C. It allows for multiple-chain Metropolis-coupled Markov Chain Monte Carlo (MC3) runs for more extensive search, and can be asked to spread jobs over a cluster of computers using the MPI message-passing interface. (Incidentally, since Bayes was Reverend Bayes, shouldn't it be named RevBayes?). MrBayes executables, source code, and documentation are available from the MrBayes web page at http://mrbayes.net.
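The swap step that couples the chains in a Metropolis-coupled (MC3) run can be sketched in Python. This is not MrBayes code. It assumes that chain i samples from the posterior raised to a power ("heat") of 1/(1 + T i), with chain 0 being the cold chain, and that a proposed exchange of states between two chains is accepted with the usual Metropolis probability. The function names and the default value of T are illustrative assumptions.

```python
import math

def chain_heat(index, temperature=0.2):
    """Power to which chain number `index` raises the posterior (1.0 for the cold chain)."""
    return 1.0 / (1.0 + temperature * index)

def swap_acceptance(log_post_i, log_post_j, heat_i, heat_j):
    """Probability of accepting an exchange of the current states of chains i and j.

    log_post_i and log_post_j are the unheated log posterior densities of the
    states currently held by chains i and j.
    """
    log_ratio = (heat_i - heat_j) * (log_post_j - log_post_i)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)

if __name__ == "__main__":
    cold, hot = chain_heat(0), chain_heat(3)
    # The heated chain has found a better state, so the swap is always accepted.
    print(swap_acceptance(-1250.0, -1240.0, cold, hot))
    # The heated chain holds a worse state, so the swap is accepted only sometimes.
    print(swap_acceptance(-1240.0, -1250.0, cold, hot))
```

Only the cold chain's samples are kept; the heated chains move more freely across valleys in the posterior and hand good states to the cold chain through such swaps.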


Torsten Eriksson of the Bergius Botanical Garden, Stockholm, Sweden (torsten  (at) bergianska.se) makes available MrBayes tree scanners. These are two Perl scripts that scan the output parameter files produced by MrBayes. One saves the tree corresponding to the best sample. The other saves all trees that contain a specific node (a specific grouping). They are distributed together, and available from his software distribution site at http://www.bergianska.se/index_forskning_soft.html.


Marc Suchard of the Department of Biomathematics of the David Geffen School of Medicine at UCLA, Los Angeles (msuchard (at) ucla.edu) has written MrBayesPlugin, a Java plugin module enabling Geneious to run MrBayes. With it, Geneious v2.5.4 (or above) is enabled to perform and analyze simple Bayesian phylogenetic reconstruction using MrBayes. It is available as Java executables. It can be downloaded from its web site at http://www.biomath.ucla.edu/msuchard/software/software.htm


  Alexei Drummond, of the Department of Computer Science of the University of Auckland, New Zealand (alexei (at) cs.auckland.ac.nz) and Andrew Rambaut (a.rambaut (at) ed.ac.uk), of the Institute for Evolutionary Biology, University of Edinburgh, Scotland, and formerly of the Department of Zoology, University of Oxford, Oxford, U.K., have developed BEAST (Bayesian Evolutionary Analysis Sampling Trees), version 1.4.1. This is a general Bayesian inference program for parameters of evolutionary models when the trees are coalescent trees. A variety of nucleotide substitution models including relaxed molecular clocks are allowed, and population models that include exponential population growth and divergence time between populations are included. Most of the analyses use Bayesian sampling to infer parameters by averaging over the posterior on the trees. For the purposes of this listing, the two relevant features are the ability to output a sample of the trees, so that the program can be used for Bayesian tree inference in clocklike models, and the ability to infer the divergence time between populations. The general approach used by BEAST is described in the paper: Drummond, A. J., G. K. Nicholls, A. G. Rodrigo, and W. Solomon. 2002. Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 161: 1307-1320. BEAST is available as a Java executable which will run on any system with Java 1.4 or later. There are specific packages available for Mac OS X and for Windows as well as the general distribution. These are all distributed from its web site at http://beast.bio.ed.ac.uk/Main_Page


Alexei Drummond, of the Department of Computer Science of the University of Auckland, New Zealand (alexei (at) cs.auckland.ac.nz) and Andrew Rambaut (andrew.rambaut (at) zoo.ox.ac.uk), of the Department of Zoology, University of Oxford, Oxford, U.K., have released Tracer, version 1.2. This is a program for analyzing the results of Bayesian sampling runs using either BEAST or MrBayes. It allows analysis of the progress of sampling the parameters. For the purposes of this listing, the relevant feature is an ability to use the trees sampled by these programs to do a Bayesian skyline plot analysis of birth and death rates of lineages. Tracer is available as a Java executable from its web site at http://evolve.zoo.ox.ac.uk/software.html?id=tracer with specific packages for Mac OS X and Windows as well.


Pavel Morozov and Andrey Rzhetsky of the Department of Biomedical Informatics and the Columbia Genome Center of Columbia University, New York, New York (pm259 (at) columbia.edu and andrey.rzhetsky (at) dbmi.columbia.edu) have released PHYLLAB version 1.1, a toolbox for sequence manipulation and phylogenetic analysis in MatLab. PHYLLAB takes as input a set of aligned nucleotide or amino-acid sequences, and performs phylogeny inference. Besides traditional phylogenetic methods it uses a Markov chain Monte Carlo method, evaluating the posterior distribution over tree topologies and a variety of model parameters, including parameters of substitution-rate variation under a wavelet model. The graphical interface helps users to manage input data and to visualize the most likely trees; they can also view substitution-rate plots that show the maximum posterior density (confidence) intervals. It is written in the MatLab language, and interested users can extend it easily. The PHYLLAB toolbox is continually expanding, and the authors expect to offer many more functions and scripts for different purposes soon. It is available as a MatLab package. It can be downloaded from its web site at http://amdec-bioinfo.cu-genome.org/html/misc/Pavel/phyllab.html


Peter Foster (p.foster  (at) nhm.ac.uk) of the Natural History Museum, London, England has released p4 version 0.81, a Python package for maximum likelihood and Bayesian phylogenetic analyses of molecular sequences. This is not a program with menus and buttons; it is invoked using the Python language, which the user should know before attempting to use it. It needs Python 2.3 or better and the Gnu Scientific Library (GSL) installed on the machine. It is distributed as Python source code at its web site at http://www.nhm.ac.uk/zoology/external/p4.htm


Mike Charleston (mcharles  (at) it.usyd.edu.au) of the Sydney University Biological Informatics and Technology Centre, Sydney, Australia has developed Spectrum, a program for finding bipartition spectra from phylogenetic molecular and distance data, according to the method of Hendy et al. (1994) (Hadamard transforms) for moderately sized data sets (up to 18 taxa). The program also implements a branch-and-bound search for the "closest tree" - that is, the tree whose expected spectrum is closest to the spectrum derived from the observed data. Mac OS PowerMac, 68k Mac OS, and Windows executables are available from its Web site in the Glasgow Taxonomy web pages: http://taxonomy.zoology.gla.ac.uk/~mac/spectrum/spectrum.html.
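The limit on the number of taxa has a simple motivation: a bipartition spectrum over n taxa has 2^(n-1) entries, so 18 taxa already require 131,072 values. The following is a minimal Python sketch, not Spectrum's own code, of the fast Walsh-Hadamard transform on which such spectral methods rest. The function names are invented for illustration, and the full Hadamard conjugation of Hendy et al. involves further steps (logarithms and an inverse transform) that are not shown.

```python
def spectrum_length(n_taxa):
    """Number of entries in a bipartition spectrum for n_taxa taxa."""
    return 2 ** (n_taxa - 1)


def hadamard_transform(values):
    """Return H * values for the Sylvester-type Hadamard matrix H.

    The transform is unnormalized, so applying it twice multiplies
    the original vector by its length.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:
        # Combine blocks of size `half` pairwise (butterfly step).
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b
        half *= 2
    return out
```

The butterfly form needs on the order of N log N additions for a vector of length N, rather than the N^2 operations of a direct matrix product, which matters when N doubles with each added taxon.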


Rasmus Nielsen, of the Bioinformatics Centre at the University of Copenhagen, Denmark (rasmus (at) binf.ku.dk) has written MDIV, a program that will simultaneously estimate the divergence time and the migration rates between two populations. It can use either an infinite-sites model or an HKY sequence evolution model. It can test whether the evidence supports historical divergence, migration between the populations, or both, and make maximum likelihood estimates and likelihood surfaces for the parameters. It assumes equal population sizes in the two populations and in their ancestors. It is described in a paper: Nielsen, R. and J. W. Wakeley. 2001. Distinguishing migration from isolation: an MCMC approach. Genetics 158: 885-896. It is distributed as a Windows executable from Nielsen's programs web site at http://www.binf.ku.dk/users/rasmus/webpage/programs.html#MDIV


Ingrid Jakobsen, Susan Wilson, and Simon Easteal, of Australian National University, Canberra, released partimatrix. (Ingrid Jakobsen is currently at the Advanced Computational Modelling Centre, University of Queensland, Australia, ibj  (at) maths.uq.edu.au). This program computes a "partition matrix" from aligned DNA sequence data. The method finds partitions of the sequences into two groups and presents a matrix which describes the conflict and agreement among these partitions. The objective is to discover parts of the DNA sequence which imply different trees. It is described in the paper by I. B. Jakobsen, S. R. Wilson and S. Easteal. 1997. The Partition Matrix: Exploring variable phylogenetic signals along nucleotide sequence alignments. Molecular Biology and Evolution 14: 474-484. The program is distributed as C source code for Unix systems with X Windows. It is available from Ingrid Jakobsen's software web site at http://acmc.uq.edu.au/DETYA/people/ibj/Retic/.
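To make the idea of conflict among partitions concrete, here is a small Python sketch. It is not the partimatrix algorithm itself, and its function names and data layout are assumptions made for illustration. It takes each two-state site as a partition of the sequences into two groups, and calls two partitions compatible (able to fit on one tree) when at least one of the four intersections between their groups is empty.

```python
def site_bipartition(alignment, site):
    """Split induced by one site, or None if the site is not two-state.

    alignment: list of equal-length strings, one per sequence.
    Returns the set of sequence indices sharing the state of sequence 0.
    """
    column = [seq[site] for seq in alignment]
    if len(set(column)) != 2:
        return None
    return frozenset(i for i, c in enumerate(column) if c == column[0])


def compatible(split_a, split_b, n_seqs):
    """True if the two bipartitions can both appear on a single tree."""
    everyone = frozenset(range(n_seqs))
    for a in (split_a, everyone - split_a):
        for b in (split_b, everyone - split_b):
            if not (a & b):  # an empty intersection means no conflict
                return True
    return False


def conflict_matrix(alignment):
    """Return (sites, matrix); matrix[i][j] is True when sites i and j conflict."""
    n_seqs = len(alignment)
    splits = []
    for site in range(len(alignment[0])):
        split = site_bipartition(alignment, site)
        if split is not None:
            splits.append((site, split))
    sites = [site for site, _ in splits]
    matrix = [[not compatible(a, b, n_seqs) for _, b in splits]
              for _, a in splits]
    return sites, matrix
```

Runs of mutually compatible sites separated by blocks of conflict are the kind of pattern that would suggest different regions of the alignment support different trees.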


Yasuo Ina of the National Institute of Agrobiological Resources, Tsukuba, Japan developed ODEN, a package of programs for doing distance matrix analyses on nucleotide or protein sequences. It is described in a paper: Ina, Y. 1994. ODEN: a program package for molecular evolutionary analysis and database search of DNA and amino acid sequences. Computer Applications in the Biosciences (CABIOS) 10: 11-12. It is available free by anonymous ftp from directory pub/unix/oden on ftp.dna.affrc.go.jp as C source code for Unix systems.


Angela Lüttke and Rainer Fuchs (then of the European Molecular Biology Laboratory; Fuchs is currently at Biogen, Inc., Cambridge, Massachusetts) wrote MacT, a package of programs for Mac OS Macintoshes that compute distances and compute Neighbor-Joining phylogenies for them. The programs work on 4 through 26 sequences, and source code in Microsoft QuickBasic is provided as well as compiled executables. The package is free and is available on the molecular biology software servers. For example, it is available by anonymous ftp on the Indiana University IUBIO server ftp.bio.indiana.edu, where it will be found in directory soft/molbio/mac. The programs are described in a paper: Luttke, A. and R. Fuchs. 1992. MacT: Apple Macintosh programs for constructing phylogenetic trees. Computer Applications in the Biosciences 8: 591-594.


Nicholas Galtier of the University of Lyon (galtier  (at) biomserv.univ-lyon1.fr) has written Phylo_win, a "graphic interface" for molecular phylogenetic inference. It performs neighbor-joining, parsimony and maximum likelihood methods and can bootstrap with any of them. Many distances can be used including Jukes and Cantor, Kimura, Tajima and Nei, Galtier and Gouy (1995), LogDet for nucleotide sequences, Poisson correction for protein sequences, Ka and Ks for codon sequences. Species and sites to include in the analysis are selected by mouse. Reconstructed trees can be drawn, edited, printed, stored, and evaluated according to numerous criteria. Taxonomic species groups and sets of conserved regions can be defined by mouse in both tools and stored into sequence files, thus avoiding multiple data files. It is entirely mouse-driven. Most usual sequence file formats are read: CLUSTAL, FASTA, PHYLIP, MASE. It runs under X windows on many Unix workstations. It is described in the paper: Galtier, N., M. Gouy, and C. Gautier. 1996. SeaView and Phylo_win, two graphic tools for sequence alignment and molecular phylogeny. Computer Applications in the Biosciences 12: 543-548. It is distributed as C source code (to compile it one needs the NCBI Vibrant tool kit). It is also available as executables for SunOS, Solaris, SGI Unix, IBM RISC Unix, Linux, HP/UX, and DEC Alpha (Digital Unix). It can be fetched from its web page at http://pbil.univ-lyon1.fr/software/phylowin.html. It can also be obtained by anonymous ftp from biom3.univ-lyon1.fr in directory pub/mol_phylogeny. A Digital OpenVMS executable is also available at http://www.tmk.com/ftp/vms-freeware/mathog/.


F. James Rohlf has written NTSYSpc (Numerical Taxonomy System, Version 2.1), a clustering program that includes calculation of various kinds of distance measures, hierarchical clustering methods such as UPGMA, and Neighbor-Joining and consensus trees. It can also do a variety of other things including ordination, scatter diagrams, and elliptic Fourier transforms (for shape analysis). NTSYSpc 2.1 is a Windows95 executable which will also run on Windows NT. It is available for $300 ($230 for educational and government institutions). 10-user site licenses are also available. It is distributed by Exeter Software (the biological software company, not the warehouse-inventory-software house of the same name). Their e-mail address is sales  (at) exetersoftware.com. Their toll-free telephone number is 800-842-5892, their not-so-free phone number is +1-631-689-7838, and their fax number is +1-631-689-0103. Their mailing address is 47 Route 25A, Suite 2, Setauket, NY 11733-2870 USA. Further information is available on their Web page at http://www.exetersoftware.com/cat/ntsyspc/ntsyspc.html.


Rino Zandee (zandee  (at) rulsfb.leidenuniv.nl), of the Institute of Evolutionary and Ecological Science, Van der Klaauw Laboratory, Leiden University, has written CAFCA version 1.5.12, the Collection of APL Functions for Comparative Analysis. It carries out a search for the most parsimonious tree with discrete-character data (either two-state or multistate), using a search for cliques of component compatibility (monothetic subsets) to propose the candidates for most parsimonious trees. The program is written as functions in the APL language, but Macintosh Mac OS executables are distributed. The program is free and is available from the CAFCA Web Site http://biology.leidenuniv.nl/ibl/staff/zandee/cafca/index.html.


Korbinian Strimmer (korbinian.strimmer  (at) lmu.de) now at the Department of Statistics of the University of Munich, Germany, and Arndt von Haeseler, now at the Center for Integrative Bioinformatics Vienna (arndt.von.haeseler (at) univie.ac.at) have developed TREE-PUZZLE version 5.2, (formerly called PUZZLE) a program for maximum likelihood analysis for nucleotide and amino acid alignments. They have been joined more recently by Heiko Schmidt of the John-von-Neumann Institute for Computing, Forschungszentrum Jülich (hschmidt  (at) cs.uni-duesseldorf.de). TREE-PUZZLE infers phylogenies by "quartet puzzling", a method that applies maximum likelihood tree reconstruction to all possible quartets of taxa and subsequently tries to combine most of the four-taxa maximum likelihood trees to construct an overall maximum likelihood tree. Usually there are several possible solutions. A consensus tree generated from the quartet puzzling trees shows nodes that are well supported. More details about the algorithm and on the phylogenetic accuracy can be found in the papers:

TREE-PUZZLE supports all popular models of sequence evolution of nucleotides and proteins, and can take rate heterogeneity among sites into account. It computes pairwise maximum likelihood distances for many different models of sequence evolution (TN, HKY, F84, SH, Dayhoff, JTT, mtREV24, BLOSUM62, WAG, and VT), and estimates parameters of the models. It can estimate maximum-likelihood branch-lengths for user-specified trees and perform likelihood ratio tests of clockness as well as Kishino-Hasegawa-Templeton tests. The program is written in ANSI C and is compatible with PHYLIP files. The current version also has features for parallel computation using the MPI message-passing interface, if this is available. Precompiled executables are distributed for Mac OS, Windows, and Linux. For Unix and VMS systems files for automated compilation are provided. A version capable of parallel execution is also available. TREE-PUZZLE is available from the TREE-PUZZLE web page at http://www.tree-puzzle.de. A number of sites that mirror the distribution, or that carry older versions, are listed there. Its online manual can be downloaded at http://www.tree-puzzle.de/manual.html.


Mike Holder (holder  (at) uh.edu) of the High Performance Computing Center of the University of Houston and Andrew Roger (aroger  (at) is.dal.ca) of the Department of Biochemistry and Molecular Biology of Dalhousie University, Halifax, Canada are distributing a shell script program for Unix systems, puzzleboot, version 1.03, that allows the analysis of multiple bootstrapped data sets with TREE-PUZZLE. It is designed for use with the distance matrix option of TREE-PUZZLE, to make use of the distance calculation methods. It is available from the Roger lab software page at http://rogerlab.biochemistryandmolecularbiology.dal.ca/Software/Software.htm#puzzleboot


Johan Nylander (Johan.Nylander  (at) abc.se) has written MCS version 1.0, a program that reads the output of bootstrap or jackknife analyses in PAUP* and computes the Mean Character Support statistic from them. It is available as a Windows or Mac OS X executable or as source code from Nylander's software download site at http://www.ebc.uu.se/systzoo/staff/nylander.html in Sweden


Daniel Huson (huson  (at) informatik.uni-tuebingen.de) of the ZBIT Center for Bioinformatics at the University of Tübingen, Germany and David Bryant (Bryant (at) math.auckland.ac.nz) of the University of Auckland, New Zealand, distribute a program SplitsTree for analysis of conflicts among splits implied by different quartets or different characters. It provides a number of methods for computing split networks from sequences (e.g. median networks), distances (e.g. split decomposition or neighbor-net) and trees (consensus networks and super-networks). Additionally, it contains simple combinatorial methods for computing hybridization networks and recombination networks. It can process sequence or restriction site data, and can do bootstrapping. It is discussed in the papers:

A number of versions of SplitsTree are available at the SplitsTree web site at http://www.splitstree.org. A server is also maintained which uses SplitsTree 3.2 to analyze data submitted via its web page.


Igor Kuznetsov and Pavel Morozov, then of the Institute of Cytology and Genetics, Novosibirsk, Russia (Morozov is currently at the Columbia Genome Center in New York, pavel  (at) genome2.cpmc.columbia.edu) produced GEOMETRY, a package for nucleotide sequence analysis using the method of statistical geometry in sequence space (M. Eigen, R. Winkler-Oswatitsch, and A. Dress. 1988. Statistical geometry in sequence space: A method of quantitative comparative sequence analysis, Proc. Natl. Acad. Sci. USA 85: 5913-5917). The program is described in the article: Kuznetsov, I. and P. Morozov. 1996. GEOMETRY: a software package for nucleotide sequence analysis using statistical geometry in sequence space. Computer Applications in the Biosciences (CABIOS) 12: 297-301. The package uses the same data formats for sequence and tree input as the ones used in the VOSTORG package. GEOMETRY is available as a DOS executable. It is available for downloading by ftp from the EMBL file server ftp.ebi.ac.uk in directory pub/software/dos as file geom.zip.


Vincent Berry of the LIRMM, Université de Montpellier, France (vberry  (at) lirmm.fr) has released PhyloQuart version 1.3, a package of programs inferring phylogenies from quartets. It is able to use either nucleotide sequences or distances. It implements the Q* method of tree reconstruction, which is inspired by the work of Bandelt and Dress, and is described in the paper: Berry, V. and O. Gascuel. 2000. Inferring evolutionary trees with strong combinatorial evidence. Theoretical Computer Science 240: 271-298. PhyloQuart is available as C source code which can be compiled on Unix systems, from its web site at http://www.lirmm.fr/~vberry/PHYLOQUART/phyloquart.html. PhyloQuart is also available as a Web server from the server of the Institut Pasteur.


Le Sy Vinh (vinh (at) cs.uni-duesseldorf.de) and Arndt von Haeseler, now of the Center for Integrative Bioinformatics Vienna (arndt.von.haeseler (at) univie.ac.at) have released IQPNNI versions 2.6 and 3.0β1, Important Quartet Puzzling and NNI Operation. This program uses selected quartets of species, called Important Quartets, to build a preliminary tree, rearranges it using the maximum likelihood criterion by nearest-neighbor interchanges, and then uses further examination of quartets to remove and reposition some of the species. It is described in a paper: Vinh, L. S. and A. von Haeseler. 2004. IQPNNI: Moving fast through tree space and stopping in time. Molecular Biology and Evolution 21: 1565-1571. It is available as binary executables (including a version that works with MPI parallel execution) and source code from its web site at http://www.bi.uni-duesseldorf.de/software/iqpnni/


Stephen J. Willson (swillson  (at) iastate.edu) of the Department of Mathematics, Iowa State University, has produced a package of programs to infer phylogenies from quartets of species. They infer phylogenies of individual quartets by parsimony, and in combining them use information on how strongly the phylogeny for that quartet is preferred over its alternatives, or by measures of how well the group fits into a given placement on a tree, as judged by quartets. The methods are described in two papers:

The programs are in C and are described as having successfully been compiled on Mac OS systems using the Codewarrior C compiler. Mac OS executables are also provided. The programs are available at Willson's software web site at http://www.public.iastate.edu/~swillson/software.html.


James Lake of the Department of Molecular, Cell and Developmental Biology of the University of California, Los Angeles (lake  (at) mbi.ucla.edu) has released Gambit, which implements a method called Bootstrapper's Gambit. The method involves bootstrap sampling sequences, computing trees for quartets of species, and assembling larger trees out of quartets that have significant bootstrap support. One of the methods available to estimate trees from the quartets is paralinear (LogDet) distances. Other distance methods and parsimony are also available. The Bootstrapper's Gambit method is described in a paper: Lake, J. A. 1995. Calculating the probability of multitaxon evolutionary trees: Bootstrappers gambit. Proceedings of the National Academy of Sciences, USA 92: 9662-9666. The program is available as a DOS executable, free as a beta release to noncommercial users on a trial basis until January 1, 2003. (It is unclear from the web site whether a free version is to be available to noncommercial users after that point -- a previous deadline was extended). Commercial users are asked to pay $50 on a shareware basis. The program is available at its web site at http://genomics.ucla.edu/gambit/.
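The first two steps of such a procedure, resampling sites and enumerating quartets, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Gambit's code: the function names are hypothetical, the alignment is assumed to be a list of equal-length strings, and the tree-building and support-testing steps are omitted.

```python
import random
from itertools import combinations


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the length."""
    n_sites = len(alignment[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    # The same column picks are applied to every sequence, so each
    # resampled column is an intact column of the original alignment.
    return ["".join(seq[i] for i in picks) for seq in alignment]


def all_quartets(taxa):
    """Every subset of four taxa, as tuples in input order."""
    return list(combinations(taxa, 4))
```

For example, `bootstrap_alignment(["ACGT", "ACGA"], random.Random(1))` returns two strings of length 4 built from columns of the input, and six taxa give 15 quartets. The number of quartets grows as n(n-1)(n-2)(n-3)/24, which is why quartet-based methods pay attention to how many quartets they evaluate.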


Arne Röhl, Peter Forster, and Hans-Jürgen Bandelt (Forster is at The McDonald Institute for Archaeological Research, University of Cambridge, U.K., e-mail address pf223  (at) cam.ac.uk, and Bandelt is at the Mathematisches Seminar, University of Hamburg, Bundesstrasse 55, 20146 Hamburg, Germany, e-mail address bandelt  (at) math.uni-hamburg.de) have written Network version 4.109, a program to infer networks (which have more connections than trees) from non-recombining DNA, STR, amino acid, and RFLP data. The networks are either reduced median networks or median-joining networks, methods which are described in the papers:

The program is available for free as a Windows executable (it expires after a time, but a new free version is intended to be available by then), or an older DOS executable (version 2.10b) from Fluxus Engineering at its web site at http://www.fluxus-engineering.com/sharenet.htm.


  Mike Hendy, Katharina T. Huber, Michael Langton, Vincent Moulton, and David Penny have written Spectronet version 1.27, a program that computes a collection of weighted splits or partitions and allows the user to interactively analyze the results with a series of tools. Hendy and Penny are at Massey University, New Zealand (m.hendy (at) massey.ac.nz and d.penny  (at) massey.ac.nz), Huber and Moulton are at the School of Computational Science of the University of East Anglia, U.K. (Katharina.Huber  (at) cmp.uea.ac.uk and Vincent.Moulton (at) cmp.uea.ac.uk). Spectronet can read molecular sequence or discrete character data, compute splits by Hadamard conjugation or directly, compute and display compatibility matrices of characters, make reduced median networks, and plot networks by making a Lentoplot. Spectronet is described in a paper: Huber, K. T., M. Langton, D. Penny, V. Moulton and M. Hendy. 2002. Spectronet: A package for computing spectra and median networks. Applied Bioinformatics 1: 159-161. It is available as a Windows executable from its web site at http://awcmee.massey.ac.nz/spectronet/index.html.


Steven Kelk, Leo van Iersel, Judith Keijsper, and Leen Stougie of the Centrum voor Wiskunde en Informatica (CWI) and Technische Universiteit Eindhoven (TU/e), Netherlands (S.M.Kelk (at) cwi.nl) have produced LEVEL2 version 0.91, which constructs level-2 phylogenetic networks from dense sets of rooted triplets. This program takes as input a dense set of rooted triplets and attempts to construct a level-2 phylogenetic network from them (or level-1, or level-0, if level-2 is not necessary). Triplets are the rooted analogue of quartets, and a dense set of triplets is one where for every subset of three taxa there is at least one triplet. A level-k phylogenetic network is a rooted phylogenetic network where every biconnected component in the underlying, undirected graph contains at most k recombination vertices. The program produces an image of the resulting network, if it is found. It is described in the paper: van Iersel, L., J. Keijsper, S. Kelk, and L. Stougie. 2007. Constructing level-2 phylogenetic networks from triplets. arXiv:0707.2890v1 [q-bio.PE]. It is available as Java source code, and also requires that the DOT graph description package be installed. It can be downloaded from its web site at http://homepages.cwi.nl/~kelk/level2triplets.html
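The density condition on the input is easy to check before running such a program. The following Python sketch is an illustration only; the representation of a rooted triplet ab|c as the pair `(("a", "b"), "c")` and the function name are assumptions, not the input format of LEVEL2.

```python
from itertools import combinations


def is_dense(taxa, triplets):
    """True if every three-taxon subset is covered by at least one triplet.

    triplets: iterable of ((x, y), z), meaning x and y are closer to each
    other than either is to z.
    """
    covered = {frozenset(pair) | {outgroup} for pair, outgroup in triplets}
    return all(frozenset(trio) in covered for trio in combinations(taxa, 3))
```

With four taxa there are four three-taxon subsets, so the set ab|c, ab|d, cd|a, cd|b is dense, while dropping any one of those triplets leaves a subset uncovered.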


Luay Nakhleh, Derek Ruths, and Cuong Than of the Department of Computer Science of Rice University, Houston, Texas (nakhleh (at) cs.rice.edu) have released PhyloNet (Phylogenetic Network Analysis), version 1.5, a phylogeny package with tools for reconstructing and analyzing phylogenetic networks. It has programs for inferring horizontal gene transfer events, by estimating the SPR distance between two trees (along with a bootstrap-based measure of support), and interspecific recombination, by using maximum parsimony. It also has tools for enumerating the trees and clusters of taxa within a given network, comparing the topologies of networks, estimating the strain tree of bacterial genomes from multi-locus data, and enumerating valid coalescent histories of a gene tree within the branches of a species tree. It is described in the paper: Than, C., D. Ruths, and L. Nakhleh, 2008. PhyloNet: A Software Package for Analyzing and Reconstructing Reticulate Evolutionary Relationships. Under Review. It is available as Java executables and Mac OS X universal executables. It can be downloaded from its web site at http://bioinfo.cs.rice.edu/phylonet/index.html


Guohua Jin and Luay Nakhleh of the Department of Computer Science of Rice University, Houston, Texas (jin (at) cs.rice.edu and nakhleh (at) cs.rice.edu) have produced NEPAL (NEtwork Parsimony And Likelihood), version 1.0, a suite of tools for reconstructing and analyzing reticulate (non-treelike) evolutionary relationships using the maximum parsimony and maximum likelihood criteria. It is used to identify horizontal gene (or partial gene) transfers between species. NEPAL reads in a species tree in Newick format or a network from NEPAL or RIATA-HGT output, and sequence data. It returns the maximum parsimony or maximum likelihood score of the input or generated trees or networks. The user can control the number of additional edges added to the input tree. The methods are described in the papers:

It is available as Linux executables. It can be downloaded from its web site at http://bioinfo.cs.rice.edu/nepal/index.html


Rasmus Nielsen, of the Centre for Bioinformatics of the University of Copenhagen, Denmark (rasmus  (at) binf.ku.dk) has released IM, a program that estimates divergence time between two populations along with the population sizes before and after divergence, as well as the migration rate between the two populations after divergence. The program uses Markov chain Monte Carlo (MCMC) coalescent methods. It is described in a paper: Hey, J., and R. Nielsen. 2004. Multilocus methods for estimating population sizes, migration rates and divergence time, with applications to the divergence of Drosophila pseudoobscura and D. persimilis. Genetics 167: 747-760. It allows Bayesian inference from a number of loci, each assumed to be without intra-locus recombination. It can use a DNA mutation model, a stepwise microsatellite mutation model, or an infinite-sites model. The program estimates the three population sizes, the time of divergence, and the two migration rates, each relative to the mutation rate. It can also infer an asymmetric division of the ancestral population at the time of speciation, with subsequent linear growth of each population to its current size. IM is distributed as a Windows executable with generic C source code that will also work on Unix. It is available from its web page at the Hey lab web site, http://lifesci.rutgers.edu/~heylab/HeylabSoftware.htm#IM


Liang Liu of the Department of Organismic and Evolutionary Biology of Harvard University, Cambridge, Massachusetts (lliu (at) oeb.harvard.edu) has released BEST (Bayesian Estimation of Species Trees), version 1.6, a program used in conjunction with MrBayes to estimate the posterior distribution of species trees from samples of multilocus sequences within species. It is intended to implement the Bayesian hierarchical model proposed by Liang Liu and Dennis Pearl and further developed in collaboration with Scott Edwards. This involves two consecutive Markov Chain Monte Carlo (MCMC) procedures, the first one performed in a revised version of MrBayes in which a new function is added to approximate the joint probability of gene trees from one species tree. The output gene trees (multilocus) then form the input file of the second MCMC program BEST, which uses importance sampling to infer the species tree. It is described in the papers:

It is available as C source code, Windows executables and Mac OS X universal executables. It can be downloaded from its web site at http://www.stat.osu.edu/~dkp/BEST


Marty J. Wolf of Bemidji State University, Minnesota (mjwolf  (at) cs.bemidjistate.edu) has written, and he and Lars Sommer Jermiin distribute, TrExMl, which searches tree space for DNA sequence data to find not only the maximum likelihood tree but also trees of other topologies which are nearly as good. TrExMl can also carry out bootstrapping of the sequences before doing the analysis. It is described in a paper: Wolf, M. J., S. Easteal, M. Kahn, B. D. McKay, and L. S. Jermiin. 2000. TrExML: A maximum likelihood program for extensive tree-space exploration. Bioinformatics 16: 383-394. TrExMl is described in its web page at http://whitetail.bemidji.msus.edu/trexml/trexml.man.html. It is distributed from there as C source code. One use will be along with Lars Sommer Jermiin's program TreeCons, which computes a weighted average of trees according to their likelihood values.


Andrew Roger, of the Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada (aroger (at) is.dal.ca) has written ELW (Expected Likelihood Weights), two PERL scripts -- elw.pl and calcwts.pl -- that, together with PAUP* and the PHYLIP program Seqboot can be used to implement the "expected likelihood weights" method of Strimmer and Rambaut, described in the paper by Strimmer, K. and A. Rambaut. 2002. Inferring confidence sets of possibly misspecified gene trees. Proceedings of the Royal Society of London Series B 269: 137-142. It calculates a confidence interval for the maximum likelihood tree using the variation of the likelihoods among bootstrap estimates of the tree. ELW can be downloaded from its entry on Roger's software web page at http://rogerlab.biochemistryandmolecularbiology.dal.ca/Software/Software.htm#elw
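As the method is usually described, the weight of each candidate tree is its likelihood divided by the summed likelihoods of all candidate trees, averaged over bootstrap replicates; trees are then added to the confidence set in order of decreasing weight until the weights sum to the chosen level. The Python sketch below illustrates that calculation under those assumptions. It is not the code of elw.pl or calcwts.pl, and the function names and input layout (one row of log-likelihoods per replicate) are hypothetical.

```python
import math


def expected_likelihood_weights(log_likelihoods):
    """Average normalized likelihood of each tree over bootstrap replicates.

    log_likelihoods[r][t] is the log-likelihood of tree t on replicate r.
    """
    n_trees = len(log_likelihoods[0])
    totals = [0.0] * n_trees
    for row in log_likelihoods:
        top = max(row)  # subtract the maximum to avoid underflow in exp()
        rel = [math.exp(x - top) for x in row]
        norm = sum(rel)
        for t in range(n_trees):
            totals[t] += rel[t] / norm
    return [x / len(log_likelihoods) for x in totals]


def confidence_set(weights, level=0.95):
    """Indices of trees, heaviest first, until cumulative weight >= level."""
    order = sorted(range(len(weights)), key=lambda t: weights[t], reverse=True)
    chosen, cumulative = [], 0.0
    for t in order:
        chosen.append(t)
        cumulative += weights[t]
        if cumulative >= level:
            break
    return chosen
```

Working with differences from the largest log-likelihood in each replicate matters in practice, since raw likelihoods of realistic alignments are far too small to represent directly as floating-point numbers.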


James McInerney of the Department of Biology of the National University of Ireland, Maynooth, County Kildare, Ireland (james.o.mcinerney  (at) may.ie) and also of the Department of Zoology of the Natural History Museum, London, U.K. (j.mcinerney  (at) nhm.ac.uk) has written PHYCON, a program which takes as input bootstrapped molecular data sets, as produced by PHYLIP, and feeds them to MOLPHY programs. This allows bootstrapping within MOLPHY. The program is available as C source code for Unix; it will not work on a Windows system or under Mac OS (though it can under Mac OS X). Source code and documentation are available from its web site at http://www.bioinf.org/vibe/software/phycon/phycon.html.


Naoko Takezaki (ntakezak  (at) lab.nig.ac.jp) of the Center for Information Biology of the National Institute of Genetics, Mishima, Japan, has written Lintre (Phylogenetic tests of the molecular clock and linearized tree), a package of programs for Sun workstations. The programs include:

The two-cluster test is essentially the relative rate test for many sequences. The branch length test tests whether the rate for each sequence, measured from the root of the tree, differs from the average rate of all sequences. The tests are described in: Takezaki, N., A. Rzhetsky, and M. Nei. 1995. Phylogenetic test of the molecular clock and linearized trees. Molecular Biology and Evolution 12: 823-833. The programs are available as C source code and also as DOS executables. They are distributed (as a compressed tar archive of the source code with examples and documentation, and also as a self-extracting archive of sources and DOS executable) from its ftp site at ftp://ftp.nig.ac.jp/pub/Bio/lintre/, and also from the IUBio archive at http://iubio.bio.indiana.edu/soft/molbio/evolve/lintr/. They are also available at the Nei lab software web site at http://www.bio.psu.edu/People/Faculty/Nei/Lab/software.htm.


Thomas Wilcox, formerly of the Center for Computational Biology and Informatics of the University of Texas, and more recently of Long Key Tropical Research Center, Florida (tpwilcox (at) lktrc.org) has produced Cadence version 1.0.1, a program for Bayesian relative rate tests. It is described in the paper: Wilcox, T. P., F. J. García de Leon, D. A. Hendrickson, and D. M. Hillis. 2004. Convergence among cave catfishes: Long-branch attraction and a Bayesian relative rates test. Molecular Phylogenetics and Evolution 31: 1101-1113. It is available as Powermac Mac OS X executables. It can be downloaded from its web site at the University of Texas at http://www.zo.utexas.edu/faculty/antisense/Download.html and from its web site at mac.com at http://homepage.mac.com/tpwilcox/FileSharing15.html


Naoko Takezaki (ntakezak  (at) lab.nig.ac.jp) of the Center for Information Biology of the National Institute of Genetics, Mishima, Japan, has written POPTREE which constructs a neighbor-joining tree or a UPGMA tree from microsatellite data and other allele frequency data. Bootstrapping can be carried out. The program includes Goldstein et al.'s distance for microsatellite loci. There are a source code Unix version and an executable DOS version (which is called poptrfdos). They are available by ftp from the IU