
Katharina Huber,
Leo van Iersel,
Vincent Moulton,
Celine Scornavacca and
Taoyang Wu. Reconstructing phylogenetic level-1 networks from nondense binet and trinet sets. In ALG, Vol. 77(1):173-200, 2017. Keywords: explicit network, FPT, from binets, from trinets, NP complete, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/1411.6804.





Leo van Iersel,
Steven Kelk,
Giorgios Stamoulis,
Leen Stougie and
Olivier Boes. On unrooted and root-uncertain variants of several well-known phylogenetic network problems. In ALG, 2017. Keywords: explicit network, FPT, from network, from unrooted trees, NP complete, phylogenetic network, phylogeny, reconstruction, tree containment. Note: https://hal.inria.fr/hal01599716, to appear.







Sha Zhu and
James H. Degnan. Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent. In SB, Vol. 66(2):283-298, 2017. Keywords: branch length, coalescent, explicit network, from network, likelihood, phylogenetic network, phylogeny, Program Hybridcoal, Program HybridLambda, Program PhyloNet, software, uniqueness. Note: presentation available at https://www.youtube.com/watch?v=JLYGTfEZG7g.





Misagh Kordi and
Mukul S. Bansal. On the Complexity of Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees. In TCBB, Vol. 14(3):587-599, 2017. Keywords: duplication, from rooted trees, from species tree, lateral gene transfer, loss, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://compbio.engr.uconn.edu/papers/Kordi_DTLreconciliationPreprint2015.pdf.



Andreas Gunawan,
Bhaskar DasGupta and
Louxin Zhang. A decomposition theorem and two algorithms for reticulation-visible networks. In Information and Computation, Vol. 252:161-175, 2017. Keywords: cluster containment, explicit network, from clusters, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, reticulation-visible network, tree containment. Note: https://www.cs.uic.edu/~dasgupta/resume/publ/papers/Infor_Comput_IC4848_final.pdf.







Magnus Bordewich,
Charles Semple and
Nihan Tokac. Constructing tree-child networks from distance matrices. In Algorithmica, 2017. Keywords: compressed network, explicit network, from distances, phylogenetic network, phylogeny, polynomial, reconstruction, tree child network, uniqueness. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSN17.pdf, to appear.



Celine Scornavacca,
Joan Carles Pons and
Gabriel Cardona. Fast algorithm for the reconciliation of gene trees and LGT networks. In JTB, Vol. 418:129-137, 2017. Keywords: duplication, explicit network, from network, from rooted trees, lateral gene transfer, LGT network, loss, parsimony, phylogenetic network, phylogeny, polynomial, reconstruction.



Leo van Iersel,
Vincent Moulton,
Eveline De Swart and
Taoyang Wu. Binets: fundamental building blocks for phylogenetic networks. In BMB, Vol. 79(5):1135-1154, 2017. Keywords: approximation, explicit network, from binets, galled tree, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1007/s1153801702754.





Philippe Gambette,
Andreas Gunawan,
Anthony Labarre,
Stéphane Vialette and
Louxin Zhang. Solving the Tree Containment Problem in Linear Time for Nearly Stable Phylogenetic Networks. In DAM, 2017. Keywords: explicit network, from network, from rooted trees, nearly-stable network, phylogenetic network, phylogeny, polynomial, tree containment. Note: https://halupecupem.archivesouvertes.fr/hal01575001/en/, to appear.



Philippe Gambette,
Leo van Iersel,
Mark Jones,
Manuel Lafond,
Fabio Pardi and
Celine Scornavacca. Rearrangement Moves on Rooted Phylogenetic Networks. In PLoS Computational Biology, Vol. 13(8):e1005611.1-21, 2017. Keywords: distance between networks, explicit network, from network, NNI distance, phylogenetic network, phylogeny, SPR distance. Note: https://halupecupem.archivesouvertes.fr/hal01572624/en/.





Sarah Bastkowski,
Daniel Mapleson,
Andreas Spillner,
Taoyang Wu,
Monika Balvociute and
Vincent Moulton. SPECTRE: a Suite of PhylogEnetiC Tools for Reticulate Evolution. 2017. Keywords: abstract network, NeighborNet, phylogenetic network, phylogeny, Program FlatNJ, Program QNet, Program SplitsTree, reconstruction, software, split network. Note: https://doi.org/10.1101/169177.



Leo van Iersel,
Steven Kelk,
Nela Lekic,
Chris Whidden and
Norbert Zeh. Hybridization Number on Three Rooted Binary Trees is EPT. In SIDMA, Vol. 30(3):1607-1631, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1402.2136.



Katharina Huber,
Vincent Moulton,
Mike Steel and
Taoyang Wu. Folding and unfolding phylogenetic trees and networks. In JOMB, Vol. 73(6):1761-1780, 2016. Keywords: compressed network, explicit network, FU-stable network, NP complete, phylogenetic network, phylogeny, tree containment, tree sibling network. Note: http://arxiv.org/abs/1506.04438.





Steven Kelk,
Leo van Iersel,
Celine Scornavacca and
Mathias Weller. Phylogenetic incongruence through the lens of Monadic Second Order logic. In JGAA, Vol. 20(2):189-215, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, MSOL, phylogenetic network, phylogeny, reconstruction. Note: http://jgaa.info/accepted/2016/KelkIerselScornavaccaWeller2016.20.2.pdf.





Sajad Mirzaei and
Yufeng Wu. Fast Construction of Near Parsimonious Hybridization Networks for Multiple Phylogenetic Trees. In TCBB, Vol. 13(3):565-570, 2016. Keywords: bound, explicit network, from rooted trees, heuristic, phylogenetic network, phylogeny, Program PIRN, reconstruction, software. Note: http://www.engr.uconn.edu/~ywu/Papers/PIRNspreprint.pdf.





Vincent Ranwez,
Celine Scornavacca,
Jean-Philippe Doyon and
Vincent Berry. Inferring gene duplications, transfers and losses can be done in a discrete framework. In JOMB, Vol. 72(7):1811-1844, 2016. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, reconstruction.





François Chevenet,
Jean-Philippe Doyon,
Celine Scornavacca,
Edwin Jacox,
Emmanuelle Jousselin and
Vincent Berry. SylvX: a viewer for phylogenetic tree reconciliations. In BIO, Vol. 32(4):608-610, 2016. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, Program SylvX, software, visualization. Note: https://www.researchgate.net/profile/Emmanuelle_Jousselin/publication/283446016_SylvX_a_viewer_for_phylogenetic_tree_reconciliations/links/5642146108aec448fa621efa.pdf.



Hussein A. Hejase and
Kevin J. Liu. A scalability study of phylogenetic network inference methods using empirical datasets and simulations involving a single reticulation. Vol. 17(422):1-12, 2016. Keywords: abstract network, evaluation, from sequences, phylogenetic network, phylogeny, Program PhyloNet, Program PhyloNetworks SNaQ, reconstruction, simulation, unicyclic network. Note: http://dx.doi.org/10.1186/s1285901612771.



Philippe Gambette,
Leo van Iersel,
Steven Kelk,
Fabio Pardi and
Celine Scornavacca. Do branch lengths help to locate a tree in a phylogenetic network? In BMB, Vol. 78(9):1773-1795, 2016. Keywords: branch length, explicit network, FPT, from network, from rooted trees, NP complete, phylogenetic network, phylogeny, pseudopolynomial, time consistent network, tree containment, tree sibling network. Note: http://arxiv.org/abs/1607.06285.









Maria Anaya,
Olga Anipchenko-Ulaj,
Aisha Ashfaq,
Joyce Chiu,
Mahedi Kaiser,
Max Shoji Ohsawa,
Megan Owen,
Ella Pavlechko,
Katherine St. John,
Shivam Suleria,
Keith Thompson and
Corrine Yap. On Determining if Tree-based Networks Contain Fixed Trees. In BMB, Vol. 78(5):961-969, 2016. Keywords: explicit network, FPT, NP complete, phylogenetic network, phylogeny, tree-based network. Note: http://arxiv.org/abs/1602.02739.















Juan Wang. A Survey of Methods for Constructing Rooted Phylogenetic Networks. In PLoS ONE, Vol. 11(11):e0165834, 2016. Keywords: evaluation, explicit network, from clusters, phylogenetic network, phylogeny, Program BIMLR, Program Dendroscope, Program LNetwork, reconstruction, survey. Note: http://dx.doi.org/10.1371/journal.pone.0165834.







Katharina Huber,
Leo van Iersel,
Vincent Moulton and
Taoyang Wu. How much information is needed to infer reticulate evolutionary histories? In Systematic Biology, Vol. 64(1):102-111, 2015. Keywords: explicit network, from network, from rooted trees, from trinets, identifiability, phylogenetic network, phylogeny, reconstruction, uniqueness. Note: http://dx.doi.org/10.1093/sysbio/syu076.









Sha Zhu,
James H. Degnan,
Sharyn J. Goldstein and
Bjarki Eldon. HybridLambda: simulation of multiple merger and Kingman gene genealogies in species networks and species trees. In BMCB, Vol. 16(292):1-7, 2015. Keywords: explicit network, from network, phylogenetic network, phylogeny, Program HybridLambda, simulation, software. Note: http://dx.doi.org/10.1186/s128590150721y.



Gergely J. Szöllösi,
Adrián Arellano Davín,
Eric Tannier,
Vincent Daubin and
Bastien Boussau. Genome-scale phylogenetic analysis finds extensive gene transfer among fungi. In Philosophical Transactions of the Royal Society of London B: Biological Sciences, Vol. 370(1678):1-11, 2015. Keywords: duplication, from sequences, lateral gene transfer, loss, phylogenetic network, phylogeny, Program ALE, reconstruction. Note: http://dx.doi.org/10.1098/rstb.2014.0335.





Jessica W. Leigh and
David Bryant. PopART: full-feature software for haplotype network construction. In MEE, Vol. 6(9):1110-1116, 2015. Keywords: abstract network, from sequences, haplotype network, Median-Joining, phylogenetic network, phylogeny, population genetics, Program PopART, Program TCS, software. Note: http://dx.doi.org/10.1111/2041210X.12410.



Gabriel Cardona,
Joan Carles Pons and
Francesc Rosselló. A reconstruction problem for a class of phylogenetic networks with lateral gene transfers. In ALMOB, Vol. 10(28):1-15, 2015. Keywords: explicit network, from rooted trees, lateral gene transfer, phylogenetic network, phylogeny, Program LGTnetwork, reconstruction, software, tree-based network. Note: http://dx.doi.org/10.1186/s130150150059z.



Leo van Iersel,
Steven Kelk,
Nela Lekic and
Leen Stougie. Approximation algorithms for nonbinary agreement forests. In SIDMA, Vol. 28(1):49-66, 2014. Keywords: agreement forest, approximation, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1210.3211.
"Given two rooted phylogenetic trees on the same set of taxa X, the Maximum Agreement Forest (maf) problem asks to find a forest that is, in a certain sense, common to both trees and has a minimum number of components. The Maximum Acyclic Agreement Forest (maaf) problem has the additional restriction that the components of the forest cannot have conflicting ancestral relations in the input trees. There has been considerable interest in the special cases of these problems in which the input trees are required to be binary. However, in practice, phylogenetic trees are rarely binary, due to uncertainty about the precise order of speciation events. Here, we show that the general, nonbinary version of maf has a polynomial-time 4-approximation and a fixed-parameter tractable (exact) algorithm that runs in O(4^k poly(n)) time, where n = |X| and k is the number of components of the agreement forest minus one. Moreover, we show that a c-approximation algorithm for nonbinary maf and a d-approximation algorithm for the classical problem Directed Feedback Vertex Set (dfvs) can be combined to yield a d(c+3)-approximation for nonbinary maaf. The algorithms for maf have been implemented and made publicly available. © 2014 Society for Industrial and Applied Mathematics."



Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. The comparison of tree-sibling time consistent phylogenetic networks is graph-isomorphism complete. In The Scientific World Journal, Vol. 2014(254279):1-6, 2014. Keywords: abstract network, distance between networks, from network, isomorphism, phylogenetic network, tree sibling network. Note: http://arxiv.org/abs/0902.4640.
"Several polynomial time computable metrics on the class of semibinary tree-sibling time consistent phylogenetic networks are available in the literature; in particular, the problem of deciding if two networks of this kind are isomorphic is in P. In this paper, we show that if we remove the semibinarity condition, then the problem becomes much harder. More precisely, we prove that the isomorphism problem for generic tree-sibling time consistent phylogenetic networks is polynomially equivalent to the graph isomorphism problem. Since the latter is believed not to belong to P, the chances are that it is impossible to define a metric on the class of all tree-sibling time consistent phylogenetic networks that can be computed in polynomial time. © 2014 Gabriel Cardona et al."



Steven Kelk and
Celine Scornavacca. Constructing minimal phylogenetic networks from softwired clusters is fixed parameter tractable. In ALG, Vol. 68(4):886-915, 2014. Keywords: explicit network, FPT, from clusters, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1108.3653.
"Here we show that, given a set of clusters C on a set of taxa X, where |X|=n, it is possible to determine in time f(k)×poly(n) whether there exists a level-≤k network (i.e. a network where each biconnected component has reticulation number at most k) that represents all the clusters in C in the softwired sense, and if so to construct such a network. This extends a result from Kelk et al. (in IEEE/ACM Trans. Comput. Biol. Bioinform. 9:517-534, 2012) which showed that the problem is polynomial-time solvable for fixed k. By defining "k-reticulation generators" analogous to "level-k generators", we then extend this fixed parameter tractability result to the problem where k refers not to the level but to the reticulation number of the whole network. © 2012 Springer Science+Business Media New York."



Hadi Poormohammadi,
Changiz Eslahchi and
Ruzbeh Tusserkani. TripNet: A Method for Constructing Rooted Phylogenetic Networks from Rooted Triplets. In PLoS ONE, Vol. 9(9):e106531, 2014. Keywords: explicit network, from triplets, heuristic, level k phylogenetic network, phylogenetic network, phylogeny, Program TripNet, reconstruction, software. Note: http://arxiv.org/abs/1201.3722.
"The problem of constructing an optimal rooted phylogenetic network from an arbitrary set of rooted triplets is an NP-hard problem. In this paper, we present a heuristic algorithm called TripNet, which tries to construct a rooted phylogenetic network with the minimum number of reticulation nodes from an arbitrary set of rooted triplets. Despite of current methods that work for dense set of rooted triplets, a key innovation is the applicability of TripNet to non-dense set of rooted triplets. We prove some theorems to clarify the performance of the algorithm. To demonstrate the efficiency of TripNet, we compared TripNet with SIMPLISTIC. It is the only available software which has the ability to return some rooted phylogenetic network consistent with a given dense set of rooted triplets. But the results show that for complex networks with high levels, the SIMPLISTIC running time increased abruptly. However in all cases TripNet outputs an appropriate rooted phylogenetic network in an acceptable time. Also we tested TripNet on the Yeast data. The results show that both TripNet and optimal networks have the same clustering and TripNet produced a level-3 network which contains only one more reticulation node than the optimal network."



Leo van Iersel and
Vincent Moulton. Trinets encode tree-child and level-2 phylogenetic networks. In JOMB, Vol. 68(7):1707-1729, 2014. Keywords: explicit network, from trinets, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1210.0362.
"Phylogenetic networks generalize evolutionary trees, and are commonly used to represent evolutionary histories of species that undergo reticulate evolutionary processes such as hybridization, recombination and lateral gene transfer. Recently, there has been great interest in trying to develop methods to construct rooted phylogenetic networks from triplets, that is rooted trees on three species. However, although triplets determine or encode rooted phylogenetic trees, they do not in general encode rooted phylogenetic networks, which is a potential issue for any such method. Motivated by this fact, Huber and Moulton recently introduced trinets as a natural extension of rooted triplets to networks. In particular, they showed that level-1 phylogenetic networks are encoded by their trinets, and also conjectured that all "recoverable" rooted phylogenetic networks are encoded by their trinets. Here we prove that recoverable binary level-2 networks and binary tree-child networks are also encoded by their trinets. To do this we prove two decomposition theorems based on trinets which hold for all recoverable binary rooted phylogenetic networks. Our results provide some additional evidence in support of the conjecture that trinets encode all recoverable rooted phylogenetic networks, and could also lead to new approaches to construct phylogenetic networks from trinets. © 2013 Springer-Verlag Berlin Heidelberg."



Anthony Labarre and
Sicco Verwer. Merging partially labelled trees: hardness and a declarative programming solution. In TCBB, Vol. 11(2):389-397, 2014. Keywords: abstract network, from unrooted trees, heuristic, NP complete, phylogenetic network, phylogeny, reconstruction. Note: https://halupecupem.archivesouvertes.fr/hal00855669.
"Intraspecific studies often make use of haplotype networks instead of gene genealogies to represent the evolution of a set of genes. Cassens et al. proposed one such network reconstruction method, based on the global maximum parsimony principle, which was later recast by the first author of the present work as the problem of finding a minimum common supergraph of a set of t partially labelled trees. Although algorithms have been proposed for solving that problem on two graphs, the complexity of the general problem on trees remains unknown. In this paper, we show that the corresponding decision problem is NP-complete for t=3. We then propose a declarative programming approach to solving the problem to optimality in practice, as well as a heuristic approach, both based on the idp-system, and assess the performance of both methods on randomly generated data. © 2004-2012 IEEE."





Leo van Iersel and
Steven Kelk. Kernelizations for the hybridization number problem on multiple nonbinary trees. In WG14, Vol. 8747:299-311 of LNCS, Springer, 2014. Keywords: explicit network, from rooted trees, kernelization, minimum number, phylogenetic network, phylogeny, Program Treeduce, reconstruction. Note: http://arxiv.org/abs/1311.4045.



Ward C Wheeler. Phyletic groups on networks. In Cladistics, Vol. 30(4):447-451, 2014. Keywords: explicit network, from network, phylogenetic network, phylogeny. Note: http://dx.doi.org/10.1111/cla.12062.
"Three additional phyletic group types, "periphyletic," "epiphyletic", and "anaphyletic" (in addition to Hennigian mono-, para-, and polyphyletic) are defined in terms of trees and phylogenetic networks (trees with directed reticulate edges) via a generalization of the algorithmic definitions of Farris. These designations concern groups defined as monophyletic on trees, but with additional gains or losses of members from network edges. These distinctions should be useful in discussion of systems with non-vertical inheritance such as recombination between viruses, horizontal exchange between bacteria, hybridization in plants and animals, as well as human linguistic evolution. Examples are illustrated with Indo-European language groups. © The Willi Hennig Society 2013."



Sarah Bastkowski,
Andreas Spillner and
Vincent Moulton. Fishing for minimum evolution trees with NeighborNets. In IPL, Vol. 114(12):318, 2014. Keywords: circular split system, from distances, NeighborNet, phylogeny, polynomial.
"In evolutionary biology, biologists commonly use a phylogenetic tree to represent the evolutionary history of some set of species. A common approach taken to construct such a tree is to search through the space of all possible phylogenetic trees on the set so as to find one that optimizes some score function, such as the minimum evolution criterion. However, this is hampered by the fact that the space of phylogenetic trees is extremely large in general. Interestingly, an alternative approach, which has received somewhat less attention in the literature, is to instead search for trees within some set of bipartitions or splits of the set of species in question. Here we consider the problem of searching through a set of splits that is circular. Such sets can, for example, be generated by the NeighborNet algorithm for constructing phylogenetic networks. More specifically, we present an O(n^4) time algorithm for finding an optimal minimum evolution tree in a circular set of splits on a set of species of size n. In addition, using simulations, we compare the performance of this algorithm when applied to NeighborNet output with that of FastME, a leading method for searching for minimum evolution trees in tree space. We find that, even though a circular set of splits represents just a tiny fraction of the total number of possible splits of a set, the trees obtained from circular sets compare quite favorably with those obtained with FastME, suggesting that the approach could warrant further investigation. © 2013 Elsevier B.V."



Kevin J. Liu,
Jingxuan Dai,
Kathy Truong,
Ying Song,
Michael H. Kohn and
Luay Nakhleh. An HMM-Based Comparative Genomic Framework for Detecting Introgression in Eukaryotes. In PLoS ONE, Vol. 10(6):e1003649, 2014. Keywords: explicit network, from network, phylogenetic network, phylogeny, Program PhyloNet-HMM. Note: http://arxiv.org/abs/1310.7989.
"One outcome of interspecific hybridization and subsequent effects of evolutionary forces is introgression, which is the integration of genetic material from one species into the genome of an individual in another species. The evolution of several groups of eukaryotic species has involved hybridization, and cases of adaptation through introgression have been already established. In this work, we report on PhyloNet-HMM, a new comparative genomic framework for detecting introgression in genomes. PhyloNet-HMM combines phylogenetic networks with hidden Markov models (HMMs) to simultaneously capture the (potentially reticulate) evolutionary history of the genomes and dependencies within genomes. A novel aspect of our work is that it also accounts for incomplete lineage sorting and dependence across loci. Application of our model to variation data from chromosome 7 in the mouse (Mus musculus domesticus) genome detected a recently reported adaptive introgression event involving the rodent poison resistance gene Vkorc1, in addition to other newly detected introgressed genomic regions. Based on our analysis, it is estimated that about 9% of all sites within chromosome 7 are of introgressive origin (these cover about 13 Mbp of chromosome 7, and over 300 genes). Further, our model detected no introgression in a negative control data set. We also found that our model accurately detected introgression and other evolutionary processes from synthetic data sets simulated under the coalescent model with recombination, isolation, and migration. Our work provides a powerful framework for systematic analysis of introgression while simultaneously accounting for dependence across sites, point mutations, recombination, and ancestral polymorphism. © 2014 Liu et al."







Leo van Iersel,
Steven Kelk,
Nela Lekic and
Celine Scornavacca. A practical approximation algorithm for solving massive instances of hybridization number for binary and nonbinary trees. In BMCB, Vol. 15(127):1-12, 2014. Keywords: agreement forest, approximation, explicit network, from rooted trees, phylogenetic network, phylogeny, Program CycleKiller, Program TerminusEst, reconstruction. Note: http://dx.doi.org/10.1186/1471210515127.



Johann-Mattis List,
Shijulal Nelson-Sathi,
Hans Geisler and
William Martin. Networks of lexical borrowing and lateral gene transfer in language and genome evolution. In BioEssays, Vol. 36(2):141-150, 2014. Keywords: explicit network, minimal lateral network, phylogenetic network, Program lingpy. Note: http://dx.doi.org/10.1002/bies.201300096.
"Like biological species, languages change over time. As noted by Darwin, there are many parallels between language evolution and biological evolution. Insights into these parallels have also undergone change in the past 150 years. Just like genes, words change over time, and language evolution can be likened to genome evolution accordingly, but what kind of evolution? There are fundamental differences between eukaryotic and prokaryotic evolution. In the former, natural variation entails the gradual accumulation of minor mutations in alleles. In the latter, lateral gene transfer is an integral mechanism of natural variation. The study of language evolution using biological methods has attracted much interest of late, most approaches focusing on language tree construction. These approaches may underestimate the important role that borrowing plays in language evolution. Network approaches that were originally designed to study lateral gene transfer may provide more realistic insights into the complexities of language evolution. © 2014 The Authors. BioEssays Published by WILEY Periodicals, Inc."





Juan Wang. A new algorithm to construct phylogenetic networks from trees. In Genetics and Molecular Research, Vol. 13(1):1456-1464, 2014. Keywords: explicit network, from clusters, heuristic, phylogenetic network, Program LNetwork, Program QuickCass, reconstruction. Note: http://dx.doi.org/10.4238/2014.March.6.4.
"Developing appropriate methods for constructing phylogenetic networks from tree sets is an important problem, and much research is currently being undertaken in this area. BIMLR is an algorithm that constructs phylogenetic networks from tree sets. The algorithm can construct a much simpler network than other available methods. Here, we introduce an improved version of the BIMLR algorithm, QuickCass. QuickCass changes the selection strategy of the labels of leaves below the reticulate nodes, i.e., the nodes with an indegree of at least 2 in BIMLR. We show that QuickCass can construct simpler phylogenetic networks than BIMLR. Furthermore, we show that QuickCass is a polynomial-time algorithm when the output network that is constructed by QuickCass is binary. © FUNPEC-RP."



Matthieu Willems,
Nadia Tahiri and
Vladimir Makarenkov. A new efficient algorithm for inferring explicit hybridization networks following the Neighbor-Joining principle. In JBCB, Vol. 12(5), 2014. Keywords: explicit network, from distances, heuristic, phylogenetic network, phylogeny, reconstruction.
"Several algorithms and software have been developed for inferring phylogenetic trees. However, there exist some biological phenomena such as hybridization, recombination, or horizontal gene transfer which cannot be represented by a tree topology. We need to use phylogenetic networks to adequately represent these important evolutionary mechanisms. In this article, we present a new efficient heuristic algorithm for inferring hybridization networks from evolutionary distance matrices between species. The famous Neighbor-Joining concept and the least-squares criterion are used for building networks. At each step of the algorithm, before joining two given nodes, we check if a hybridization event could be related to one of them or to both of them. The proposed algorithm finds the exact tree solution when the considered distance matrix is a tree metric (i.e. it is representable by a unique phylogenetic tree). It also provides very good hybrids recovery rates for large trees (with 32 and 64 leaves in our simulations) for both distance and sequence types of data. The results yielded by the new algorithm for real and simulated datasets are illustrated and discussed in detail. © Imperial College Press."



Paul Cordue,
Simone Linz and
Charles Semple. Phylogenetic Networks that Display a Tree Twice. In BMB, Vol. 76(10):2664-2679, 2014. Keywords: from rooted trees, normal network, phylogenetic network, phylogeny, reconstruction, tree child network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/CLS14.pdf.
"In the last decade, the use of phylogenetic networks to analyze the evolution of species whose past is likely to include reticulation events, such as horizontal gene transfer or hybridization, has gained popularity among evolutionary biologists. Nevertheless, the evolution of a particular gene can generally be described without reticulation events and therefore be represented by a phylogenetic tree. While this is not in contrast to each other, it places emphasis on the necessity of algorithms that analyze and summarize the tree-like information that is contained in a phylogenetic network. We contribute to the toolbox of such algorithms by investigating the question of whether or not a phylogenetic network embeds a tree twice and give a quadratic-time algorithm to solve this problem for a class of networks that is more general than tree-child networks. © 2014, Society for Mathematical Biology."







Joel Sjöstrand,
Ali Tofigh,
Vincent Daubin,
Lars Arvestad,
Bengt Sennblad and
Jens Lagergren. A Bayesian Method for Analyzing Lateral Gene Transfer. In Systematic Biology, Vol. 63(3):409-420, 2014. Keywords: bayesian, duplication, from rooted trees, from sequences, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, Program JPrIME-DLTRS, reconstruction. Note: http://dx.doi.org/10.1093/sysbio/syu007.





Katharina Huber and
Vincent Moulton. Encoding and Constructing 1-Nested Phylogenetic Networks with Trinets. In ALG, Vol. 66(3):714-738, 2013. Keywords: explicit network, from trinets, phylogenetic network, phylogeny, reconstruction, uniqueness. Note: http://arxiv.org/abs/1110.0728.
"Phylogenetic networks are a generalization of phylogenetic trees that are used in biology to represent reticulate or non-tree-like evolution. Recently, several algorithms have been developed which aim to construct phylogenetic networks from biological data using triplets, i.e. binary phylogenetic trees on 3-element subsets of a given set of species. However, a fundamental problem with this approach is that the triplets displayed by a phylogenetic network do not necessarily uniquely determine or encode the network. Here we propose an alternative approach to encoding and constructing phylogenetic networks, which uses phylogenetic networks on 3-element subsets of a set, or trinets, rather than triplets. More specifically, we show that for a special, well-studied type of phylogenetic network called a 1-nested network, the trinets displayed by a 1-nested network always encode the network. We also present an efficient algorithm for deciding whether a dense set of trinets (i.e. one that contains a trinet on every 3-element subset of a set) can be displayed by a 1-nested network or not and, if so, constructs that network. In addition, we discuss some potential new directions that this new approach opens up for constructing and comparing phylogenetic networks. © 2012 Springer Science+Business Media, LLC."



Chris Whidden,
Robert G. Beiko and
Norbert Zeh. Fixed-Parameter Algorithms for Maximum Agreement Forests. In SICOMP, Vol. 42(4):1431-1466, 2013. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: http://arxiv.org/abs/1108.2664, slides.
"We present new and improved fixed-parameter algorithms for computing maximum agreement forests of pairs of rooted binary phylogenetic trees. The size of such a forest for two trees corresponds to their subtree prune-and-regraft distance and, if the agreement forest is acyclic, to their hybridization number. These distance measures are essential tools for understanding reticulate evolution. Our algorithm for computing maximum acyclic agreement forests is the first depth-bounded search algorithm for this problem. Our algorithms substantially outperform the best previous algorithms for these problems. © 2013 Society for Industrial and Applied Mathematics."



Stephen J. Willson. Reconstruction of certain phylogenetic networks from their tree-average distances. In BMB, Vol. 75(10):1840-1878, 2013. Keywords: explicit network, from distances, galled tree, normal network, phylogenetic network, phylogeny, unicyclic network. Note: http://www.public.iastate.edu/~swillson/TreeAverageReconPaper9.pdf.
"Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle. © 2013 Society for Mathematical Biology."





Jialiang Yang,
Stefan Grünewald and
Xiu-Feng Wan. QuartetNet: A Quartet Based Method to Reconstruct Phylogenetic Networks. In MBE, Vol. 30(5):1206-1217, 2013. Keywords: from quartets, phylogenetic network, phylogeny, Program QuartetNet, reconstruction.
"Phylogenetic networks can model reticulate evolutionary events such as hybridization, recombination, and horizontal gene transfer. However, reconstructing such networks is not trivial. Popular character-based methods are computationally inefficient, whereas distance-based methods cannot guarantee reconstruction accuracy because pairwise genetic distances only reflect partial information about a reticulate phylogeny. To balance accuracy and computational efficiency, here we introduce a quartet-based method to construct a phylogenetic network from a multiple sequence alignment. Unlike distances that only reflect the relationship between a pair of taxa, quartets contain information on the relationships among four taxa; these quartets provide adequate capacity to infer a more accurate phylogenetic network. In applications to simulated and biological data sets, we demonstrate that this novel method is robust and effective in reconstructing reticulate evolutionary events and it has the potential to infer more accurate phylogenetic distances than other conventional phylogenetic network construction methods such as Neighbor-Joining, NeighborNet, and Split Decomposition. This method can be used in constructing phylogenetic networks from simple evolutionary events involving a few reticulate events to complex evolutionary histories involving a large number of reticulate events. A software called QuartetNet is implemented and available at http://sysbio.cvm.msstate.edu/QuartetNet/. © 2013 The Author."



Thi-Hau Nguyen,
Vincent Ranwez,
Stéphanie Pointet,
Anne-Muriel Chifolleau Arigon,
Jean-Philippe Doyon and
Vincent Berry. Reconciliation and local gene tree rearrangement can be of mutual profit. In ALMOB, Vol. 8(12), 2013. Keywords: duplication, explicit network, from rooted trees, heuristic, lateral gene transfer, phylogenetic network, phylogeny, Program Mowgli, Program MowgliNNI, Program Prunier, reconstruction, software.
"Background: Reconciliation methods compare gene trees and species trees to recover evolutionary events such as duplications, transfers and losses explaining the history and composition of genomes. It is well-known that gene trees inferred from molecular sequences can be partly erroneous due to incorrect sequence alignments as well as phylogenetic reconstruction artifacts such as long branch attraction. In practice, this leads reconciliation methods to overestimate the number of evolutionary events. Several methods have been proposed to circumvent this problem, by collapsing the unsupported edges and then resolving the obtained multifurcating nodes, or by directly rearranging the binary gene trees. Yet these methods have been defined for models of evolution accounting only for duplications and losses, i.e. cannot be applied to handle prokaryotic gene families. Results: We propose a reconciliation method accounting for gene duplications, losses and horizontal transfers, that specifically takes into account the uncertainties in gene trees by rearranging their weakly supported edges. Rearrangements are performed on edges having a low confidence value, and are accepted whenever they improve the reconciliation cost. We prove useful properties on the dynamic programming matrix used to compute reconciliations, which allows to speed up the tree space exploration when rearrangements are generated by Nearest Neighbor Interchanges (NNI) edit operations. Experiments on synthetic data show that gene trees modified by such NNI rearrangements are closer to the correct simulated trees and lead to better event predictions on average. Experiments on real data demonstrate that the proposed method leads to a decrease in the reconciliation cost and the number of inferred events. Finally on a dataset of 30 k gene families, this reconciliation method shows a ranking of prokaryotic phyla by transfer rates identical to that proposed by a different approach dedicated to transfer detection [BMCBIOINF 11:324, 2010, PNAS 109(13):4962-4967, 2012]. Conclusions: Prokaryotic gene trees can now be reconciled with their species phylogeny while accounting for the uncertainty of the gene tree. More accurate and more precise reconciliations are obtained with respect to previous parsimony algorithms not accounting for such uncertainties [LNCS 6398:93-108, 2010, BIOINF 28(12):i283-i291, 2012]. A software implementing the method is freely available at http://www.atgcmontpellier.fr/Mowgli/. © 2013 Nguyen et al.; licensee BioMed Central Ltd."



Hoa Vu,
Francis Chin,
Wing-Kai Hon,
Henry Leung,
Kunihiko Sadakane,
Wing-Kin Sung and
Siu-Ming Yiu. Reconstructing k-Reticulated Phylogenetic Network from a Set of Gene Trees. In ISBRA13, Vol. 7875:112-124 of LNCS, Springer, 2013. Keywords: from rooted trees, k-reticulated, phylogenetic network, phylogeny, polynomial, Program ARTNET, Program CMPT, reconstruction. Note: http://grid.cs.gsu.edu/~xguo9/publications/2013_Cloud%20computing%20for%20de%20novo%20metagenomic%20sequence%20assembly.pdf#page=123.
"The time complexity of existing algorithms for reconstructing a level-x phylogenetic network increases exponentially in x. In this paper, we propose a new classification of phylogenetic networks called k-reticulated network. A k-reticulated network can model all level-k networks and some level-x networks with x > k. We design algorithms for reconstructing k-reticulated network (k = 1 or 2) with minimum number of hybrid nodes from a set of m binary trees, each with n leaves in O(mn^2) time. The implication is that some level-x networks with x > k can now be reconstructed in a faster way. We implemented our algorithm (ARTNET) and compared it with CMPT. We show that ARTNET outperforms CMPT in terms of running time and accuracy. We also consider the case when there does not exist a 2-reticulated network for the input trees. We present an algorithm computing a maximum subset of the species set so that a new set of subtrees can be combined into a 2-reticulated network. © 2013 Springer-Verlag."



Mukul S. Bansal,
Guy Banay,
Timothy J. Harlow,
J. Peter Gogarten and
Ron Shamir. Systematic inference of highways of horizontal gene transfer in prokaryotes. In BIO, Vol. 29(5):571-579, 2013. Keywords: duplication, explicit network, from species tree, from unrooted trees, lateral gene transfer, phylogenetic network, phylogeny, Program HiDe, Program RANGER-DTL, reconstruction. Note: http://people.csail.mit.edu/mukul/Bansal_Highways_Bioinformatics_2013.pdf.





Eric Bapteste,
Leo van Iersel,
Axel Janke,
Scott Kelchner,
Steven Kelk,
James O. McInerney,
David A. Morrison,
Luay Nakhleh,
Mike Steel,
Leen Stougie and
James B. Whitfield. Networks: expanding evolutionary thinking. In Trends in Genetics, Vol. 29(8):439-441, 2013. Keywords: abstract network, explicit network, phylogenetic network, phylogeny, reconstruction. Note: http://bioinf.nuim.ie/wpcontent/uploads/2013/06/BaptesteTiG2013.pdf.
"Networks allow the investigation of evolutionary relationships that do not fit a tree model. They are becoming a leading tool for describing the evolutionary relationships between organisms, given the comparative complexities among genomes. © 2013 Elsevier Ltd."



Yun Yu,
R. Matthew Barnett and
Luay Nakhleh. Parsimonious Inference of Hybridization in the Presence of Incomplete Lineage Sorting. In Systematic Biology, Vol. 62(5):738-751, 2013. Keywords: from network, from rooted trees, hybridization, lineage sorting, parsimony, phylogenetic network, phylogeny, Program PhyloNet, reconstruction.
"Hybridization plays an important evolutionary role in several groups of organisms. A phylogenetic approach to detect hybridization entails sequencing multiple loci across the genomes of a group of species of interest, reconstructing their gene trees, and taking their differences as indicators of hybridization. However, methods that follow this approach mostly ignore population effects, such as incomplete lineage sorting (ILS). Given that hybridization occurs between closely related organisms, ILS may very well be at play and, hence, must be accounted for in the analysis framework. To address this issue, we present a parsimony criterion for reconciling gene trees within the branches of a phylogenetic network, and a local search heuristic for inferring phylogenetic networks from collections of gene-tree topologies under this criterion. This framework enables phylogenetic analyses while accounting for both hybridization and ILS. Further, we propose two techniques for incorporating information about uncertainty in gene-tree estimates. Our simulation studies demonstrate the good performance of our framework in terms of identifying the location of hybridization events, as well as estimating the proportions of genes that underwent hybridization. Also, our framework shows good performance in terms of efficiency on handling large data sets in our experiments. Further, in analysing a yeast data set, we demonstrate issues that arise when analysing real data sets. Although a probabilistic approach was recently introduced for this problem, and although parsimonious reconciliations have accuracy issues under certain settings, our parsimony framework provides a much more computationally efficient technique for this type of analysis. Our framework now allows for genome-wide scans for hybridization, while also accounting for ILS. [Phylogenetic networks; hybridization; incomplete lineage sorting; coalescent; multi-labeled trees.] © 2013 The Author(s). All rights reserved."



Juan Wang,
Maozu Guo,
Xiaoyan Liu,
Yang Liu,
Chunyu Wang,
Linlin Xing and
Kai Che. LNETWORK: An Efficient and Effective Method for Constructing Phylogenetic Networks. In BIO, Vol. 29(18):2269-2276, 2013. Keywords: explicit network, from rooted trees, phylogenetic network, phylogeny, Program LNetwork, reconstruction, software.
"Motivation: The evolutionary history of species is traditionally represented with a rooted phylogenetic tree. Each tree comprises a set of clusters, i.e. subsets of the species that are descended from a common ancestor. When rooted phylogenetic trees are built from several different datasets (e.g. from different genes), the clusters are often conflicting. These conflicting clusters cannot be expressed as a simple phylogenetic tree; however, they can be expressed in a phylogenetic network. Phylogenetic networks are a generalization of phylogenetic trees that can account for processes such as hybridization, horizontal gene transfer and recombination, which are difficult to represent in standard treelike models of evolutionary histories. There is currently a large body of research aimed at developing appropriate methods for constructing phylogenetic networks from cluster sets. The Cass algorithm can construct a much simpler network than other available methods, but is extremely slow for large datasets or for datasets that need lots of reticulate nodes. The networks constructed by Cass are also greatly dependent on the order of input data, i.e. it generally derives different phylogenetic networks for the same dataset when different input orders are used. Results: In this study, we introduce an improved Cass algorithm, Lnetwork, which can construct a phylogenetic network for a given set of clusters. We show that Lnetwork is significantly faster than Cass and effectively weakens the influence of input data order. Moreover, we show that Lnetwork can construct a much simpler network than most of the other available methods. © The Author 2013."



Juan Wang,
Maozu Guo,
Linlin Xing,
Kai Che,
Xiaoyan Liu and
Chunyu Wang. BIMLR: A Method for Constructing Rooted Phylogenetic Networks from Rooted Phylogenetic Trees. In Gene, Vol. 527(1):344-351, 2013. Keywords: explicit network, from clusters, from rooted trees, phylogenetic network, phylogeny, Program BIMLR, Program Dendroscope, reconstruction, software.
"Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) are often conflicting with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/. © 2013 Elsevier B.V."



Zhi-Zhong Chen and
Lusheng Wang. An Ultrafast Tool for Minimum Reticulate Networks. In JCB, Vol. 20(1):38-41, 2013. Keywords: agreement forest, explicit network, from rooted trees, phylogenetic network, phylogeny, Program ultraNet, reconstruction. Note: http://www.cs.cityu.edu.hk/~lwang/research/jcb2013.pdf.
"Due to hybridization events in evolution, studying different genes of a set of species may yield two or more related but different phylogenetic trees for the set of species. In this case, we want to combine the trees into a reticulate network with the fewest hybridization events. In this article, we develop a software tool (named UltraNet) for several fundamental problems related to the construction of minimum reticulate networks from two or more phylogenetic trees. Our experimental results show that UltraNet is much faster than all previous tools for these problems. © 2013 Mary Ann Liebert, Inc."



Peter J. Humphries,
Simone Linz and
Charles Semple. Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies. In BMB, Vol. 75(10):1879-1890, 2013. Keywords: characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/CPSpaper.pdf.
"Recently, we have shown that calculating the minimum-temporal-hybridization number for a set P of rooted binary phylogenetic trees is NP-hard and have characterized this minimum number when P consists of exactly two trees. In this paper, we give the first characterization of the problem for P being arbitrarily large. The characterization is in terms of cherries and the existence of a particular type of sequence. Furthermore, in an online appendix to the paper, we show that this new characterization can be used to show that computing the minimum-temporal-hybridization number for two trees is fixed-parameter tractable. © 2013 Society for Mathematical Biology."



Celine Scornavacca,
Wojciech Paprotny,
Vincent Berry and
Vincent Ranwez. Representing a set of reconciliations in a compact way. In JBCB, Vol. 11(2):1250025, 2013. Keywords: duplication, explicit network, from network, from rooted trees, from species tree, phylogeny, Program GraphDTL, Program TERA, visualization. Note: http://hallirmm.ccsd.cnrs.fr/lirmm00818801.
"Comparative genomic studies are often conducted by reconciliation analyses comparing gene and species trees. One of the issues with reconciliation approaches is that an exponential number of optimal scenarios is possible. The resulting complexity is masked by the fact that a majority of reconciliation software pick up a random optimal solution that is returned to the end-user. However, the alternative solutions should not be ignored since they tell different stories that parsimony considers as viable as the output solution. In this paper, we describe a polynomial space and time algorithm to build a minimum reconciliation graph, a graph that summarizes the set of all most parsimonious reconciliations. Amongst numerous applications, it is shown how this graph allows counting the number of non-equivalent most parsimonious reconciliations. © 2013 Imperial College Press."



Luay Nakhleh. Computational approaches to species phylogeny inference and gene tree reconciliation. In Trends in Ecology and Evolution, Vol. 28(12):719-728, 2013. Keywords: from rooted trees, from species tree, phylogenetic network, phylogeny, reconstruction, survey. Note: http://bioinfo.cs.rice.edu/sites/bioinfo.cs.rice.edu/files/TREENakhleh13.pdf.
"An intricate relation exists between gene trees and species phylogenies, due to evolutionary processes that act on the genes within and across the branches of the species phylogeny. From an analytical perspective, gene trees serve as character states for inferring accurate species phylogenies, and species phylogenies serve as a backdrop against which gene trees are contrasted for elucidating evolutionary processes and parameters. In a 1997 paper, Maddison discussed this relation, reviewed the signatures left by three major evolutionary processes on the gene trees, and surveyed parsimony and likelihood criteria for utilizing these signatures to elucidate computationally this relation. Here, I review progress that has been made in developing computational methods for analyses under these two criteria, and survey remaining challenges. © 2013 Elsevier Ltd."



Thi-Hau Nguyen,
Vincent Ranwez,
Vincent Berry and
Celine Scornavacca. Support Measures to Estimate the Reliability of Evolutionary Events Predicted by Reconciliation Methods. In PLoS ONE, Vol. 8(10):e73667, 2013. Keywords: duplication, from rooted trees, from species tree, phylogenetic network, phylogeny, polynomial, Program GraphDTL, reconstruction. Note: http://dx.doi.org/10.1371/journal.pone.0073667.
"The genome content of extant species is derived from that of ancestral genomes, distorted by evolutionary events such as gene duplications, transfers and losses. Reconciliation methods aim at recovering such events and at localizing them in the species history, by comparing gene family trees to species trees. These methods play an important role in studying genome evolution as well as in inferring orthology relationships. A major issue with reconciliation methods is that the reliability of predicted evolutionary events may be questioned for various reasons: Firstly, there may be multiple equally optimal reconciliations for a given species tree-gene tree pair. Secondly, reconciliation methods can be misled by inaccurate gene or species trees. Thirdly, predicted events may fluctuate with method parameters such as the cost or rate of elementary events. For all of these reasons, confidence values for predicted evolutionary events are sorely needed. It was recently suggested that the frequency of each event in the set of all optimal reconciliations could be used as a support measure. We put this proposition to the test here and also consider a variant where the support measure is obtained by additionally accounting for suboptimal reconciliations. Experiments on simulated data show the relevance of event supports computed by both methods, while resorting to suboptimal sampling was shown to be more effective. Unfortunately, we also show that, unlike the majority-rule consensus tree for phylogenies, there is no guarantee that a single reconciliation can contain all events having above 50% support. In this paper, we detail how to rely on the reconciliation graph to efficiently identify the median reconciliation. Such median reconciliation can be found in polynomial time within the potentially exponential set of most parsimonious reconciliations. © 2013 Nguyen et al."



Mukul S. Bansal,
Eric J. Alm and
Manolis Kellis. Reconciliation Revisited: Handling Multiple Optima when Reconciling with Duplication, Transfer, and Loss. In JCB, Vol. 20(10):738-754, 2013. Keywords: duplication, from rooted trees, from species tree, loss, phylogenetic network, phylogeny, Program RANGER-DTL, reconstruction. Note: http://www.engr.uconn.edu/~mukul/Bansal_JCB2013.pdf.
"Phylogenetic tree reconciliation is a powerful approach for inferring evolutionary events like gene duplication, horizontal gene transfer, and gene loss, which are fundamental to our understanding of molecular evolution. While duplication-loss (DL) reconciliation leads to a unique maximum-parsimony solution, duplication-transfer-loss (DTL) reconciliation yields a multitude of optimal solutions, making it difficult to infer the true evolutionary history of the gene family. This problem is further exacerbated by the fact that different event cost assignments yield different sets of optimal reconciliations. Here, we present an effective, efficient, and scalable method for dealing with these fundamental problems in DTL reconciliation. Our approach works by sampling the space of optimal reconciliations uniformly at random and aggregating the results. We show that even gene trees with only a few dozen genes often have millions of optimal reconciliations and present an algorithm to efficiently sample the space of optimal reconciliations uniformly at random in O(mn^2) time per sample, where m and n denote the number of genes and species, respectively. We use these samples to understand how different optimal reconciliations vary in their node mappings and event assignments and to investigate the impact of varying event costs. We apply our method to a biological dataset of approximately 4700 gene trees from 100 taxa and observe that 93% of event assignments and 73% of mappings remain consistent across different multiple optima. Our analysis represents the first systematic investigation of the space of optimal DTL reconciliations and has many important implications for the study of gene family evolution. © 2013 Mary Ann Liebert, Inc."



Alberto Apostolico,
Matteo Comin,
Andreas W. M. Dress and
Laxmi Parida. Ultrametric networks: a new tool for phylogenetic analysis. In Algorithms for Molecular Biology, Vol. 8(7):1-10, 2013. Keywords: abstract network, from distances, phylogenetic network, phylogeny, Program Ultranet. Note: http://dx.doi.org/10.1186/1748718887.
"Background: The large majority of optimization problems related to the inference of distance-based trees used in phylogenetic analysis and classification is known to be intractable. One noted exception is found within the realm of ultrametric distances. The introduction of ultrametric trees in phylogeny was inspired by a model of evolution driven by the postulate of a molecular clock, now dismissed, whereby phylogeny could be represented by a weighted tree in which the sum of the weights of the edges separating any given leaf from the root is the same for all leaves. Both, molecular clocks and rooted ultrametric trees, fell out of fashion as credible representations of evolutionary change. At the same time, ultrametric dendrograms have shown good potential for purposes of classification in so far as they have proven to provide good approximations for additive trees. Most of these approximations are still intractable, but the problem of finding the nearest ultrametric distance matrix to a given distance matrix with respect to the L∞ distance has been long known to be solvable in polynomial time, the solution being incarnated in any minimum spanning tree for the weighted graph subtending to the matrix. Results: This paper expands this subdominant ultrametric perspective by studying ultrametric networks, consisting of the collection of all edges involved in some minimum spanning tree. It is shown that, for a graph with n vertices, the construction of such a network can be carried out by a simple algorithm in optimal time O(n^2) which is faster by a factor of n than the direct adaptation of the classical O(n^3) paradigm by Warshall for computing the transitive closure of a graph. This algorithm, called UltraNet, will be shown to be easily adapted to compute relaxed networks and to support the introduction of artificial points to reduce the maximum distance between vertices in a pair. Finally, a few experiments will be discussed to demonstrate the applicability of subdominant ultrametric networks. Availability: http://www.dei.unipd.it/~ciompin/main/Ultranet/Ultranet.html. © 2013 Apostolico et al.; licensee BioMed Central Ltd."
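Illustration (not part of the cited paper): the abstract above characterizes an ultrametric network as the set of all edges that occur in some minimum spanning tree. A minimal Python sketch of the slower, Warshall-style cubic baseline that the abstract mentions (not the UltraNet algorithm itself) could look as follows. The function names and the toy matrix are hypothetical, and the sketch assumes a symmetric distance matrix with exactly comparable entries (for example integers).

```python
def subdominant_ultrametric(d):
    """Minimax path distances of a symmetric distance matrix d.

    Warshall-style closure over (min, max): u[i][j] is the smallest
    possible value of the largest edge on any path from i to j.
    Runs in O(n^3) time; this is the baseline, not UltraNet.
    """
    n = len(d)
    u = [row[:] for row in d]  # work on a copy, leave d untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


def ultrametric_network_edges(d):
    """Edges (i, j), i < j, that belong to at least one minimum spanning tree.

    An edge is in some MST exactly when no path between its endpoints
    uses only strictly lighter edges, i.e. when d[i][j] == u[i][j].
    """
    u = subdominant_ultrametric(d)
    n = len(d)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if d[i][j] == u[i][j]]


# Toy example (hypothetical data): edge (0, 2) is bypassed by 0-1-2.
d = [[0, 1, 3],
     [1, 0, 2],
     [3, 2, 0]]
print(ultrametric_network_edges(d))  # [(0, 1), (1, 2)]
```

With tied weights, several spanning trees are minimum and the network keeps every edge that appears in any of them, which is what distinguishes it from a single minimum spanning tree.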



Mehdi Layeghifard,
Pedro R. Peres-Neto and
Vladimir Makarenkov. Inferring explicit weighted consensus networks to represent alternative evolutionary histories. In BMCEB, Vol. 13(274):1-25, 2013. Keywords: explicit network, from rooted trees, from species tree, phylogenetic network, phylogeny, Program ConsensusNetwork, reconstruction. Note: http://dx.doi.org/10.1186/1471214813274.
"Background: The advent of molecular biology techniques and constant increase in availability of genetic material have triggered the development of many phylogenetic tree inference methods. However, several reticulate evolution processes, such as horizontal gene transfer and hybridization, have been shown to blur the species evolutionary history by causing discordance among phylogenies inferred from different genes. Methods: To tackle this problem, we hereby describe a new method for inferring and representing alternative (reticulate) evolutionary histories of species as an explicit weighted consensus network which can be constructed from a collection of gene trees with or without prior knowledge of the species phylogeny. Results: We provide a way of building a weighted phylogenetic network for each of the following reticulation mechanisms: diploid hybridization, intragenic recombination and complete or partial horizontal gene transfer. We successfully tested our method on some synthetic and real datasets to infer the above-mentioned evolutionary events which may have influenced the evolution of many species. Conclusions: Our weighted consensus network inference method allows one to infer, visualize and validate statistically major conflicting signals induced by the mechanisms of reticulate evolution. The results provided by the new method can be used to represent the inferred conflicting signals by means of explicit and easy-to-interpret phylogenetic networks. © 2013 Layeghifard et al.; licensee BioMed Central Ltd."



Gergely J. Szöllösi,
Eric Tannier,
Nicolas Lartillot and
Vincent Daubin. Lateral Gene Transfer from the Dead. In Systematic Biology, Vol. 62(3):386-397, 2013. Keywords: duplication, lateral gene transfer, likelihood, loss, phylogeny, Program TERA, reconstruction. Note: http://dx.doi.org/10.1093/sysbio/syt003.
"In phylogenetic studies, the evolution of molecular sequences is assumed to have taken place along the phylogeny traced by the ancestors of extant species. In the presence of lateral gene transfer, however, this may not be the case, because the species lineage from which a gene was transferred may have gone extinct or not have been sampled. Because it is not feasible to specify or reconstruct the complete phylogeny of all species, we must describe the evolution of genes outside the represented phylogeny by modeling the speciation dynamics that gave rise to the complete phylogeny. We demonstrate that if the number of sampled species is small compared with the total number of existing species, the overwhelming majority of gene transfers involve speciation to and evolution along extinct or unsampled lineages. We show that the evolution of genes along extinct or unsampled lineages can to good approximation be treated as those of independently evolving lineages described by a few global parameters. Using this result, we derive an algorithm to calculate the probability of a gene tree and recover the maximum-likelihood reconciliation given the phylogeny of the sampled species. Examining 473 near-universal gene families from 36 cyanobacteria, we find that nearly a third of transfer events (28%) appear to have topological signatures of evolution along extinct species, but only approximately 6% of transfers trace their ancestry to before the common ancestor of the sampled cyanobacteria. © 2013 The Author(s)."



Gergely J. Szöllösi,
Wojciech Rosikiewicz,
Bastien Boussau,
Eric Tannier and
Vincent Daubin. Efficient Exploration of the Space of Reconciled Gene Trees. In Systematic Biology, Vol. 62(6):901-912, 2013. Keywords: duplication, explicit network, lateral gene transfer, likelihood, loss, phylogeny, Program ALE, reconstruction. Note: http://arxiv.org/abs/1306.2167.
"Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree-species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene- and species-level events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene tree-species tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllősi et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes we find that joint likelihood-based inference results in a striking reduction in apparent phylogenetic discord, with, respectively, 24%, 59%, and 46% reductions in the mean numbers of duplications, transfers, and losses per gene family. The open source implementation of ALE is available from https://github.com/ssolo/ALE.git. © The Author(s) 2013."





Philippe Gambette and
Katharina Huber. On Encodings of Phylogenetic Networks of Bounded Level. In JOMB, Vol. 65(1):157-180, 2012. Keywords: characterization, explicit network, from clusters, from rooted trees, from triplets, galled tree, identifiability, level k phylogenetic network, phylogenetic network, uniqueness, weak hierarchy. Note: http://hal.archivesouvertes.fr/hal00609130/en/.
"Phylogenetic networks have now joined phylogenetic trees in the center of phylogenetics research. Like phylogenetic trees, such networks canonically induce collections of phylogenetic trees, clusters, and triplets, respectively. Thus it is not surprising that many network approaches aim to reconstruct a phylogenetic network from such collections. Related to the well-studied perfect phylogeny problem, the following question is of fundamental importance in this context: When does one of the above collections encode (i.e. uniquely describe) the network that induces it? For the large class of level-1 (phylogenetic) networks we characterize those level-1 networks for which an encoding in terms of one (or equivalently all) of the above collections exists. In addition, we show that three known distance measures for comparing phylogenetic networks are in fact metrics on the resulting subclass and give the diameter for two of them. Finally, we investigate the related concept of indistinguishability and also show that many properties enjoyed by level-1 networks are not satisfied by networks of higher level. © 2011 Springer-Verlag."



Stephen J. Willson. CSD Homomorphisms Between Phylogenetic Networks. In TCBB, Vol. 9(4), 2012. Keywords: explicit network, from network, from quartets, phylogenetic network. Note: http://www.public.iastate.edu/~swillson/Relationships11IEEE.pdf, preliminary version entitled Relationships Among Phylogenetic Networks.
"Since Darwin, species trees have been used as a simplified description of the relationships which summarize the complicated network N of reality. Recent evidence of hybridization and lateral gene transfer, however, suggest that there are situations where trees are inadequate. Consequently it is important to determine properties that characterize networks closely related to N and possibly more complicated than trees but lacking the full complexity of N. A connected surjective digraph map (CSD) is a map f from one network N to another network M such that every arc is either collapsed to a single vertex or is taken to an arc, such that f is surjective, and such that the inverse image of a vertex is always connected. CSD maps are shown to behave well under composition. It is proved that if there is a CSD map from N to M, then there is a way to lift an undirected version of M into N, often with added resolution. A CSD map from N to M puts strong constraints on N. In general, it may be useful to study classes of networks such that, for any N, there exists a CSD map from N to some standard member of that class. © 2012 IEEE."



Steven Kelk,
Celine Scornavacca and
Leo van Iersel. On the elusiveness of clusters. In TCBB, Vol. 9(2):517-534, 2012. Keywords: explicit network, from clusters, from rooted trees, from triplets, level k phylogenetic network, phylogenetic network, phylogeny, Program Clustistic, reconstruction, software. Note: http://arxiv.org/abs/1103.1834.





Jeremy G. Sumner,
Barbara R. Holland and
Peter D. Jarvis. The algebra of the general Markov model on phylogenetic trees and networks. In BMB, Vol. 74(4):858-880, 2012. Keywords: abstract network, phylogenetic network, phylogeny, split, split network, statistical model. Note: http://arxiv.org/abs/1012.5165.
"It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and providing an analogous augmentation of the general Markov model has thus far been elusive. In this paper, we rectify this shortcoming by showing how to extend the general Markov model on trees to include incompatible edges; and even further to more general network models. This is achieved by exploring the algebra of the generators of the continuous-time Markov chain together with the "splitting" operator that generates the branching process on phylogenetic trees. For simplicity, we proceed by discussing the two state case and then show that our results are easily extended to more states with little complication. Intriguingly, upon restriction of the two state general Markov model to the parameter space of the binary symmetric model, our extension is indistinguishable from the Hadamard approach only on trees; as soon as any incompatible splits are introduced the two approaches give rise to differing probability distributions with disparate structure. Through exploration of a simple example, we give an argument that our extension to more general networks has desirable properties that the previous approaches do not share. In particular, our construction allows for convergent evolution of previously divergent lineages; a property that is of significant interest for biological applications. © 2011 Society for Mathematical Biology."



Andreas Spillner,
Binh T. Nguyen and
Vincent Moulton. Constructing and Drawing Regular Planar Split Networks. In TCBB, Vol. 9(2):395-407, 2012. Keywords: abstract network, from splits, phylogenetic network, phylogeny, reconstruction, visualization. Note: slides and presentation available at http://www.newton.ac.uk/programmes/PLG/seminars/062111501.html.
"Split networks are commonly used to visualize collections of bipartitions, also called splits, of a finite set. Such collections arise, for example, in evolutionary studies. Split networks can be viewed as a generalization of phylogenetic trees and may be generated using the SplitsTree package. Recently, the NeighborNet method for generating split networks has become rather popular, in part because it is guaranteed to always generate a circular split system, which can always be displayed by a planar split network. Even so, labels must be placed on the "outside" of the network, which might be problematic in some applications. To help circumvent this problem, it can be helpful to consider so-called flat split systems, which can be displayed by planar split networks where labels are allowed on the inside of the network too. Here, we present a new algorithm that is guaranteed to compute a minimal planar split network displaying a flat split system in polynomial time, provided the split system is given in a certain format. We will also briefly discuss two heuristics that could be useful for analyzing phylogeographic data and that allow the computation of flat split systems in this format in polynomial time. © 2006 IEEE."



Paul Phipps and
Sergey Bereg. Optimizing Phylogenetic Networks for Circular Split Systems. In TCBB, Vol. 9(2):535-547, 2012. Keywords: abstract network, from distances, from splits, phylogenetic network, phylogeny, Program PhippsNetwork, reconstruction, software.
"We address the problem of realizing a given distance matrix by a planar phylogenetic network with a minimum number of faces. With the help of the popular software SplitsTree4, we start by approximating the distance matrix with a distance metric that is a linear combination of circular splits. The main results of this paper are the necessary and sufficient conditions for the existence of a network with a single face. We show how such a network can be constructed, and we present a heuristic for constructing a network with few faces using the first algorithm as the base case. Experimental results on biological data show that this heuristic algorithm can produce phylogenetic networks with far fewer faces than the ones computed by SplitsTree4, without affecting the approximation of the distance matrix. © 2012 IEEE."



Magnus Bordewich and
Charles Semple. Budgeted Nature Reserve Selection with diversity feature loss and arbitrary split systems. In JOMB, Vol. 64(1):69-85, 2012. Keywords: abstract network, approximation, diversity, phylogenetic network, polynomial, split network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS11.pdf.
"Arising in the context of biodiversity conservation, the Budgeted Nature Reserve Selection (BNRS) problem is to select, subject to budgetary constraints, a set of regions to conserve so that the phylogenetic diversity (PD) of the set of species contained within those regions is maximized. Here PD is measured across either a single rooted tree or a single unrooted tree. Nevertheless, in both settings, this problem is NP-hard. However, it was recently shown that, for each setting, there is a polynomial-time (1-1/e)-approximation algorithm for it and that this algorithm is tight. In the first part of the paper, we consider two extensions of BNRS. In the rooted setting we additionally allow for the disappearance of features, for varying survival probabilities across species, and for PD to be measured across multiple trees. In the unrooted setting, we extend to arbitrary split systems. We show that, despite these additional allowances, there remains a polynomial-time (1-1/e)-approximation algorithm for each extension. In the second part of the paper, we resolve a complexity problem on computing PD across an arbitrary split system left open by Spillner et al. © 2011 Springer-Verlag."



Simon Joly. JML: Testing hybridization from species trees. In Molecular Ecology Resources, Vol. 12(1):179-184, 2012. Keywords: from species tree, hybridization, lineage sorting, phylogenetic network, phylogeny, Program JML, statistical model. Note: http://www.plantevolution.org/pdf/JMLpaper_accepted.pdf.
"I introduce the software jml that tests for the presence of hybridization in multispecies sequence data sets by posterior predictive checking following Joly, McLenachan and Lockhart (2009, American Naturalist e54). Although their method could potentially be applied on any data set, the lack of appropriate software made its application difficult. The software jml thus fills a need for an easy application of the method but also includes improvements such as the possibility to incorporate uncertainty in the species tree topology. The jml software uses a posterior distribution of species trees, population sizes and branch lengths to simulate replicate sequence data sets using the coalescent with no migration. A test quantity, defined as the minimum pairwise sequence distance between sequences of two species, is then evaluated on the simulated data sets and compared to the one estimated from the original data. Because the test quantity is a good predictor of hybridization events, departure from the bifurcating species tree model could be interpreted as evidence of hybridization. Software performance in terms of computing time is evaluated for several parameters. I also show an application example of the software for detecting hybridization among native diploid North American roses. © 2011 Blackwell Publishing Ltd."



Zhi-Zhong Chen and
Lusheng Wang. Algorithms for Reticulate Networks of Multiple Phylogenetic Trees. In TCBB, Vol. 9(2):372-384, 2012. Keywords: explicit network, from rooted trees, minimum number, phylogenetic network, phylogeny, Program CMPT, Program MaafB, reconstruction, software. Note: http://rnc.r.dendai.ac.jp/~chen/papers/rMaaf.pdf.
"A reticulate network N of multiple phylogenetic trees may have nodes with two or more parents (called reticulation nodes). There are two ways to define the reticulation number of N. One way is to define it as the number of reticulation nodes in N; in this case, a reticulate network with the smallest reticulation number is called an optimal type-I reticulate network of the trees. The better way is to define it as the total number of parents of reticulation nodes in N minus the number of reticulation nodes in N; in this case, a reticulate network with the smallest reticulation number is called an optimal type-II reticulate network of the trees. In this paper, we first present a fast fixed-parameter algorithm for constructing one or all optimal type-I reticulate networks of multiple phylogenetic trees. We then use the algorithm together with other ideas to obtain an algorithm for estimating a lower bound on the reticulation number of an optimal type-II reticulate network of the input trees. To our knowledge, these are the first fixed-parameter algorithms for the problems. We have implemented the algorithms in ANSI C, obtaining programs CMPT and MaafB. Our experimental data show that CMPT can construct optimal type-I reticulate networks rapidly and MaafB can compute better lower bounds for optimal type-II reticulate networks within shorter time than the previously best program PIRN designed by Wu. © 2006 IEEE."
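The two definitions of reticulation number given in this abstract can be compared on a small example. The following is only an illustrative sketch, not the CMPT or MaafB code: the function name `reticulation_numbers`, the parent-list encoding and the toy network are all assumptions made here.

```python
def reticulation_numbers(parents):
    """Return (type-I, type-II) reticulation numbers of a network.

    `parents` maps every node to the list of its parent nodes
    (hypothetical encoding chosen for this sketch).
    """
    # A reticulation node is a node with two or more parents.
    reticulations = [v for v, ps in parents.items() if len(ps) >= 2]
    # Type-I: number of reticulation nodes.
    type_one = len(reticulations)
    # Type-II: total number of parents of reticulation nodes
    # minus the number of reticulation nodes.
    type_two = sum(len(parents[v]) for v in reticulations) - type_one
    return type_one, type_two


# Hypothetical toy network: h has three parents, g has two.
toy_network = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["root"],
    "h": ["a", "b", "c"],
    "g": ["a", "b"],
}

print(reticulation_numbers(toy_network))
```

The two counts coincide whenever every reticulation node has exactly two parents, and differ only when some node has three or more.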



Stephen J. Willson. Tree-average distances on certain phylogenetic networks have their weights uniquely determined. In ALMOB, Vol. 7(13), 2012. Keywords: from distances, from network, normal network, phylogenetic network, phylogeny, reconstruction, tree child network. Note: http://www.public.iastate.edu/~swillson/TreeAverageDis10All.pdf.
"A phylogenetic network N has vertices corresponding to species and arcs corresponding to direct genetic inheritance from the species at the tail to the species at the head. Measurements of DNA are often made on species in the leaf set, and one seeks to infer properties of the network, possibly including the graph itself. In the case of phylogenetic trees, distances between extant species are frequently used to infer the phylogenetic trees by methods such as neighbor-joining. This paper proposes a tree-average distance for networks more general than trees. The notion requires a weight on each arc measuring the genetic change along the arc. For each displayed tree the distance between two leaves is the sum of the weights along the path joining them. At a hybrid vertex, each character is inherited from one of its parents. We will assume that for each hybrid there is a probability that the inheritance of a character is from a specified parent. Assume that the inheritance events at different hybrids are independent. Then for each displayed tree there will be a probability that the inheritance of a given character follows the tree; this probability may be interpreted as the probability of the tree. The tree-average distance between the leaves is defined to be the expected value of their distance in the displayed trees. For a class of rooted networks that includes rooted trees, it is shown that the weights and the probabilities at each hybrid vertex can be calculated given the network and the tree-average distances between the leaves. Hence these weights and probabilities are uniquely determined. The hypotheses on the networks include that hybrid vertices have indegree exactly 2 and that vertices that are not leaves have a tree-child. © 2012 Willson; licensee BioMed Central Ltd."



Benjamin Albrecht,
Celine Scornavacca,
Alberto Cenci and
Daniel H. Huson. Fast computation of minimum hybridization networks. In BIO, Vol. 28(2):191-197, 2012. Keywords: explicit network, from rooted trees, minimum number, phylogenetic network, phylogeny, Program Dendroscope, Program Hybroscale, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/btr618.
"Motivation: Hybridization events in evolution may lead to incongruent gene trees. One approach to determining possible interspecific hybridization events is to compute a hybridization network that attempts to reconcile incongruent gene trees using a minimum number of hybridization events. Results: We describe how to compute a representative set of minimum hybridization networks for two given bifurcating input trees, using a parallel algorithm and provide a user-friendly implementation. A simulation study suggests that our program performs significantly better than existing software on biologically relevant data. Finally, we demonstrate the application of such methods in the context of the evolution of the Aegilops/Triticum genera. Availability and implementation: The algorithm is implemented in the program Dendroscope 3, which is freely available from www.dendroscope.org and runs on all three major operating systems. © The Author 2011. Published by Oxford University Press. All rights reserved."



Steven Kelk,
Leo van Iersel,
Nela Lekic,
Simone Linz,
Celine Scornavacca and
Leen Stougie. Cycle killer... qu'est-ce que c'est? On the comparative approximability of hybridization number and directed feedback vertex set. In SIDMA, Vol. 26(4):1635-1656, 2012. Keywords: agreement forest, approximation, explicit network, from rooted trees, minimum number, phylogenetic network, phylogeny, Program CycleKiller, reconstruction. Note: http://arxiv.org/abs/1112.5359, about the title.
"We show that the problem of computing the hybridization number of two rooted binary phylogenetic trees on the same set of taxa X has a constant factor polynomial-time approximation if and only if the problem of computing a minimum-size feedback vertex set in a directed graph (DFVS) has a constant factor polynomial-time approximation. The latter problem, which asks for a minimum number of vertices to be removed from a directed graph to transform it into a directed acyclic graph, is one of the problems in Karp's seminal 1972 list of 21 NP-complete problems. Despite considerable attention from the combinatorial optimization community, it remains to this day unknown whether a constant factor polynomial-time approximation exists for DFVS. Our result thus places the (in)approximability of hybridization number in a much broader complexity context, and as a consequence we obtain that it inherits inapproximability results from the problem Vertex Cover. On the positive side, we use results from the DFVS literature to give an O(log r log log r) approximation for the hybridization number where r is the correct value. Copyright © by SIAM."



Rosalba Radice. A Bayesian Approach to Modelling Reticulation Events with Application to the Ribosomal Protein Gene rps11 of Flowering Plants. In Australian & New Zealand Journal of Statistics, Vol. 54(4):401-426, 2012. Keywords: bayesian, phylogenetic network, phylogeny, reconstruction, statistical model.
"Traditional phylogenetic inference assumes that the history of a set of taxa can be explained by a tree. This assumption is often violated as some biological entities can exchange genetic material giving rise to non-treelike events often called reticulations. Failure to consider these events might result in incorrectly inferred phylogenies. Phylogenetic networks provide a flexible tool which allows researchers to model the evolutionary history of a set of organisms in the presence of reticulation events. In recent years, a number of methods addressing phylogenetic network parameter estimation have been introduced. Some of them are based on the idea that a phylogenetic network can be defined as a directed acyclic graph. Based on this definition, we propose a Bayesian approach to the estimation of phylogenetic network parameters which allows for different phylogenies to be inferred at different parts of a multiple DNA alignment. The algorithm is tested on simulated data and applied to the ribosomal protein gene rps11 data from five flowering plants, where reticulation events are suspected to be present. The proposed approach can be applied to a wide variety of problems which aim at exploring the possibility of reticulation events in the history of a set of taxa. © 2012 Australian Statistical Publishing Association Inc. Published by Wiley Publishing Asia Pty Ltd."



Philippe Gambette,
Vincent Berry and
Christophe Paul. Quartets and Unrooted Phylogenetic Networks. In JBCB, Vol. 10(4):1250004, 2012. Keywords: abstract network, circular split system, explicit network, from quartets, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction, split, split network. Note: http://hal.archivesouvertes.fr/hal00678046/en/.
"Phylogenetic networks were introduced to describe evolution in the presence of exchanges of genetic material between coexisting species or individuals. Split networks in particular were introduced as a special kind of abstract network to visualize conflicts between phylogenetic trees which may correspond to such exchanges. More recently, methods were designed to reconstruct explicit phylogenetic networks (whose vertices can be interpreted as biological events) from triplet data. In this article, we link abstract and explicit networks through their combinatorial properties, by introducing the unrooted analog of level-k networks. In particular, we give an equivalence theorem between circular split systems and unrooted level-1 networks. We also show how to adapt to quartets some existing results on triplets, in order to reconstruct unrooted level-k phylogenetic networks. These results give an interesting perspective on the combinatorics of phylogenetic networks and also raise algorithmic and combinatorial questions. © 2012 Imperial College Press."



Yun Yu,
James H. Degnan and
Luay Nakhleh. The probability of a gene tree topology within a phylogenetic network with applications to hybridization detection. In PLoS Genetics, Vol. 8(4):e1002660, 2012. Keywords: AIC, BIC, explicit network, hybridization, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1371/journal.pgen.1002660.
"Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa. © 2012 Yu et al."



Reza Hassanzadeh,
Changiz Eslahchi and
Wing-Kin Sung. Constructing phylogenetic supernetworks based on simulated annealing. In MPE, Vol. 63(3):738-744, 2012. Keywords: abstract network, from unrooted trees, heuristic, phylogenetic network, phylogeny, Program SNSA, reconstruction, simulated annealing, software, split network. Note: http://dx.doi.org/10.1016/j.ympev.2012.02.009.
"Different partial phylogenetic trees can be derived from different sources of evidence and different methods. One important problem is to summarize these partial phylogenetic trees using a supernetwork. We propose a novel simulated annealing based method called SNSA which uses an optimization function to produce a simple network that still retains a great deal of phylogenetic information. We report the performance of this new method on real and simulated datasets. © 2012 Elsevier Inc."





Jesper Jansson and
Andrzej Lingas. Computing the rooted triplet distance between galled trees by counting triangles. In CPM12, Vol. 7354:385-398 of LNCS, Springer, 2012. Keywords: distance between networks, explicit network, from network, galled tree, phylogenetic network, phylogeny, polynomial, triplet distance. Note: http://www.df.lth.se/~jj/Publications/d_rt_for_Galled_Trees5_CPM_2012.pdf.
"We consider a generalization of the rooted triplet distance between two phylogenetic trees to two phylogenetic networks. We show that if each of the two given phylogenetic networks is a so-called galled tree with n leaves then the rooted triplet distance can be computed in o(n^2.688) time. Our upper bound is obtained by reducing the problem of computing the rooted triplet distance to that of counting monochromatic and almost monochromatic triangles in an undirected, edge-colored graph. To count different types of colored triangles in a graph efficiently, we extend an existing technique based on matrix multiplication and obtain several new related results that may be of independent interest. © 2012 Springer-Verlag."



Tetsuo Asano,
Jesper Jansson,
Kunihiko Sadakane,
Ryuhei Uehara and
Gabriel Valiente. Faster computation of the Robinson–Foulds distance between phylogenetic networks. In Information Sciences, Vol. 197:77-90, 2012. Keywords: distance between networks, explicit network, level k phylogenetic network, phylogenetic network, polynomial, spread.
"The Robinson-Foulds distance, a widely used metric for comparing phylogenetic trees, has recently been generalized to phylogenetic networks. Given two phylogenetic networks N1, N2 with n leaf labels and at most m nodes and e edges each, the Robinson-Foulds distance measures the number of clusters of descendant leaves not shared by N1 and N2. The fastest known algorithm for computing the Robinson-Foulds distance between N1 and N2 runs in O(me) time. In this paper, we improve the time complexity to O(ne/log n) for general phylogenetic networks and O(nm/log n) for general phylogenetic networks with bounded degree (assuming the word RAM model with a word length of ⌈log n⌉ bits), and to optimal O(m) time for leaf-outerplanar networks as well as optimal O(n) time for level-1 phylogenetic networks (that is, galled trees). We also introduce the natural concept of the minimum spread of a phylogenetic network and show how the running time of our new algorithm depends on this parameter. As an example, we prove that the minimum spread of a level-k network is at most k + 1, which implies that for one level-1 and one level-k phylogenetic network, our algorithm runs in O((k + 1)e) time. © 2012 Elsevier Inc. All rights reserved."
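The cluster-based definition in this abstract can be illustrated with a naive computation. The sketch below is not the faster algorithm of the paper; it simply collects, for every node, the set of leaves below it and counts the clusters that occur in exactly one network. The names `clusters`, `rf_distance`, the child-list encoding and the two example networks are assumptions of this sketch, and some authors halve the resulting count.

```python
def clusters(children):
    """Return the set of leaf clusters of a rooted network.

    `children` maps each internal node to the list of its children;
    nodes that never appear as a key (or map to []) are leaves.
    """
    memo = {}

    def below(v):
        # Leaves descending from v, memoised so that shared
        # (reticulate) parts of the network are visited once.
        if v not in memo:
            kids = children.get(v, [])
            if not kids:
                memo[v] = frozenset([v])
            else:
                memo[v] = frozenset().union(*(below(c) for c in kids))
        return memo[v]

    nodes = set(children) | {c for kids in children.values() for c in kids}
    return {below(v) for v in nodes}


def rf_distance(net1, net2):
    # Number of clusters present in exactly one of the two networks.
    return len(clusters(net1) ^ clusters(net2))


# Hypothetical examples: a tree ((a,b),c) and a network in which
# leaf b hangs below a reticulation node h with parents x and y.
tree = {"root": ["x", "c"], "x": ["a", "b"]}
network = {"root": ["x", "y"], "x": ["a", "h"], "y": ["h", "c"], "h": ["b"]}

print(rf_distance(tree, network))
```

Here the only unshared cluster is {b, c}, which arises at node y of the network.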



Lavanya Kannan and
Ward C Wheeler. Maximum Parsimony on Phylogenetic Networks. In ALMOB, Vol. 7:9, 2012. Keywords: dynamic programming, explicit network, from sequences, heuristic, parsimony, phylogenetic network, phylogeny. Note: http://dx.doi.org/10.1186/1748718879.
"Background: Phylogenetic networks are generalizations of phylogenetic trees, that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned on the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past. Results: In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network; and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as Sankoff and Fitch algorithms extend naturally for networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied for any cost matrix that may contain unequal substitution costs of transforming between different characters along different edges of the network. We analyzed this for experimental data on 10 leaves or fewer with at most 2 reticulations and found that for almost all networks, the bounds returned by the heuristics matched with the exhaustively determined optimum parsimony scores. Conclusion: The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add additional cost considerations to prefer simpler structures, such as trees over networks.
The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges which are common to all the branching patterns introduced by the reticulate vertices. Thus the score contains an inbuilt cost for the number of reticulate vertices in the network, and would provide a criterion that is comparable among all networks. Although the problem of finding the parsimony score on the network is believed to be computationally hard to solve, heuristics such as the ones described here would be beneficial in our efforts to find a most parsimonious network. © 2012 Kannan and Wheeler; licensee BioMed Central Ltd."



Alix Boc,
Alpha B. Diallo and
Vladimir Makarenkov. T-REX: a web server for inferring, validating and visualizing phylogenetic trees and networks. In NAR, Vol. 40(W1):W573-W579, 2012. Keywords: from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, Program T REX, reconstruction, reticulogram, software. Note: http://dx.doi.org/10.1093/nar/gks485.
"T-REX (Tree and reticulogram REConstruction) is a web server dedicated to the reconstruction of phylogenetic trees, reticulation networks and to the inference of horizontal gene transfer (HGT) events. T-REX includes several popular bioinformatics applications such as MUSCLE, MAFFT, Neighbor Joining, NINJA, BioNJ, PhyML, RAxML, random phylogenetic tree generator and some well-known sequence-to-distance transformation models. It also comprises fast and effective methods for inferring phylogenetic trees from complete and incomplete distance matrices as well as for reconstructing reticulograms and HGT networks, including the detection and validation of complete and partial gene transfers, inference of consensus HGT scenarios and interactive HGT identification, developed by the authors. The included methods allow for validating and visualizing phylogenetic trees and networks which can be built from distance or sequence data. The web server is available at: www.trex.uqam.ca. © 2012 The Author(s)."



Daniel H. Huson and
Celine Scornavacca. Dendroscope 3: An Interactive Tool for Rooted Phylogenetic Trees and Networks. In Systematic Biology, Vol. 61(6):1061-1067, 2012. Keywords: from rooted trees, from triplets, phylogenetic network, phylogeny, Program Dendroscope, reconstruction, software, visualization.
"Dendroscope 3 is a new program for working with rooted phylogenetic trees and networks. It provides a number of methods for drawing and comparing rooted phylogenetic networks, and for computing them from rooted trees. The program can be used interactively or in command-line mode. The program is written in Java, use of the software is free, and installers for all 3 major operating systems can be downloaded from www.dendroscope.org. [Phylogenetic trees; phylogenetic networks; software.] © 2012 The Author(s)."



Zhi-Zhong Chen,
Lusheng Wang and
Satoshi Yamanaka. A fast tool for minimum hybridization networks. In BMCB, Vol. 13:155, 2012. Keywords: agreement forest, explicit network, from rooted trees, phylogenetic network, phylogeny, Program FastHN, reconstruction, software. Note: http://dx.doi.org/10.1186/1471-2105-13-155.
"Background: Due to hybridization events in evolution, studying two different genes of a set of species may yield two related but different phylogenetic trees for the set of species. In this case, we want to combine the two phylogenetic trees into a hybridization network with the fewest hybridization events. This leads to three computational problems, namely, the problem of computing the minimum size of a hybridization network, the problem of constructing one minimum hybridization network, and the problem of enumerating a representative set of minimum hybridization networks. The previously best software tools for these problems (namely, Chen and Wang's HybridNet and Albrecht et al.'s Dendroscope 3) run very slowly for large instances that cannot be reduced to relatively small instances. Indeed, when the minimum size of a hybridization network of two given trees is larger than 23 and the problem for the trees cannot be reduced to relatively smaller independent subproblems, then HybridNet almost always takes longer than 1 day and Dendroscope 3 often fails to complete. Thus, a faster software tool for the problems is in need. Results: We develop a software tool in ANSI C, named FastHN, for the following problems: Computing the minimum size of a hybridization network, constructing one minimum hybridization network, and enumerating a representative set of minimum hybridization networks. We obtain FastHN by refining HybridNet with three ideas. The first idea is to preprocess the input trees so that the trees become smaller or the problem becomes to solve two or more relatively smaller independent subproblems. The second idea is to use a fast algorithm for computing the rSPR distance of two given phylogenetic trees to cut more branches of the search tree in the exhaustive-search stage of the algorithm. The third idea is that during the exhaustive-search stage of the algorithm, we find two sibling leaves in one of the two forests (obtained from the given trees by cutting some edges) such that they are as far as possible in the other forest. As the result, FastHN always runs much faster than HybridNet. Unlike Dendroscope 3, FastHN is a single-threaded program. Despite this disadvantage, our experimental data shows that FastHN runs substantially faster than the multi-threaded Dendroscope 3 on a PC with multiple cores. Indeed, FastHN can finish within 16 minutes (on average on a Windows 7 (x64) desktop PC with i7-2600 CPU) even if the minimum size of a hybridization network of two given trees is about 25, the trees each have 100 leaves, and the problem for the input trees cannot be reduced to two or more independent subproblems via cluster reductions. It is also worth mentioning that like HybridNet, FastHN does not use much memory (indeed, the amount of memory is at most quadratic in the input size). In contrast, Dendroscope 3 uses a huge amount of memory. Executables of FastHN for Windows XP (x86), Windows 7 (x64), Linux, and Mac OS are available (see the Results and discussion section for details). Conclusions: For both biological datasets and simulated datasets, our experimental results show that FastHN runs substantially faster than HybridNet and Dendroscope 3. The superiority of FastHN in speed over the previous tools becomes more significant as the hybridization number becomes larger. In addition, FastHN uses much less memory than Dendroscope 3 and uses the same amount of memory as HybridNet. © 2012 Chen et al.; licensee BioMed Central Ltd."



Michel Habib and
Thu-Hien To. Constructing a Minimum Phylogenetic Network from a Dense Triplet Set. In JBCB, Vol. 10(5):1250013, 2012. Keywords: explicit network, from triplets, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/1103.2266.
"For a given set L of species and a set T of triplets on L, we seek to construct a phylogenetic network which is consistent with T i.e. which represents all triplets of T. The level of a network is defined as the maximum number of hybrid vertices in its biconnected components. When T is dense, there exist polynomial time algorithms to construct level-0, 1 and 2 networks (Aho et al., 1981; Jansson, Nguyen and Sung, 2006; Jansson and Sung, 2006; Iersel et al., 2009). For higher levels, partial answers were obtained in the paper by Iersel and Kelk (2008), with a polynomial time algorithm for simple networks. In this paper, we detail the first complete answer for the general case, solving a problem proposed in Jansson and Sung (2006) and Iersel et al. (2009). For any k fixed, it is possible to construct a level-k network having the minimum number of hybrid vertices and consistent with T, if there is any, in time $O(|T|^{k+1} n^{\lfloor 4k/3 \rfloor + 1})$. © 2012 Imperial College Press."

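As a rough illustration of what "consistent with T" means in the simplest (level-0, i.e. tree) case of the entry above, the following Python sketch checks whether a rooted tree displays every triplet in a set. It is not the algorithm of the paper: the child-to-parent dictionary encoding, the convention that a tuple (a, b, c) stands for the triplet ab|c, and all function names are assumptions made for this example only.

```python
# Illustrative sketch only: the tree encoding (a child -> parent dict) and all
# function names below are hypothetical, not taken from the cited paper.

def path_to_root(parent, v):
    """Return the list of nodes from v up to the root, starting with v."""
    path = [v]
    while v in parent:
        v = parent[v]
        path.append(v)
    return path


def lca(parent, a, b):
    """Lowest common ancestor of a and b in the rooted tree."""
    on_path = set(path_to_root(parent, a))
    for v in path_to_root(parent, b):
        if v in on_path:
            return v
    raise ValueError("nodes are not in the same tree")


def depth(parent, v):
    """Number of edges between v and the root."""
    return len(path_to_root(parent, v)) - 1


def displays(parent, triplet):
    """True if the tree displays ab|c for triplet = (a, b, c).

    Both LCAs are ancestors of a, so they lie on one root path and
    comparing depths tells which of them is strictly lower.
    """
    a, b, c = triplet
    return depth(parent, lca(parent, a, b)) > depth(parent, lca(parent, a, c))


def consistent(parent, triplets):
    """True if the tree displays every triplet in the collection."""
    return all(displays(parent, t) for t in triplets)


# The tree ((a,b),(c,d)) with internal nodes u, v and root r.
tree = {"a": "u", "b": "u", "c": "v", "d": "v", "u": "r", "v": "r"}
print(consistent(tree, [("a", "b", "c"), ("c", "d", "a")]))  # True
print(displays(tree, ("a", "c", "b")))                       # False
```

For networks of level 1 or higher a triplet may be displayed along several paths through hybrid vertices, so this depth comparison no longer suffices; the sketch covers trees only.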


Ruogu Sheng and
Sergey Bereg. Approximating Metrics with Planar Boundary-Labeled Phylogenetic Networks. In JBCB, Vol. 10(6):1250017, 2012. Keywords: abstract network, from distances, phylogenetic network, phylogeny, reconstruction.
"Phylogenetic networks are useful for visualizing evolutionary relationships between species with reticulate events such as hybridizations and horizontal gene transfers. In this paper, we consider the problem of constructing undirected phylogenetic networks that (1) are planar graphs and (2) admit embeddings in the plane where the vertices labeling all taxa are on the boundary of the network. We develop a new algorithm for constructing phylogenetic networks satisfying these constraints. First, we show that only approximate networks can be constructed for some distance matrices with at least five taxa. Then we prove that any five-point metric can be represented approximately by a planar boundary-labeled network with guaranteed fit value of 94.79. We extend the networks constructed in the proof to design an algorithm for computing planar boundary-labeled networks for any number of taxa. © 2012 Imperial College Press."



Joseph K. Pickrell and
Jonathan K. Pritchard. Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data. In PLoS Genetics, Vol. 8(11):e1002967, 2012. Keywords: explicit network, heuristic, likelihood, phylogenetic network, phylogeny, population genetics, Program TreeMix. Note: http://dx.doi.org/10.1371/journal.pgen.1002967.
"Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and "ancient" Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com. © 2012 Pickrell, Pritchard."



Nick J. Patterson,
Priya Moorjani,
Yontao Luo,
Swapan Mallick,
Nadin Rohland,
Yiping Zhan,
Teri Genschoreck,
Teresa Webster and
David Reich. Ancient Admixture in Human History. In Genetics, Vol. 192(3):1065-1093, 2012. Keywords: explicit network, phylogenetic network, phylogeny, population genetics, Program AdmixTools. Note: http://genetics.med.harvard.edu/reich/Reich_Lab/Welcome_files/2012_Patterson_AncientAdmixture_Genetics.pdf.
"Population mixture is an important process in biology. We present a suite of methods for learning about population mixtures, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture. We also describe the development of a new single nucleotide polymorphism (SNP) array consisting of 629,433 sites with clearly documented ascertainment that was specifically designed for population genetic analyses and that we genotyped in 934 individuals from 53 diverse populations. To illustrate the methods, we give a number of examples that provide new insights about the history of human admixture. The most striking finding is a clear signal of admixture into northern Europe, with one ancestral population related to present-day Basques and Sardinians and the other related to present-day populations of northeast Asia and the Americas. This likely reflects a history of admixture between Neolithic migrants and the indigenous Mesolithic population of Europe, consistent with recent analyses of ancient bones from Sweden and the sequencing of the genome of the Tyrolean "Iceman." © 2012 by the Genetics Society of America."



Katharina Huber,
Vincent Moulton,
Andreas Spillner,
Sabine Storandt and
Radoslaw Suchecki. Computing a consensus of multilabeled trees. In ALENEX12, Pages 84-92, 2012. Keywords: duplication, explicit network, exponential algorithm, phylogenetic network, phylogeny. Note: http://siam.omnibooksonline.com/2012ALENEX/data/papers/020.pdf.
"In this paper we consider two challenging problems that arise in the context of computing a consensus of a collection of multilabeled trees, namely (1) selecting a compatible collection of clusters on a multiset from an ordered list of such clusters and (2) optimally refining high degree vertices in a multilabeled tree. Forming such a consensus is part of an approach to reconstruct the evolutionary history of a set of species for which events such as genome duplication and hybridization have occurred in the past. We present exact algorithms for solving (1) and (2) that have an exponential runtime in the worst case. To give some impression of their performance in practice, we apply them to simulated input and to a real biological data set highlighting the impact of several structural properties of the input on the performance."



