
Katharina Huber,
Leo van Iersel,
Vincent Moulton,
Celine Scornavacca and
Taoyang Wu. Reconstructing phylogenetic level1 networks from nondense binet and trinet sets. In ALG, Vol. 77(1):173200, 2017. Keywords: explicit network, FPT, from binets, from trinets, NP complete, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/1411.6804.





Leo van Iersel,
Steven Kelk,
Giorgios Stamoulis,
Leen Stougie and
Olivier Boes. On unrooted and rootuncertain variants of several wellknown phylogenetic network problems. In ALG, 2017. Keywords: explicit network, FPT, from network, from unrooted trees, NP complete, phylogenetic network, phylogeny, reconstruction, tree containment. Note: https://hal.inria.fr/hal01599716, to appear.



Sha Zhu and
James H. Degnan. Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent. In SB, Vol. 66(2):283298, 2017. Keywords: branch length, coalescent, explicit network, from network, likelihood, phylogenetic network, phylogeny, Program Hybridcoal, Program HybridLambda, Program PhyloNet, software, uniqueness. Note: presentation available at https://www.youtube.com/watch?v=JLYGTfEZG7g.









Bingxin Lu,
Louxin Zhang and
Hon Wai Leong. A program to compute the soft RobinsonFoulds distance between phylogenetic networks. In APBC17, Vol. 18(Suppl. 2):111 of BMC Genomics, 2017. Keywords: cluster containment, distance between networks, explicit network, exponential algorithm, from network, phylogenetic network, phylogeny, Program iceluPhyloNetwork. Note: http://dx.doi.org/10.1186/s1286401735005.







Magnus Bordewich,
Charles Semple and
Nihan Tokac. Constructing treechild networks from distance matrices. In Algorithmica, 2017. Keywords: compressed network, explicit network, from distances, phylogenetic network, phylogeny, polynomial, reconstruction, tree child network, uniqueness. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSN17.pdf, to appear.



Leo van Iersel,
Vincent Moulton,
Eveline De Swart and
Taoyang Wu. Binets: fundamental building blocks for phylogenetic networks. In BMB, Vol. 79(5):11351154, 2017. Keywords: approximation, explicit network, from binets, galled tree, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1007/s1153801702754.



Philippe Gambette,
Andreas Gunawan,
Anthony Labarre,
Stéphane Vialette and
Louxin Zhang. Solving the Tree Containment Problem in Linear Time for Nearly Stable Phylogenetic Networks. In DAM, 2017. Keywords: explicit network, from network, from rooted trees, nearlystable network, phylogenetic network, phylogeny, polynomial, tree containment. Note: https://halupecupem.archivesouvertes.fr/hal01575001/en/, to appear.

















Leo van Iersel,
Steven Kelk,
Nela Lekic,
Chris Whidden and
Norbert Zeh. Hybridization Number on Three Rooted Binary Trees is EPT. In SIDMA, Vol. 30(3):16071631, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1402.2136.



Katharina Huber,
Vincent Moulton,
Mike Steel and
Taoyang Wu. Folding and unfolding phylogenetic trees and networks. In JOMB, Vol. 73(6):17611780, 2016. Keywords: compressed network, explicit network, FUstable network, NP complete, phylogenetic network, phylogeny, tree containment, tree sibling network. Note: http://arxiv.org/abs/1506.04438.





Steven Kelk,
Leo van Iersel,
Celine Scornavacca and
Mathias Weller. Phylogenetic incongruence through the lens of Monadic Second Order logic. In JGAA, Vol. 20(2):189215, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, MSOL, phylogenetic network, phylogeny, reconstruction. Note: http://jgaa.info/accepted/2016/KelkIerselScornavaccaWeller2016.20.2.pdf.



Philippe Gambette,
Andreas Gunawan,
Anthony Labarre,
Stéphane Vialette and
Louxin Zhang. Solving the Tree Containment Problem for Genetically Stable Networks in Quadratic Time. In IWOCA15, Vol. 9538:197208 of LNCS, springer, 2016. Keywords: explicit network, from network, from rooted trees, genetically stable network, phylogenetic network, phylogeny, polynomial, tree containment. Note: https://halupecupem.archivesouvertes.fr/hal01226035 .



Vincent Ranwez,
Celine Scornavacca,
JeanPhilippe Doyon and
Vincent Berry. Inferring gene duplications, transfers and losses can be done in a discrete framework. In JOMB, Vol. 72(7):18111844, 2016. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, reconstruction.





Hussein A. Hejase and
Kevin J. Liu. A scalability study of phylogenetic network inference methods using empirical datasets and simulations involving a single reticulation. Vol. 17(422):112, 2016. Keywords: abstract network, evaluation, from sequences, phylogenetic network, phylogeny, Program PhyloNet, Program PhyloNetworks SNaQ, reconstruction, simulation, unicyclic network. Note: http://dx.doi.org/10.1186/s1285901612771.



Philippe Gambette,
Leo van Iersel,
Steven Kelk,
Fabio Pardi and
Celine Scornavacca. Do branch lengths help to locate a tree in a phylogenetic network? In BMB, Vol. 78(9):17731795, 2016. Keywords: branch length, explicit network, FPT, from network, from rooted trees, NP complete, phylogenetic network, phylogeny, pseudopolynomial, time consistent network, tree containment, tree sibling network. Note: http://arxiv.org/abs/1607.06285.









Maria Anaya,
Olga AnipchenkoUlaj,
Aisha Ashfaq,
Joyce Chiu,
Mahedi Kaiser,
Max Shoji Ohsawa,
Megan Owen,
Ella Pavlechko,
Katherine St. John,
Shivam Suleria,
Keith Thompson and
Corrine Yap. On Determining if Treebased Networks Contain Fixed Trees. In BMB, Vol. 78(5):961969, 2016. Keywords: explicit network, FPT, NP complete, phylogenetic network, phylogeny, treebased network. Note: http://arxiv.org/abs/1602.02739.





James Oldman,
Taoyang Wu,
Leo van Iersel and
Vincent Moulton. TriLoNet: Piecing together small networks to reconstruct reticulate evolutionary histories. In MBE, Vol. 33(8):21512162, 2016. Keywords: explicit network, from trinets, galled tree, phylogenetic network, phylogeny, Program LEV1ATHAN, Program TriLoNet, reconstruction.







Leo van Iersel,
Steven Kelk and
Celine Scornavacca. Kernelizations for the hybridization number problem on multiple nonbinary trees. In JCSS, Vol. 82(6):10751089, 2016. Keywords: explicit network, from rooted trees, kernelization, minimum number, phylogenetic network, phylogeny, Program Treeduce, reconstruction. Note: https://arxiv.org/abs/1311.4045v3.



Mareike Fischer,
Leo van Iersel,
Steven Kelk and
Celine Scornavacca. On Computing The Maximum Parsimony Score Of A Phylogenetic Network. In SIDMA, Vol. 29(1):559585, 2015. Keywords: APX hard, cluster containment, explicit network, FPT, from network, from sequences, integer linear programming, level k phylogenetic network, NP complete, parsimony, phylogenetic network, phylogeny, polynomial, Program MPNet, reconstruction, software. Note: http://arxiv.org/abs/1302.2430.







Philippe Gambette,
Andreas Gunawan,
Anthony Labarre,
Stéphane Vialette and
Louxin Zhang. Locating a Tree in A Phylogenetic Network in Quadratic Time. In RECOMB15, Vol. 9029:96107 of LNCS, Springer, 2015. Keywords: evaluation, explicit network, from network, from rooted trees, genetically stable network, nearlystable network, phylogenetic network, phylogeny, polynomial, tree containment. Note: https://hal.archivesouvertes.fr/hal01116231/en.



Jittat Fakcharoenphol,
Tanee Kumpijit and
Attakorn Putwattana. A Faster Algorithm for the Tree Containment Problem for Binary Nearly Stable Phylogenetic Networks. In Proceedings of the The 12th International Joint Conference on Computer Science and Software Engineering (JCSSE'15), Pages 337342, IEEE, 2015. Keywords: dynamic programming, explicit network, from network, from rooted trees, nearlystable network, phylogenetic network, phylogeny, polynomial, tree containment.



Benjamin Albrecht. Computing all hybridization networks for multiple binary phylogenetic input trees. In BMCB, Vol. 16(236):115, 2015. Keywords: agreement forest, explicit network, exponential algorithm, FPT, from rooted trees, phylogenetic network, phylogeny, Program Hybroscale, Program PIRN, reconstruction. Note: http://dx.doi.org/10.1186/s1285901506607.



Maxime Morgado. Propriétés structurelles et relations des classes de réseaux phylogénétiques. Master's thesis, ENS Cachan, 2015. Keywords: compressed network, distinctcluster network, explicit network, galled network, galled tree, level k phylogenetic network, nested network, normal network, phylogenetic network, phylogeny, regular network, spread, tree child network, tree containment, tree sibling network, treebased network, unicyclic network.



Yun Yu and
Luay Nakhleh. A maximum pseudolikelihood approach for phylogenetic networks. In RECOMBCG15, Vol. 16(Suppl 10)(S10):110 of BMC Genomics, BioMed Central, 2015. Keywords: explicit network, from rooted trees, hybridization, incomplete lineage sorting, likelihood, phylogenetic network, phylogeny, Program PhyloNet, reconstruction, tripartition distance. Note: http://dx.doi.org/10.1186/1471216416S10S10.



Gergely J. Szöllösi,
Adrián Arellano Davín,
Eric Tannier,
Vincent Daubin and
Bastien Boussau. Genomescale phylogenetic analysis finds extensive gene transfer among fungi. In Philosophical Transactions of the Royal Society of London B: Biological Sciences, Vol. 370(1678):111, 2015. Keywords: duplication, from sequences, lateral gene transfer, loss, phylogenetic network, phylogeny, Program ALE, reconstruction. Note: http://dx.doi.org/10.1098/rstb.2014.0335.







Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. The comparison of treesibling time consistent phylogenetic networks is graphisomorphism complete. In The Scientific World Journal, Vol. 2014(254279):16, 2014. Keywords: abstract network, distance between networks, from network, isomorphism, phylogenetic network, tree sibling network. Note: http://arxiv.org/abs/0902.4640.
Toggle abstract
"Several polynomial time computable metrics on the class of semibinary treesibling time consistent phylogenetic networks are available in the literature; in particular, the problem of deciding if two networks of this kind are isomorphic is in P. In this paper, we show that if we remove the semibinarity condition, then the problem becomes much harder. More precisely, we prove that the isomorphism problem for generic treesibling time consistent phylogenetic networks is polynomially equivalent to the graph isomorphism problem. Since the latter is believed not to belong to P, the chances are that it is impossible to define a metric on the class of all treesibling time consistent phylogenetic networks that can be computed in polynomial time. © 2014 Gabriel Cardona et al."



Steven Kelk and
Celine Scornavacca. Constructing minimal phylogenetic networks from softwired clusters is fixed parameter tractable. In ALG, Vol. 68(4):886915, 2014. Keywords: explicit network, FPT, from clusters, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1108.3653.
Toggle abstract
"Here we show that, given a set of clusters C on a set of taxa X, where X=n, it is possible to determine in time f(k)×poly(n) whether there exists a level≤k network (i.e. a network where each biconnected component has reticulation number at most k) that represents all the clusters in C in the softwired sense, and if so to construct such a network. This extends a result from Kelk et al. (in IEEE/ACM Trans. Comput. Biol. Bioinform. 9:517534, 2012) which showed that the problem is polynomialtime solvable for fixed k. By defining "kreticulation generators" analogous to "levelk generators", we then extend this fixed parameter tractability result to the problem where k refers not to the level but to the reticulation number of the whole network. © 2012 Springer Science+Business Media New York."





Leo van Iersel and
Vincent Moulton. Trinets encode treechild and level2 phylogenetic networks. In JOMB, Vol. 68(7):17071729, 2014. Keywords: explicit network, from trinets, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1210.0362.
Toggle abstract
"Phylogenetic networks generalize evolutionary trees, and are commonly used to represent evolutionary histories of species that undergo reticulate evolutionary processes such as hybridization, recombination and lateral gene transfer. Recently, there has been great interest in trying to develop methods to construct rooted phylogenetic networks from triplets, that is rooted trees on three species. However, although triplets determine or encode rooted phylogenetic trees, they do not in general encode rooted phylogenetic networks, which is a potential issue for any such method. Motivated by this fact, Huber and Moulton recently introduced trinets as a natural extension of rooted triplets to networks. In particular, they showed that level1 phylogenetic networks are encoded by their trinets, and also conjectured that all "recoverable" rooted phylogenetic networks are encoded by their trinets. Here we prove that recoverable binary level2 networks and binary treechild networks are also encoded by their trinets. To do this we prove two decomposition theorems based on trinets which hold for all recoverable binary rooted phylogenetic networks. Our results provide some additional evidence in support of the conjecture that trinets encode all recoverable rooted phylogenetic networks, and could also lead to new approaches to construct phylogenetic networks from trinets. © 2013 SpringerVerlag Berlin Heidelberg."



Leo van Iersel and
Steven Kelk. Kernelizations for the hybridization number problem on multiple nonbinary trees. In WG14, Vol. 8747:299311 of LNCS, springer, 2014. Keywords: explicit network, from rooted trees, kernelization, minimum number, phylogenetic network, phylogeny, Program Treeduce, reconstruction. Note: http://arxiv.org/abs/1311.4045.



Lavanya Kannan and
Ward C Wheeler. Exactly Computing the Parsimony Scores on Phylogenetic Networks Using Dynamic Programming. In JCB, Vol. 21(4):303319, 2014. Keywords: explicit network, exponential algorithm, from network, from sequences, parsimony, phylogenetic network, phylogeny, reconstruction.
Toggle abstract
"Scoring a given phylogenetic network is the first step that is required in searching for the best evolutionary framework for a given dataset. Using the principle of maximum parsimony, we can score phylogenetic networks based on the minimum number of state changes across a subset of edges of the network for each character that are required for a given set of characters to realize the input states at the leaves of the networks. Two such subsets of edges of networks are interesting in light of studying evolutionary histories of datasets: (i) the set of all edges of the network, and (ii) the set of all edges of a spanning tree that minimizes the score. The problems of finding the parsimony scores under these two criteria define slightly different mathematical problems that are both NPhard. In this article, we show that both problems, with scores generalized to adding substitution costs between states on the endpoints of the edges, can be solved exactly using dynamic programming. We show that our algorithms require O(mpk) storage at each vertex (per character), where k is the number of states the character can take, p is the number of reticulate vertices in the network, m = k for the problem with edge set (i), and m = 2 for the problem with edge set (ii). This establishes an O(nmpk2) algorithm for both the problems (n is the number of leaves in the network), which are extensions of Sankoff's algorithm for finding the parsimony scores for phylogenetic trees. We will discuss improvements in the complexities and show that for phylogenetic networks whose underlying undirected graphs have disjoint cycles, the storage at each vertex can be reduced to O(mk), thus making the algorithm polynomial for this class of networks. We will present some properties of the two approaches and guidance on choosing between the criteria, as well as traverse through the network space using either of the definitions. We show that our methodology provides an effective means to study a wide variety of datasets. © Copyright 2014, Mary Ann Liebert, Inc. 2014."



JohannMattis List,
Shijulal NelsonSathi,
Hans Geisler and
William Martin. Networks of lexical borrowing and lateral gene transfer in language and genome evolution. In BioEssays, Vol. 36(2):141150, 2014. Keywords: explicit network, minimal lateral network, phylogenetic network, Program lingpy. Note: http://dx.doi.org/10.1002/bies.201300096.
Toggle abstract
"Like biological species, languages change over time. As noted by Darwin, there are many parallels between language evolution and biological evolution. Insights into these parallels have also undergone change in the past 150 years. Just like genes, words change over time, and language evolution can be likened to genome evolution accordingly, but what kind of evolution? There are fundamental differences between eukaryotic and prokaryotic evolution. In the former, natural variation entails the gradual accumulation of minor mutations in alleles. In the latter, lateral gene transfer is an integral mechanism of natural variation. The study of language evolution using biological methods has attracted much interest of late, most approaches focusing on language tree construction. These approaches may underestimate the important role that borrowing plays in language evolution. Network approaches that were originally designed to study lateral gene transfer may provide more realistic insights into the complexities of language evolution. Editor's suggested further reading in BioEssays Linguistic evidence supports date for Homeric epics. © 2014 The Authors. BioEssays Published by WILEY Periodicals, Inc."



Zhijiang Li. FixedParameter Algorithm for Hybridization Number of Two Multifurcating Trees. Master's thesis, Dalhousie University, Canada, 2014. Keywords: agreement forest, explicit network, FPT, from rooted trees, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://hdl.handle.net/10222/53976.



Paul Cordue,
Simone Linz and
Charles Semple. Phylogenetic Networks that Display a Tree Twice. In BMB, Vol. 76(10):26642679, 2014. Keywords: from rooted trees, normal network, phylogenetic network, phylogeny, reconstruction, tree child network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/CLS14.pdf.
Toggle abstract
"In the last decade, the use of phylogenetic networks to analyze the evolution of species whose past is likely to include reticulation events, such as horizontal gene transfer or hybridization, has gained popularity among evolutionary biologists. Nevertheless, the evolution of a particular gene can generally be described without reticulation events and therefore be represented by a phylogenetic tree. While this is not in contrast to each other, it places emphasis on the necessity of algorithms that analyze and summarize the treelike information that is contained in a phylogenetic network. We contribute to the toolbox of such algorithms by investigating the question of whether or not a phylogenetic network embeds a tree twice and give a quadratictime algorithm to solve this problem for a class of networks that is more general than treechild networks. © 2014, Society for Mathematical Biology."





Adrià Alcalà Mena,
Mercè Llabrés,
Francesc Rosselló and
Pau Rullan. TreeChild Cluster Networks. In Fundamenta Informaticae, Vol. 134(12):115, 2014. Keywords: explicit network, from clusters, phylogenetic network, phylogeny, Program PhyloNetwork, reconstruction, tree child network.



Leo van Iersel and
Simone Linz. A quadratic kernel for computing the hybridization number of multiple trees. In IPL, Vol. 113:318323, 2013. Keywords: explicit network, FPT, from rooted trees, kernelization, minimum number, phylogenetic network, phylogeny, Program Clustistic, Program MaafB, Program PIRN, reconstruction. Note: http://arxiv.org/abs/1203.4067, poster.
Toggle abstract
"It has recently been shown that the NPhard problem of calculating the minimum number of hybridization events that is needed to explain a set of rooted binary phylogenetic trees by means of a hybridization network is fixedparameter tractable if an instance of the problem consists of precisely two such trees. In this paper, we show that this problem remains fixedparameter tractable for an arbitrarily large set of rooted binary phylogenetic trees. In particular, we present a quadratic kernel. © 2013 Elsevier B.V."



Chris Whidden,
Robert G. Beiko and
Norbert Zeh. FixedParameter Algorithms for Maximum Agreement Forests. In SICOMP, Vol. 42(4):14311466, 2013. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: http://arxiv.org/abs/1108.2664, slides.
Toggle abstract
"We present new and improved fixedparameter algorithms for computing maximum agreement forests of pairs of rooted binary phylogenetic trees. The size of such a forest for two trees corresponds to their subtree pruneandregraft distance and, if the agreement forest is acyclic, to their hybridization number. These distance measures are essential tools for understanding reticulate evolution. Our algorithm for computing maximum acyclic agreement forests is the first depthbounded search algorithm for this problem. Our algorithms substantially outperform the best previous algorithms for these problems. © 2013 Society for Industrial and Applied Mathematics."



Stefan Grünewald,
Andreas Spillner,
Sarah Bastkowski,
Anja Bögershausen and
Vincent Moulton. SuperQ: Computing Supernetworks from Quartets. In TCBB, Vol. 10(1):151160, 2013. Keywords: abstract network, circular split system, from quartets, heuristic, phylogenetic network, phylogeny, Program QNet, Program SplitsTree, Program SuperQ, software, split network.
Toggle abstract
"Supertrees are a commonly used tool in phylogenetics to summarize collections of partial phylogenetic trees. As a generalization of supertrees, phylogenetic supernetworks allow, in addition, the visual representation of conflict between the trees that is not possible to observe with a single tree. Here, we introduce SuperQ, a new method for constructing such supernetworks (SuperQ is freely available at >www.uea.ac.uk/computing/superq.). It works by first breaking the input trees into quartet trees, and then stitching these together to form a special kind of phylogenetic network, called a split network. This stitching process is performed using an adaptation of the QNet method for split network reconstruction employing a novel approach to use the branch lengths from the input trees to estimate the branch lengths in the resulting network. Compared with previous supernetwork methods, SuperQ has the advantage of producing a planar network. We compare the performance of SuperQ to the Zclosure and Qimputation supernetwork methods, and also present an analysis of some published data sets as an illustration of its applicability. © 20042012 IEEE."



Teresa Piovesan and
Steven Kelk. A simple fixed parameter tractable algorithm for computing the hybridization number of two (not necessarily binary) trees. In TCBB, Vol. 10(1):1825, 2013. Keywords: FPT, from rooted trees, phylogenetic network, phylogeny, Program TerminusEst, reconstruction. Note: http://arxiv.org/abs/1207.6090.
Toggle abstract
"Here, we present a new fixed parameter tractable algorithm to compute the hybridization number (r) of two rooted, not necessarily binary phylogenetic trees on taxon set (X) in time ((6r r) · poly(n)), where (n= X). The novelty of this approach is its use of terminals, which are maximal elements of a natural partial order on (X), and several insights from the softwired clusters literature. This yields a surprisingly simple and practical boundedsearch algorithm and offers an alternative perspective on the underlying combinatorial structure of the hybridization number problem. © 20042012 IEEE."



Stephen J. Willson. Reconstruction of certain phylogenetic networks from their treeaverage distances. In BMB, Vol. 75(10):18401878, 2013. Keywords: explicit network, from distances, galled tree, normal network, phylogenetic network, phylogeny, unicyclic network. Note: http://www.public.iastate.edu/~swillson/TreeAverageReconPaper9.pdf.
Toggle abstract
"Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the treeaverage distances between the leaves. For a certain class of phylogenetic networks, a polynomialtime method is presented to reconstruct the network from the treeaverage distances. The method is proved to work if there is a single reticulation cycle. © 2013 Society for Mathematical Biology."



Peter J. Humphries,
Simone Linz and
Charles Semple. On the complexity of computing the temporal hybridization number for two phylogenies. In DAM, Vol. 161:871880, 2013. Keywords: agreement forest, APX hard, characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/TAFapx.pdf.
Toggle abstract
"Phylogenetic networks are now frequently used to explain the evolutionary history of a set of species for which a collection of gene trees, reconstructed from genetic material of different parts of the species' genomes, reveal inconsistencies. However, in the context of hybridization, the reconstructed networks are often not temporal. If a hybridization network is temporal, then it satisfies the time constraint of instantaneously occurring hybridization events; i.e. all species that are involved in such an event coexist in time. Furthermore, although a collection of phylogenetic trees can often be merged into a hybridization network that is temporal, many algorithms do not necessarily find such a network since their primary optimization objective is to minimize the number of hybridization events. In this paper, we present a characterization for when two rooted binary phylogenetic trees admit a temporal hybridization network. Furthermore, we show that the underlying optimization problem is APXhard and, therefore, NPhard. Thus, unless P=NP, it is unlikely that there are efficient algorithms for either computing an exact solution or approximating it within a ratio arbitrarily close to one. © 2012 Elsevier B.V. All rights reserved."



Yufeng Wu. An Algorithm for Constructing Parsimonious Hybridization Networks with Multiple Phylogenetic Trees. In RECOMB13, Vol. 7821:291303 of LNCS, springer, 2013. Keywords: explicit network, exponential algorithm, from rooted trees, phylogenetic network, phylogeny, Program PIRN, reconstruction. Note: http://www.engr.uconn.edu/~ywu/Papers/ExactNetRecomb2013.pdf.
Toggle abstract
"Phylogenetic network is a model for reticulate evolution. Hybridization network is one type of phylogenetic network for a set of discordant gene trees, and "displays" each gene tree. A central computational problem on hybridization networks is: given a set of gene trees, reconstruct the minimum (i.e. most parsimonious) hybridization network that displays each given gene tree. This problem is known to be NPhard, and existing approaches for this problem are either heuristics or make simplifying assumptions (e.g. work with only two input trees or assume some topological properties). In this paper, we develop an exact algorithm (called PIRNC ) for inferring the minimum hybridization networks from multiple gene trees. The PIRNC algorithm does not rely on structural assumptions. To the best of our knowledge, PIRN C is the first exact algorithm for this formulation. When the number of reticulation events is relatively small (say four or fewer), PIRNC runs reasonably efficient even for moderately large datasets. For building more complex networks, we also develop a heuristic version of PIRNC called PIRNCH. Simulation shows that PIRNCH usually produces networks with fewer reticulation events than those by an existing method. © 2013 SpringerVerlag."



ThiHau Nguyen,
Vincent Ranwez,
Stéphanie Pointet,
AnneMuriel Chifolleau Arigon,
JeanPhilippe Doyon and
Vincent Berry. Reconciliation and local gene tree rearrangement can be of mutual profit. In ALMOB, Vol. 8(12), 2013. Keywords: duplication, explicit network, from rooted trees, heuristic, lateral gene transfer, phylogenetic network, phylogeny, Program Mowgli, Program MowgliNNI, Program Prunier, reconstruction, software.
Toggle abstract
"Background: Reconciliation methods compare gene trees and species trees to recover evolutionary events such as duplications, transfers and losses explaining the history and composition of genomes. It is wellknown that gene trees inferred from molecular sequences can be partly erroneous due to incorrect sequence alignments as well as phylogenetic reconstruction artifacts such as long branch attraction. In practice, this leads reconciliation methods to overestimate the number of evolutionary events. Several methods have been proposed to circumvent this problem, by collapsing the unsupported edges and then resolving the obtained multifurcating nodes, or by directly rearranging the binary gene trees. Yet these methods have been defined for models of evolution accounting only for duplications and losses, i.e. can not be applied to handle prokaryotic gene families.Results: We propose a reconciliation method accounting for gene duplications, losses and horizontal transfers, that specifically takes into account the uncertainties in gene trees by rearranging their weakly supported edges. Rearrangements are performed on edges having a low confidence value, and are accepted whenever they improve the reconciliation cost. We prove useful properties on the dynamic programming matrix used to compute reconciliations, which allows to speedup the tree space exploration when rearrangements are generated by Nearest Neighbor Interchanges (NNI) edit operations. Experiments on synthetic data show that gene trees modified by such NNI rearrangements are closer to the correct simulated trees and lead to better event predictions on average. Experiments on real data demonstrate that the proposed method leads to a decrease in the reconciliation cost and the number of inferred events. Finally on a dataset of 30 k gene families, this reconciliation method shows a ranking of prokaryotic phyla by transfer rates identical to that proposed by a different approach dedicated to transfer detection [BMCBIOINF 11:324, 2010, PNAS 109(13):49624967, 2012].Conclusions: Prokaryotic gene trees can now be reconciled with their species phylogeny while accounting for the uncertainty of the gene tree. More accurate and more precise reconciliations are obtained with respect to previous parsimony algorithms not accounting for such uncertainties [LNCS 6398:93108, 2010, BIOINF 28(12): i283i291, 2012].A software implementing the method is freely available at http://www.atgcmontpellier.fr/Mowgli/. © 2013 Nguyen et al.; licensee BioMed Central Ltd."





Yun Yu,
R. Matthew Barnett and
Luay Nakhleh. Parsimonious Inference of Hybridization in the Presence of Incomplete Lineage Sorting. In Systematic Biology, Vol. 62(5):738751, 2013. Keywords: from network, from rooted trees, hybridization, lineage sorting, parsimony, phylogenetic network, phylogeny, Program PhyloNet, reconstruction.
Toggle abstract
"Hybridization plays an important evolutionary role in several groups of organisms. A phylogenetic approach to detect hybridization entails sequencing multiple loci across the genomes of a group of species of interest, reconstructing their gene trees, and taking their differences as indicators of hybridization. However, methods that follow this approach mostly ignore population effects, such as incomplete lineage sorting (ILS). Given that hybridization occurs between closely related organisms, ILS may very well be at play and, hence, must be accounted for in the analysis framework. To address this issue, we present a parsimony criterion for reconciling gene trees within the branches of a phylogenetic network, and a local search heuristic for inferring phylogenetic networks from collections of genetree topologies under this criterion. This framework enables phylogenetic analyses while accounting for both hybridization and ILS. Further, we propose two techniques for incorporating information about uncertainty in genetree estimates. Our simulation studies demonstrate the good performance of our framework in terms of identifying the location of hybridization events, as well as estimating the proportions of genes that underwent hybridization. Also, our framework shows good performance in terms of efficiency on handling large data sets in our experiments. Further, in analysing a yeast data set, we demonstrate issues that arise when analysing real data sets. Although a probabilistic approach was recently introduced for this problem, and although parsimonious reconciliations have accuracy issues under certain settings, our parsimony framework provides a much more computationally efficient technique for this type of analysis. Our framework now allows for genomewide scans for hybridization, while also accounting for ILS. [Phylogenetic networks; hybridization; incomplete lineage sorting; coalescent; multilabeled trees.] © 2013 The Author(s). All rights reserved."





Peter J. Humphries,
Simone Linz and
Charles Semple. Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies. In BMB, Vol. 75(10):18791890, 2013. Keywords: characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/CPSpaper.pdf.
Toggle abstract
"Recently, we have shown that calculating the minimumtemporalhybridization number for a set P of rooted binary phylogenetic trees is NPhard and have characterized this minimum number when P consists of exactly two trees. In this paper, we give the first characterization of the problem for P being arbitrarily large. The characterization is in terms of cherries and the existence of a particular type of sequence. Furthermore, in an online appendix to the paper, we show that this new characterization can be used to show that computing the minimumtemporal hybridization number for two trees is fixedparameter tractable. © 2013 Society for Mathematical Biology."









Gergely J. Szöllösi,
Wojciech Rosikiewicz,
Bastien Boussau,
Eric Tannier and
Vincent Daubin. Efficient Exploration of the Space of Reconciled Gene Trees. In Systematic Biology, Vol. 62(6):901912, 2013. Keywords: duplication, explicit network, lateral gene transfer, likelihood, loss, phylogeny, Program ALE, reconstruction. Note: http://arxiv.org/abs/1306.2167.
Toggle abstract
"Gene trees record the combination of genelevel events, such as duplication, transfer and loss (DTL), and specieslevel events, such as speciation and extinction. Gene treespecies tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and specieslevel events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene treespecies tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllo{double acute}si et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes we find that joint likelihoodbased inference results in a striking reduction in apparent phylogenetic discord, with respectively. 24%, 59%, and 46% reductions in the mean numbers of duplications, transfers, and losses per gene family. The open source implementation of ALE is available from https://github.com/ssolo/ALE.git. © The Author(s) 2013."





Chris Whidden. Efficient Computation and Application of Maximum Agreement Forests. PhD thesis, Dalhousie University, Canada, 2013. Keywords: agreement forest, explicit network, FPT, from rooted trees, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://hdl.handle.net/10222/35349.







Philippe Gambette and
Katharina Huber. On Encodings of Phylogenetic Networks of Bounded Level. In JOMB, Vol. 65(1):157180, 2012. Keywords: characterization, explicit network, from clusters, from rooted trees, from triplets, galled tree, identifiability, level k phylogenetic network, phylogenetic network, uniqueness, weak hierarchy. Note: http://hal.archivesouvertes.fr/hal00609130/en/.
Toggle abstract
"Phylogenetic networks have now joined phylogenetic trees in the center of phylogenetics research. Like phylogenetic trees, such networks canonically induce collections of phylogenetic trees, clusters, and triplets, respectively. Thus it is not surprising that many network approaches aim to reconstruct a phylogenetic network from such collections. Related to the wellstudied perfect phylogeny problem, the following question is of fundamental importance in this context: When does one of the above collections encode (i. e. uniquely describe) the network that induces it? For the large class of level1 (phylogenetic) networks we characterize those level1 networks for which an encoding in terms of one (or equivalently all) of the above collections exists. In addition, we show that three known distance measures for comparing phylogenetic networks are in fact metrics on the resulting subclass and give the diameter for two of them. Finally, we investigate the related concept of indistinguishability and also show that many properties enjoyed by level1 networks are not satisfied by networks of higher level. © 2011 SpringerVerlag."



Steven Kelk,
Celine Scornavacca and
Leo van Iersel. On the elusiveness of clusters. In TCBB, Vol. 9(2):517534, 2012. Keywords: explicit network, from clusters, from rooted trees, from triplets, level k phylogenetic network, phylogenetic network, phylogeny, Program Clustistic, reconstruction, software. Note: http://arxiv.org/abs/1103.1834.



Celine Scornavacca,
Simone Linz and
Benjamin Albrecht. A first step towards computing all hybridization networks for two rooted binary phylogenetic trees. In JCB, Vol. 19:12271242, 2012. Keywords: agreement forest, explicit network, FPT, from rooted trees, phylogenetic network, phylogeny, Program Dendroscope, Program Hybroscale, reconstruction. Note: http://arxiv.org/abs/1109.3268.
Toggle abstract
"Recently, considerable effort has been put into developing fast algorithms to reconstruct a rooted phylogenetic network that explains two rooted phylogenetic trees and has a minimum number of hybridization vertices. With the standard app1235roach to tackle this problem being combinatorial, the reconstructed network is rarely unique. From a biological point of view, it is therefore of importance to not only compute one network, but all possible networks. In this article, we make a first step toward approaching this goal by presenting the first algorithmcalled allMAAFsthat calculates all maximumacyclicagreement forests for two rooted binary phylogenetic trees on the same set of taxa. © Copyright 2012, Mary Ann Liebert, Inc. 2012."



Stephen J. Willson. Treeaverage distances on certain phylogenetic networks have their weights uniquely determined. In ALMOB, Vol. 7(13), 2012. Keywords: from distances, from network, normal network, phylogenetic network, phylogeny, reconstruction, tree child network. Note: hhttp://www.public.iastate.edu/~swillson/TreeAverageDis10All.pdf.
Toggle abstract
"A phylogenetic network N has vertices corresponding to species and arcs corresponding to direct genetic inheritance from the species at the tail to the species at the head. Measurements of DNA are often made on species in the leaf set, and one seeks to infer properties of the network, possibly including the graph itself. In the case of phylogenetic trees, distances between extant species are frequently used to infer the phylogenetic trees by methods such as neighborjoining.This paper proposes a treeaverage distance for networks more general than trees. The notion requires a weight on each arc measuring the genetic change along the arc. For each displayed tree the distance between two leaves is the sum of the weights along the path joining them. At a hybrid vertex, each character is inherited from one of its parents. We will assume that for each hybrid there is a probability that the inheritance of a character is from a specified parent. Assume that the inheritance events at different hybrids are independent. Then for each displayed tree there will be a probability that the inheritance of a given character follows the tree; this probability may be interpreted as the probability of the tree. The treeaverage distance between the leaves is defined to be the expected value of their distance in the displayed trees.For a class of rooted networks that includes rooted trees, it is shown that the weights and the probabilities at each hybrid vertex can be calculated given the network and the treeaverage distances between the leaves. Hence these weights and probabilities are uniquely determined. The hypotheses on the networks include that hybrid vertices have indegree exactly 2 and that vertices that are not leaves have a treechild. © 2012 Willson; licensee BioMed Central Ltd."



Steven Kelk,
Leo van Iersel,
Nela Lekic,
Simone Linz,
Celine Scornavacca and
Leen Stougie. Cycle killer... qu'estce que c'est? On the comparative approximability of hybridization number and directed feedback vertex set. In SIDMA, Vol. 26(4):16351656, 2012. Keywords: agreement forest, approximation, explicit network, from rooted trees, minimum number, phylogenetic network, phylogeny, Program CycleKiller, reconstruction. Note: http://arxiv.org/abs/1112.5359, about the title.
Toggle abstract
"We show that the problem of computing the hybridization number of two rooted binary phylogenetic trees on the same set of taxa X has a constant factor polynomialtime approximation if and only if the problem of computing a minimumsize feedback vertex set in a directed graph (DFVS) has a constant factor polynomialtime approximation. The latter problem, which asks for a minimum number of vertices to be removed from a directed graph to transform it into a directed acyclic graph, is one of the problems in Karp's seminal 1972 list of 21 NPcomplete problems. Despite considerable attention from the combinatorial optimization community, it remains to this day unknown whether a constant factor polynomialtime approximation exists for DFVS. Our result thus places the (in)approximability of hybridization number in a much broader complexity context, and as a consequence we obtain that it inherits inapproximability results from the problem Vertex Cover. On the positive side, we use results from the DFVS literature to give an O(log r log log r) approximation for the hybridization number where r is the correct value. Copyright © by SIAM."



Philippe Gambette,
Vincent Berry and
Christophe Paul. Quartets and Unrooted Phylogenetic Networks. In JBCB, Vol. 10(4):1250004, 2012. Keywords: abstract network, circular split system, explicit network, from quartets, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction, split, split network. Note: http://hal.archivesouvertes.fr/hal00678046/en/.
Toggle abstract
"Phylogenetic networks were introduced to describe evolution in the presence of exchanges of genetic material between coexisting species or individuals. Split networks in particular were introduced as a special kind of abstract network to visualize conflicts between phylogenetic trees which may correspond to such exchanges. More recently, methods were designed to reconstruct explicit phylogenetic networks (whose vertices can be interpreted as biological events) from triplet data. In this article, we link abstract and explicit networks through their combinatorial properties, by introducing the unrooted analog of levelk networks. In particular, we give an equivalence theorem between circular split systems and unrooted level1 networks. We also show how to adapt to quartets some existing results on triplets, in order to reconstruct unrooted levelk phylogenetic networks. These results give an interesting perspective on the combinatorics of phylogenetic networks and also raise algorithmic and combinatorial questions. © 2012 Imperial College Press."





Yun Yu,
James H. Degnan and
Luay Nakhleh. The probability of a gene tree topology within a phylogenetic network with applications to hybridization detection. In PLoS Genetics, Vol. 8(4):e1002660, 2012. Keywords: AIC, BIC, explicit network, hybridization, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1371/journal.pgen.1002660.
Toggle abstract
"Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa. © 2012 Yu et al."



AnChiang Chu,
Jesper Jansson,
Richard Lemence,
Alban Mancheron and
KunMao Chao. Asymptotic Limits of a New Type of Maximization Recurrence with an Application to Bioinformatics. In TAMC12, Vol. 7287:177188 of LNCS, springer, 2012. Keywords: from triplets, galled network, level k phylogenetic network, phylogenetic network. Note: preliminary version.
Toggle abstract
"We study the asymptotic behavior of a new type of maximization recurrence, defined as follows. Let k be a positive integer and p k(x) a polynomial of degree k satisfying p k(0) = 0. Define A 0 = 0 and for n ≥ 1, let A n = max 0≤i<n{A i+n kp k(i/n)}. We prove that lim n→∞A n/n n = sup{pk(x)/1x k : 0≤x<1}. We also consider two closely related maximization recurrences S n and S′ n, defined as S 0 = S′ 0 = 0, and for n ≥ 1, S n = max 0≤i<n{S i + i(ni)(ni1)/2} and S′ n = max 0≤i<n{S′ i + ( 3 ni) + 2i( 2 ni) + (ni)( 2 i)}. We prove that lim n→∞ S′n/3( 3 n) = 2(√31)/3 ≈ 0.488033..., resolving an open problem from Bioinformatics about rooted triplets consistency in phylogenetic networks. © 2012 SpringerVerlag."



Tetsuo Asano,
Jesper Jansson,
Kunihiko Sadakane,
Ryuhei Uehara and
Gabriel Valiente. Faster computation of the Robinson–Foulds distance between phylogenetic networks. In Information Sciences, Vol. 197:7790, 2012. Keywords: distance between networks, explicit network, level k phylogenetic network, phylogenetic network, polynomial, spread.
Toggle abstract
"The RobinsonFoulds distance, a widely used metric for comparing phylogenetic trees, has recently been generalized to phylogenetic networks. Given two phylogenetic networks N 1, N 2 with n leaf labels and at most m nodes and e edges each, the RobinsonFoulds distance measures the number of clusters of descendant leaves not shared by N 1 and N 2. The fastest known algorithm for computing the RobinsonFoulds distance between N 1 and N 2 runs in O(me) time. In this paper, we improve the time complexity to O(ne/log n) for general phylogenetic networks and O(nm/log n) for general phylogenetic networks with bounded degree (assuming the word RAM model with a word length of ⌈logn⌉ bits), and to optimal O(m) time for leafouterplanar networks as well as optimal O(n) time for level1 phylogenetic networks (that is, galledtrees). We also introduce the natural concept of the minimum spread of a phylogenetic network and show how the running time of our new algorithm depends on this parameter. As an example, we prove that the minimum spread of a levelk network is at most k + 1, which implies that for one level1 and one levelk phylogenetic network, our algorithm runs in O((k + 1)e) time. © 2012 Elsevier Inc. All rights reserved."



Michel Habib and
ThuHien To. Constructing a Minimum Phylogenetic Network from a Dense Triplet Set. In JBCB, Vol. 10(5):1250013, 2012. Keywords: explicit network, from triplets, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/1103.2266.
Toggle abstract
"For a given set L of species and a set T of triplets on L, we seek to construct a phylogenetic network which is consistent with T i.e. which represents all triplets of T. The level of a network is defined as the maximum number of hybrid vertices in its biconnected components. When T is dense, there exist polynomial time algorithms to construct level0,1 and 2 networks (Aho et al., 1981; Jansson, Nguyen and Sung, 2006; Jansson and Sung, 2006; Iersel et al., 2009). For higher levels, partial answers were obtained in the paper by Iersel and Kelk (2008), with a polynomial time algorithm for simple networks. In this paper, we detail the first complete answer for the general case, solving a problem proposed in Jansson and Sung (2006) and Iersel et al. (2009). For any k fixed, it is possible to construct a levelk network having the minimum number of hybrid vertices and consistent with T, if there is any, in time O(T k+1 n⌊4k/3⌋+1). © 2012 Imperial College Press."





Devin Robert Bickner. On normal networks. PhD thesis, Iowa State University, U.S.A., 2012. Keywords: distance between networks, explicit network, from network, from trees, normal network, phylogenetic network, phylogeny, polynomial, reconstruction, SPR distance. Note: http://gradworks.umi.com/3511361.pdf.





Adrià Alcalà Mena. Trivalent Graph isomorphism in polynomial time. Master's thesis, Universidad de Cantabria, Spain, 2012. Keywords: distance between networks, explicit network, from network, isomorphism, phylogenetic network, phylogeny, polynomial, Program SAGE. Note: http://arxiv.org/abs/1209.1040.



Joseph K. Pickrell and
Jonathan K. Pritchard. Inference of Population Splits and Mixtures from GenomeWide Allele Frequency Data. In PLoS Genetics, Vol. 8(11):e1002967, 2012. Keywords: explicit network, heuristic, likelihood, phylogenetic network, phylogeny, population genetics, Program TreeMix. Note: http://dx.doi.org/10.1371/journal.pgen.1002967.
Toggle abstract
"Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genomewide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and "ancient" Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com. © 2012 Pickrell, Pritchard."



Katharina Huber,
Vincent Moulton,
Andreas Spillner,
Sabine Storandt and
Radoslaw Suchecki. Computing a consensus of multilabeled trees. In ALENEX12, Pages 8492, 2012. Keywords: duplication, explicit network, exponential algorithm, phylogenetic network, phylogeny. Note: http://siam.omnibooksonline.com/2012ALENEX/data/papers/020.pdf.
Toggle abstract
In this paper we consider two challenging problems that arise in the context of computing a consensus of a collection of multilabeled trees, namely (1) selecting a compatible collection of clusters on a multiset from an ordered list of such clusters and (2) optimally refining high degree vertices in a multilabeled tree. Forming such a consensus is part of an approach to reconstruct the evolutionary history of a set of species for which events such as genome duplication and hybridization have occurred in the past. We present exact algorithms for solving (1) and (2) that have an exponential runtime in the worst case. To give some impression of their performance in practice, we apply them to simulated input and to a real biological data set highlighting the impact of several structural properties of the input on the performance.



ZhiZhong Chen,
Fei Deng and
Lusheng Wang. Simultaneous Identification of Duplications, Losses, and Lateral Gene Transfers. In TCBB, Vol. 9(5):15151528, 2012. Keywords: duplication, explicit network, FPT, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, reconstruction. Note: http://www.cs.cityu.edu.hk/~lwang/research/tcbb2012c.pdf.
Toggle abstract
"We give a fixedparameter algorithm for the problem of enumerating all minimumcost LCAreconciliations involving gene duplications, gene losses, and lateral gene transfers (LGTs) for a given species tree S and a given gene tree G. Our algorithm can work for the weighted version of the problem, where the costs of a gene duplication, a gene loss, and an LGT are left to the user's discretion. The algorithm runs in O(m+3 k/c n) time, where m is the number of vertices in S, n is the number of vertices in G, c is the smaller between a gene duplication cost and an LGT cost, and k is the minimum cost of an LCAreconciliation between S and G. The time complexity is indeed better if the cost of a gene loss is greater than 0. In particular, when the cost of a gene loss is at least 0.614c, the running time of the algorithm is O(m+2.78 k/cn). © 20042012 IEEE."



Ali Tofigh,
Mike Hallett and
Jens Lagergren. Simultaneous Identification of Duplications and Lateral Gene Transfers. In TCBB, Vol. 8(2):517535, 2011. Keywords: duplication, explicit network, FPT, from rooted trees, from species tree, lateral gene transfer, loss, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1109/TCBB.2010.14.
Toggle abstract
"The incongruency between a gene tree and a corresponding species tree can be attributed to evolutionary events such as gene duplication and gene loss. This paper describes a combinatorial model where socalled DTLscenarios are used to explain the differences between a gene tree and a corresponding species tree taking into account gene duplications, gene losses, and lateral gene transfers (also known as horizontal gene transfers). The reasonable biological constraint that a lateral gene transfer may only occur between contemporary species leads to the notion of acyclic DTLscenarios. Parsimony methods are introduced by defining appropriate optimization problems. We show that finding most parsimonious acyclic DTLscenarios is NPhard. However, by dropping the condition of acyclicity, the problem becomes tractable, and we provide a dynamic programming algorithm as well as a fixedparameter tractable algorithm for finding most parsimonious DTLscenarios. © 2011 IEEE."



Josh Voorkamp né Collins,
Simone Linz and
Charles Semple. Quantifying hybridization in realistic time. In JCB, Vol. 18(10):13051318, 2011. Keywords: explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, software. Note: http://wwwcsif.cs.ucdavis.edu/~linzs/CLS10_interleave.pdf, software available at http://www.math.canterbury.ac.nz/~c.semple/software.shtml.
Toggle abstract
"Recently, numerous practical and theoretical studies in evolutionary biology aim at calculating the extent to which reticulationfor example, horizontal gene transfer, hybridization, or recombinationhas influenced the evolution for a set of presentday species. It has been shown that inferring the minimum number of hybridization events that is needed to simultaneously explain the evolutionary history for a set of trees is an NPhard and also fixedparameter tractable problem. In this article, we give a new fixedparameter algorithm for computing the minimum number of hybridization events for when two rooted binary phylogenetic trees are given. This newly developed algorithm is based on interleavinga technique using repeated kernelization steps that are applied throughout the exhaustive search part of a fixedparameter algorithm. To show that our algorithm runs efficiently to be applicable to a wide range of practical problem instances, we apply it to a grass data set and highlight the significant improvements in terms of running times in comparison to an algorithm that has previously been implemented. © 2011, Mary Ann Liebert, Inc."



Leo van Iersel and
Steven Kelk. Constructing the Simplest Possible Phylogenetic Network from Triplets. In ALG, Vol. 60(2):207235, 2011. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, minimum number, phylogenetic network, phylogeny, polynomial, Program Marlon, Program Simplistic. Note: http://dx.doi.org/10.1007/s0045300993330.
Toggle abstract
"A phylogenetic network is a directed acyclic graph that visualizes an evolutionary history containing socalled reticulations such as recombinations, hybridizations or lateral gene transfers. Here we consider the construction of a simplest possible phylogenetic network consistent with an input set T, where T contains at least one phylogenetic tree on three leaves (a triplet) for each combination of three taxa. To quantify the complexity of a network we consider both the total number of reticulations and the number of reticulations per biconnected component, called the level of the network. We give polynomialtime algorithms for constructing a level1 respectively a level2 network that contains a minimum number of reticulations and is consistent with T (if such a network exists). In addition, we show that if T is precisely equal to the set of triplets consistent with some network, then we can construct such a network with smallest possible level in time O(T k+1), if k is a fixed upper bound on the level of the network. © 2009 The Author(s)."



JeanPhilippe Doyon,
Celine Scornavacca,
Konstantin Yu Gorbunov,
Gergely J. Szöllösi,
Vincent Ranwez and
Vincent Berry. An efficient algorithm for gene/species trees parsimonious reconciliation with losses, duplications, and transfers. In Proceedings of the Eighth RECOMB Comparative Genomics Satellite Workshop (RECOMBCG'10), Vol. 6398:93108 of LNCS, springer, 2011. Keywords: branch length, duplication, dynamic programming, explicit network, from multilabeled tree, from species tree, from unrooted trees, lateral gene transfer, loss, phylogenetic network, phylogeny, polynomial, Program Mowgli, reconstruction. Note: http://www.lirmm.fr/~vberry/Publis/MPRDoyonEtAl.pdf, software available at http://www.atgcmontpellier.fr/MPR/.
Toggle abstract
"Tree reconciliation methods aim at estimating the evolutionary events that cause discrepancy between gene trees and species trees. We provide a discrete computational model that considers duplications, transfers and losses of genes. The model yields a fast and exact algorithm to infer time consistent and most parsimonious reconciliations. Then we study the conditions under which parsimony is able to accurately infer such events. Overall, it performs well even under realistic rates, transfers being in general less accurately recovered than duplications. An implementation is freely available at http://www.atgc montpellier.fr/MPR. © 2010 SpringerVerlag."



Leo van Iersel and
Steven Kelk. When two trees go to war. In JTB, Vol. 269(1):245255, 2011. Keywords: APX hard, explicit network, from clusters, from rooted trees, from sequences, from triplets, level k phylogenetic network, minimum number, NP complete, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/1004.5332.
Toggle abstract
"Rooted phylogenetic networks are used to model nontreelike evolutionary histories. Such networks are often constructed by combining trees, clusters, triplets or characters into a single network that in some welldefined sense simultaneously represents them all. We review these four models and investigate how they are related. Motivated by the parsimony principle, one often aims to construct a network that contains as few reticulations (nontreelike evolutionary events) as possible. In general, the model chosen influences the minimum number of reticulation events required. However, when one obtains the input data from two binary (i.e. fully resolved) trees, we show that the minimum number of reticulations is independent of the model. The number of reticulations necessary to represent the trees, triplets, clusters (in the softwired sense) and characters (with unrestricted multiple crossover recombination) are all equal. Furthermore, we show that these results also hold when not the number of reticulations but the level of the constructed network is minimised. We use these unification results to settle several computational complexity questions that have been open in the field for some time. We also give explicit examples to show that already for data obtained from three binary trees the models begin to diverge. © 2010 Elsevier Ltd."





Lavanya Kannan,
Hua Li and
Arcady Mushegian. A PolynomialTime Algorithm Computing Lower and Upper Bounds of the Rooted Subtree Prune and Regraft Distance. In JCB, Vol. 18(5):743757, 2011. Keywords: bound, minimum number, polynomial, SPR distance. Note: http://dx.doi.org/10.1089/cmb.2010.0045.
Toggle abstract
"Rooted, leaflabeled trees are used in biology to represent hierarchical relationships of various entities, most notably the evolutionary history of molecules and organisms. Rooted Subtree Prune and Regraft (rSPR) operation is a tree rearrangement operation that is used to transform a tree into another tree that has the same set of leaf labels. The minimum number of rSPR operations that transform one tree into another is denoted by drSPR and gives a measure of dissimilarity between the trees, which can be used to compare trees obtained by different approaches, or, in the context of phylogenetic analysis, to detect horizontal gene transfer events by finding incongruences between trees of different evolving characters. The problem of computing the exact d rSPR measure is NPhard, and most algorithms resort to finding sequences of rSPR operations that are sufficient for transforming one tree into another, thereby giving upper bound heuristics for the distance. In this article, we present an O(n4) recursive algorithm DClust that gives both lower bound and upper bound heuristics for the distance between trees with n shared leaves and also gives a sequence of operations that transforms one tree into another. Our experiments on simulated pairs of trees containing up to 100 leaves showed that the two bounds are almost equal for small distances, thereby giving the nearlyprecise actual value, and that the upper bound tends to be close to the upper bounds given by other approaches for all pairs of trees. © Copyright 2011, Mary Ann Liebert, Inc. 2011."







Jaroslaw Byrka,
Pawel Gawrychowski,
Katharina Huber and
Steven Kelk. Worstcase optimal approximation algorithms for maximizing triplet consistency within phylogenetic networks. In Journal of Discrete Algorithms, Vol. 8(1):6575, 2010. Keywords: approximation, explicit network, from triplets, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/0710.3258.
Toggle abstract
"The study of phylogenetic networks is of great interest to computational evolutionary biology and numerous different types of such structures are known. This article addresses the following question concerning rooted versions of phylogenetic networks. What is the maximum value of p ∈ [0, 1] such that for every input set T of rooted triplets, there exists some network N such that at least p  T  of the triplets are consistent with N? We call an algorithm that computes such a network (where p is maximum) worstcase optimal. Here we prove that the set containing all triplets (the full triplet set) in some sense defines p. Moreover, given a network N that obtains a fraction p′ for the full triplet set (for any p′), we show how to efficiently modify N to obtain a fraction ≥ p′ for any given triplet set T. We demonstrate the power of this insight by presenting a worstcase optimal result for level1 phylogenetic networks improving considerably upon the 5/12 fraction obtained recently by Jansson, Nguyen and Sung. For level2 phylogenetic networks we show that p ≥ 0.61. We emphasize that, because we are taking  T  as a (trivial) upper bound on the size of an optimal solution for each specific input T, the results in this article do not exclude the existence of approximation algorithms that achieve approximation ratio better than p. Finally, we note that all the results in this article also apply to weighted triplet sets. © 2009 Elsevier B.V. All rights reserved."



ZhiZhong Chen and
Lusheng Wang. HybridNET: a tool for constructing hybridization networks. In BIO, Vol. 26(22):29122913, 2010. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNET, software. Note: http://rnc.r.dendai.ac.jp/~chen/papers/note2.pdf.
Toggle abstract
"Motivations: When reticulation events occur, the evolutionary history of a set of existing species can be represented by a hybridization network instead of an evolutionary tree. When studying the evolutionary history of a set of existing species, one can obtain a phylogenetic tree of the set of species with high confidence by looking at a segment of sequences or a set of genes. When looking at another segment of sequences, a different phylogenetic tree can be obtained with high confidence too. This indicates that reticulation events may occur. Thus, we have the following problem: given two rooted phylogenetic trees on a set of species that correctly represent the treelike evolution of different parts of their genomes, what is the hybridization network with the smallest number of reticulation events to explain the evolution of the set of species under consideration? Results: We develop a program, named HybridNet, for constructing a hybridization network with the minimum number of reticulate vertices from two input trees. We first implement the O(3dn)time algorithm by Whidden et al. for computing a maximum (acyclic) agreement forest. Our program can output all the maximum (acyclic) agreement forests. We then augment the program so that it can construct an optimal hybridization network for each given maximum acyclic agreement forest. To our knowledge, this is the first time that optimal hybridization networks can be rapidly constructed. © The Author 2010. Published by Oxford University Press. All rights reserved."



Philippe Gambette. Méthodes combinatoires de reconstruction de réseaux phylogénétiques. PhD thesis, Université Montpellier 2, France, 2010. Keywords: abstract network, characterization, circular split system, explicit network, FPT, from clusters, from triplets, integer linear programming, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, Program Dendroscope, pyramid, reconstruction, split network, weak hierarchy. Note: http://tel.archivesouvertes.fr/tel00608342/en/.



Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. Path lengths in treechild time consistent hybridization networks. In Information Sciences, Vol. 180(3):366383, 2010. Keywords: distance between networks, phylogenetic network, phylogeny, time consistent network, tree child network. Note: http://arxiv.org/abs/0807.0087?context=cs.CE.
Toggle abstract
"Hybridization networks are representations of evolutionary histories that allow for the inclusion of reticulate events like recombinations, hybridizations, or lateral gene transfers. The recent growth in the number of hybridization network reconstruction algorithms has led to an increasing interest in the definition of metrics for their comparison that can be used to assess the accuracy or robustness of these methods. In this paper we establish some basic results that make it possible the generalization to treechild time consistent (TCTC) hybridization networks of some of the oldest known metrics for phylogenetic trees: those based on the comparison of the vectors of path lengths between leaves. More specifically, we associate to each hybridization network a suitably defined vector of 'splitted' path lengths between its leaves, and we prove that if two TCTC hybridization networks have the same such vectors, then they must be isomorphic. Thus, comparing these vectors by means of a metric for realvalued vectors defines a metric for TCTC hybridization networks. We also consider the case of fully resolved hybridization networks, where we prove that simpler, 'nonsplitted' vectors can be used. © 2009 Elsevier Inc. All rights reserved."



Stephen J. Willson. Regular Networks Can Be Uniquely Constructed from Their Trees. In TCBB, Vol. 8(3):785796, 2010. Keywords: explicit network, from rooted trees, phylogenetic network, phylogeny, reconstruction, regular network. Note: http://www.public.iastate.edu/~swillson/RegularNetsFromTrees5.pdf.
Toggle abstract
"A rooted acyclic digraph N with labeled leaves displays a tree T when there exists a way to select a unique parent of each hybrid vertex resulting in the tree T. Let Tr(N) denote the set of all trees displayed by the network N. In general, there may be many other networks M, such that Tr(M) = Tr(N). A network is regular if it is isomorphic with its cover digraph. If N is regular and D is a collection of trees displayed by N, this paper studies some procedures to try to reconstruct N given D. If the input is D=Tr(N), one procedure is described, which will reconstruct N. Hence, if N and M are regular networks and Tr(N) = Tr(M), it follows that N = M, proving that a regular network is uniquely determined by its displayed trees. If D is a (usually very much smaller) collection of displayed trees that satisfies certain hypotheses, modifications of the procedure will still reconstruct N given D. © 2011 IEEE."



Stephen J. Willson. Properties of normal phylogenetic networks. In BMB, Vol. 72(2):340358, 2010. Keywords: normal network, phylogenetic network, phylogeny, regular network. Note: http://www.public.iastate.edu/~swillson/RestrictionsOnNetworkspap9.pdf, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0904/willson/.
Toggle abstract
"A phylogenetic network is a rooted acyclic digraph with vertices corresponding to taxa. Let X denote a set of vertices containing the root, the leaves, and all vertices of outdegree 1. Regard X as the set of vertices on which measurements such as DNA can be made. A vertex is called normal if it has one parent, and hybrid if it has more than one parent. The network is called normal if it has no redundant arcs and also from every vertex there is a directed path to a member of X such that all vertices after the first are normal. This paper studies properties of normal networks. Under a simple model of inheritance that allows homoplasies only at hybrid vertices, there is essentially unique determination of the genomes at all vertices by the genomes at members of X if and only if the network is normal. This model is a limiting case of more standard models of inheritance when the substitution rate is sufficiently low. Various mathematical properties of normal networks are described. These properties include that the number of vertices grows at most quadratically with the number of leaves and that the number of hybrid vertices grows at most linearly with the number of leaves. © 2009 Society for Mathematical Biology."



Leo van Iersel,
Steven Kelk,
Regula Rupp and
Daniel H. Huson. Phylogenetic Networks Do not Need to Be Complex: Using Fewer Reticulations to Represent Conflicting Clusters. In ISMB10, Vol. 26(12):i124i131 of BIO, 2010. Keywords: from clusters, level k phylogenetic network, Program Dendroscope, Program HybridInterleave, Program HybridNumber, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/btq202, with proofs: http://arxiv.org/abs/0910.3082.
Toggle abstract
"Phylogenetic trees are widely used to display estimates of how groups of species are evolved. Each phylogenetic tree can be seen as a collection of clusters, subgroups of the species that evolved from a common ancestor. When phylogenetic trees are obtained for several datasets (e.g. for different genes), then their clusters are often contradicting. Consequently, the set of all clusters of such a dataset cannot be combined into a single phylogenetic tree. Phylogenetic networks are a generalization of phylogenetic trees that can be used to display more complex evolutionary histories, including reticulate events, such as hybridizations, recombinations and horizontal gene transfers. Here, we present the new CASS algorithm that can combine any set of clusters into a phylogenetic network. We show that the networks constructed by CASS are usually simpler than networks constructed by other available methods. Moreover, we show that CASS is guaranteed to produce a network with at most two reticulations per biconnected component, whenever such a network exists. We have implemented CASS and integrated it into the freely available Dendroscope software. Contact: l.j.j.v.iersel@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author(s) 2010. Published by Oxford University Press."



Simone Linz,
Charles Semple and
Tanja Stadler. Analyzing and reconstructing reticulation networks under timing constraints. In JOMB, Vol. 61(5):715737, 2010. Keywords: explicit network, from rooted trees, hybridization, lateral gene transfer, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://dx.doi.org/10.1007/s002850090319y..
Toggle abstract
"Reticulation networks are now frequently used to model the history of life for various groups of species whose evolutionary past is likely to include reticulation events such as horizontal gene transfer or hybridization. However, the reconstructed networks are rarely guaranteed to be temporal. If a reticulation network is temporal, then it satisfies the two biologically motivated timing constraints of instantaneously occurring reticulation events and successively occurring speciation events. On the other hand, if a reticulation network is not temporal, it is always possible to make it temporal by adding a number of additional unsampled or extinct taxa. In the first half of the paper, we show that deciding whether a given number of additional taxa is sufficient to transform a nontemporal reticulation network into a temporal one is an NPcomplete problem. As one is often given a set of gene trees instead of a network in the context of hybridization, this motivates the second half of the paper which provides an algorithm, called TemporalHybrid, for reconstructing a temporal hybridization network that simultaneously explains the ancestral history of two trees or indicates that no such network exists. We further derive two methods to decide whether or not a temporal hybridization network exists for two given trees and illustrate one of the methods on a grass data set. © 2009 The Author(s)."



Tetsuo Asano,
Jesper Jansson,
Kunihiko Sadakane,
Ryuhei Uehara and
Gabriel Valiente. Faster Computation of the RobinsonFoulds Distance between Phylogenetic Networks. In CPM10, Vol. 6129:190201 of LNCS, springer, 2010. Keywords: distance between networks, explicit network, level k phylogenetic network, phylogenetic network, polynomial, spread. Note: http://hdl.handle.net/10119/9859, slides available at http://cs.nyu.edu/parida/CPM2010/MainPage_files/18.pdf.
Toggle abstract
"The RobinsonFoulds distance, which is the most widely used metric for comparing phylogenetic trees, has recently been generalized to phylogenetic networks. Given two networks N1,N2 with n leaves, m nodes, and e edges, the RobinsonFoulds distance measures the number of clusters of descendant leaves that are not shared by N1 and N2. The fastest known algorithm for computing the RobinsonFoulds distance between those networks runs in O(m(m + e)) time. In this paper, we improve the time complexity to O(n(m+ e)/ log n) for general networks and O(nm/log n) for general networks with bounded degree, and to optimal O(m + e) time for planar phylogenetic networks and boundedlevel phylogenetic networks.We also introduce the natural concept of the minimum spread of a phylogenetic network and show how the running time of our new algorithm depends on this parameter. As an example, we prove that the minimum spread of a levelk phylogenetic network is at most k + 1, which implies that for two levelk phylogenetic networks, our algorithm runs in O((k + 1)(m + e)) time. © SpringerVerlag Berlin Heidelberg 2010."



Yufeng Wu and
Jiayin Wang. Fast Computation of the Exact Hybridization Number of Two Phylogenetic Trees. In ISBRA10, Vol. 6053:203214 of LNCS, springer, 2010. Keywords: agreement forest, explicit network, from rooted trees, hybridization, integer linear programming, minimum number, phylogenetic network, phylogeny, Program HybridNumber, Program SPRDist, SPR distance. Note: http://www.engr.uconn.edu/~ywu/Papers/ISBRA10WuWang.pdf.
Toggle abstract
"Hybridization is a reticulate evolutionary process. An established problem on hybridization is computing the minimum number of hybridization events, called the hybridization number, needed in the evolutionary history of two phylogenetic trees. This problem is known to be NPhard. In this paper, we present a new practical method to compute the exact hybridization number. Our approach is based on an integer linear programming formulation. Simulation results on biological and simulated datasets show that our method (as implemented in program SPRDist) is more efficient and robust than an existing method. © 2010 SpringerVerlag Berlin Heidelberg."



Miguel Arenas,
Mateus Patricio,
David Posada and
Gabriel Valiente. Characterization of Phylogenetic Networks with NetTest. In BMCB, Vol. 11:268, 2010. Keywords: explicit network, galled tree, phylogenetic network, Program NetTest, software, time consistent network, tree child network, tree sibling network, visualization. Note: http://dx.doi.org/10.1186/1471210511268, software available at http://darwin.uvigo.es/software/nettest/.
Toggle abstract
"Background: Typical evolutionary events like recombination, hybridization or gene transfer make necessary the use of phylogenetic networks to properly depict the evolution of DNA and protein sequences. Although several theoretical classes have been proposed to characterize these networks, they make stringent assumptions that will likely not be met by the evolutionary process. We have recently shown that the complexity of simulated networks is a function of the population recombination rate, and that at moderate and large recombination rates the resulting networks cannot be categorized. However, we do not know whether these results extend to networks estimated from real data.Results: We introduce a web server for the categorization of explicit phylogenetic networks, including the most relevant theoretical classes developed so far. Using this tool, we analyzed statistical parsimony phylogenetic networks estimated from ~5,000 DNA alignments, obtained from the NCBI PopSet and Polymorphix databases. The level of characterization was correlated to nucleotide diversity, and a high proportion of the networks derived from these data sets could be formally characterized.Conclusions: We have developed a public web server, NetTest (freely available from the software section at http://darwin.uvigo.es), to formally characterize the complexity of phylogenetic networks. Using NetTest we found that most statistical parsimony networks estimated with the program TCS could be assigned to a known network class. The level of network characterization was correlated to nucleotide diversity and dependent upon the intra/interspecific levels, although no significant differences were detected among genes. More research on the properties of phylogenetic networks is clearly needed. © 2010 Arenas et al; licensee BioMed Central Ltd."



Leo van Iersel,
Charles Semple and
Mike Steel. Locating a tree in a phylogenetic network. In IPL, Vol. 110(23), 2010. Keywords: cluster containment, explicit network, from network, level k phylogenetic network, normal network, NP complete, phylogenetic network, polynomial, regular network, time consistent network, tree child network, tree containment, tree sibling network. Note: http://arxiv.org/abs/1006.3122.
Toggle abstract
"Phylogenetic trees and networks are leaflabelled graphs that are used to describe evolutionary histories of species. The Tree Containment problem asks whether a given phylogenetic tree is embedded in a given phylogenetic network. Given a phylogenetic network and a cluster of species, the Cluster Containment problem asks whether the given cluster is a cluster of some phylogenetic tree embedded in the network. Both problems are known to be NPcomplete in general. In this article, we consider the restriction of these problems to several wellstudied classes of phylogenetic networks. We show that Tree Containment is polynomialtime solvable for normal networks, for binary treechild networks, and for levelk networks. On the other hand, we show that, even for treesibling, timeconsistent, regular networks, both Tree Containment and Cluster Containment remain NPcomplete. © 2010 Elsevier B.V. All rights reserved."



Sophie Abby,
Eric Tannier,
Manolo Gouy and
Vincent Daubin. Detecting lateral gene transfers by statistical reconciliation of phylogenetic forests. In BMCB, Vol. 11:324, 2010. Keywords: explicit network, from rooted trees, from species tree, heuristic, lateral gene transfer, phylogenetic network, phylogeny, Program EEEP, Program PhyloNet, Program Prunier, reconstruction, software. Note: http://www.biomedcentral.com/14712105/11/324.
Toggle abstract
"Background: To understand the evolutionary role of Lateral Gene Transfer (LGT), accurate methods are needed to identify transferred genes and infer their timing of acquisition. Phylogenetic methods are particularly promising for this purpose, but the reconciliation of a gene tree with a reference (species) tree is computationally hard. In addition, the application of these methods to real data raises the problem of sorting out real and artifactual phylogenetic conflict.Results: We present Prunier, a new method for phylogenetic detection of LGT based on the search for a maximum statistical agreement forest (MSAF) between a gene tree and a reference tree. The program is flexible as it can use any definition of "agreement" among trees. We evaluate the performance of Prunier and two other programs (EEEP and RIATAHGT) for their ability to detect transferred genes in realistic simulations where gene trees are reconstructed from sequences. Prunier proposes a single scenario that compares to the other methods in terms of sensitivity, but shows higher specificity. We show that LGT scenarios carry a strong signal about the position of the root of the species tree and could be used to identify the direction of evolutionary time on the species tree. We use Prunier on a biological dataset of 23 universal proteins and discuss their suitability for inferring the tree of life.Conclusions: The ability of Prunier to take into account branch support in the process of reconciliation allows a gain in complexity, in comparison to EEEP, and in accuracy in comparison to RIATAHGT. Prunier's greedy algorithm proposes a single scenario of LGT for a gene family, but its quality always compares to the best solutions provided by the other algorithms. When the root position is uncertain in the species tree, Prunier is able to infer a scenario per root at a limited additional computational cost and can easily run on large datasets.Prunier is implemented in C++, using the Bio++ library and the phylogeny program Treefinder. It is available at: http://pbil.univlyon1.fr/software/prunier. © 2010 Abby et al; licensee BioMed Central Ltd."





Chris Whidden,
Robert G. Beiko and
Norbert Zeh. Fast FPT Algorithms for Computing Rooted Agreement Forests: Theory and Experiments. In Proceedings of the ninth International Symposium on Experimental Algorithms (SEA'10), Vol. 6049:141153 of LNCS, springer, 2010. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: https://www.cs.dal.ca/sites/default/files/technical_reports/CS201003.pdf.
Toggle abstract
"We improve on earlier FPT algorithms for computing a rooted maximum agreement forest (MAF) or a maximum acyclic agreement forest (MAAF) of a pair of phylogenetic trees. Their sizes give the subtreepruneandregraft (SPR) distance and the hybridization number of the trees, respectively. We introduce new branching rules that reduce the running time of the algorithms from O(3 kn) and O(3 kn log n) to O(2.42 kn) and O(2.42 kn log n), respectively. In practice, the speed up may be much more than predicted by the worstcase analysis.We confirm this intuition experimentally by computing MAFs for simulated trees and trees inferred from protein sequence data. We show that our algorithm is orders of magnitude faster and can handle much larger trees and SPR distances than the best previous methods, treeSAT and sprdist. © SpringerVerlag Berlin Heidelberg 2010."





Marta Melé,
Asif Javed,
Marc Pybus,
Francesc Calafell,
Laxmi Parida,
Jaume Bertranpetit and
Genographic Consortium. A New Method to Reconstruct Recombination Events at a Genomic Scale. In PLoS Computational Biology, Vol. 6(11):e1001010, 2010. Keywords: explicit network, from sequences, phylogenetic network, phylogeny. Note: http://dx.doi.org/10.1371/journal.pcbi.1001010.
Toggle abstract
"Recombination is one of the main forces shaping genome diversity, but the information it generates is often overlooked. A recombination event creates a junction between two parental sequences that may be transmitted to the subsequent generations. Just like mutations, these junctions carry evidence of the shared past of the sequences. We present the IRiS algorithm, which detects past recombination events from extant sequences and specifies the place of each recombination and which are the recombinants sequences. We have validated and calibrated IRiS for the human genome using coalescent simulations replicating standard human demographic history and a variable recombination rate model, and we have finetuned IRiS parameters to simultaneously optimize for false discovery rate, sensitivity, and accuracy in placing the recombination events in the sequence. Newer recombinations overwrite traces of past ones and our results indicate more recent recombinations are detected by IRiS with greater sensitivity. IRiS analysis of the MS32 region, previously studied using sperm typing, showed good concordance with estimated recombination rates. We also applied IRiS to haplotypes for 18 Xchromosome regions in HapMap Phase 3 populations. Recombination events detected for each individual were recoded as binary allelic states and combined into recotypes. Principal component analysis and multidimensional scaling based on recotypes reproduced the relationships between the eleven HapMap Phase III populations that can be expected from known human population history, thus further validating IRiS. We believe that our new method will contribute to the study of the distribution of recombination events across the genomes and, for the first time, it will allow the use of recombination as genetic marker to study human genetic variation. © 2010 Mele ́ et al."



Gabriel Cardona,
Francesc Rosselló and
Gabriel Valiente. Comparison of treechild phylogenetic networks. In TCBB, Vol. 6(4):552569, 2009. Keywords: explicit network, phylogenetic network, phylogeny, Program Bio PhyloNetwork, Program PhyloNetwork, tree child network, tree sibling network. Note: http://arxiv.org/abs/0708.3499.
Toggle abstract
"Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of nontreelike evolutionary events, like recombination, hybridization, or lateral gene transfer. While much progress has been made to find practical algorithms for reconstructing a phylogenetic network from a set of sequences, all attempts to endorse a class of phylogenetic networks (strictly extending the class of phylogenetic trees) with a wellfounded distance measure have, to the best of our knowledge and with the only exception of the bipartition distance on regular networks, failed so far. In this paper, we present and study a new meaningful class of phylogenetic networks, called treechild phylogenetic networks, and we provide an injective representation of these networks as multisets of vectors of natural numbers, their path multiplicity vectors. We then use this representation to define a distance on this class that extends the wellknown RobinsonFoulds distance for phylogenetic trees and to give an alignment method for pairs of networks in this class. Simple polynomial algorithms for reconstructing a treechild phylogenetic network from its path multiplicity vectors, for computing the distance between two treechild phylogenetic networks and for aligning a pair of treechild phylogenetic networks, are provided. They have been implemented as a Perl package and a Java applet, which can be found at http://bioinfo.uib.es/~recerca/ phylonetworks/mudistance/. © 2009 IEEE."



Ulrik Brandes and
Sabine Cornelsen. Phylogenetic Graph Models Beyond Trees. In DAM, Vol. 157(10):23612369, 2009. Keywords: abstract network, cactus graph, from splits, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.inf.unikonstanz.de/~cornelse/Papers/bcpgmbt07.pdf.
Toggle abstract
"A graph model for a set S of splits of a set X consists of a graph and a map from X to the vertices of the graph such that the inclusionminimal cuts of the graph represent S. Phylogenetic trees are graph models in which the graph is a tree. We show that the model can be generalized to a cactus (i.e. a tree of edges and cycles) without losing computational efficiency. A cactus can represent a quadratic rather than linear number of splits in linear space. We show how to decide in linear time in the size of a succinct representation of S whether a set of splits has a cactus model, and if so construct it within the same time bounds. As a byproduct, we show how to construct the subset of all compatible splits and a maximal compatible set of splits in linear time. Note that it is N Pcomplete to find a compatible subset of maximum size. Finally, we briefly discuss further generalizations of tree models. © 2008 Elsevier B.V. All rights reserved."



Leo van Iersel,
Steven Kelk and
Matthias Mnich. Uniqueness, intractability and exact algorithms: reflections on levelk phylogenetic networks. In JBCB, Vol. 7(4):597623, 2009. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, reconstruction, uniqueness. Note: http://arxiv.org/pdf/0712.2932v2.





Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. Metrics for phylogenetic networks I: Generalizations of the RobinsonFoulds metric. In TCBB, Vol. 6(1):4661, 2009. Keywords: distance between networks, explicit network, phylogenetic network, phylogeny, time consistent network, tree child network, tripartition distance. Note: http://dx.doi.org/10.1109/TCBB.2008.70.
Toggle abstract
"The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the first in a series of papers devoted to the analysis and comparison of metrics for treechild time consistent phylogenetic networks on the same set of taxa. In this paper, we study three metrics that have already been introduced in the literature: the RobinsonFoulds distance, the tripartitions distance and the $mu$distance. They generalize to networks the classical RobinsonFoulds or partition distance for phylogenetic trees. We analyze the behavior of these metrics by studying their least and largest values and when they achieve them. As a byproduct of this study, we obtain tight bounds on the size of a treechild time consistent phylogenetic network. © 2006 IEEE."



Leo van Iersel. Algorithms, Haplotypes and Phylogenetic Networks. PhD thesis, Eindhoven University of Technology, The Netherlands, 2009. Keywords: evaluation, explicit network, exponential algorithm, FPT, from triplets, galled tree, level k phylogenetic network, mu distance, phylogenetic network, phylogeny, polynomial, Program Level2, Program Marlon, Program Simplistic, Program T REX, reconstruction. Note: http://www.win.tue.nl/~liersel/thesis_vaniersel_viewing.pdf.



ThuHien To and
Michel Habib. Levelk Phylogenetic Networks Are Constructable from a Dense Triplet Set in Polynomial Time. In CPM09, (5577):275288, springer, 2009. Keywords: explicit network, from triplets, level k phylogenetic network, minimum number, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/0901.1657.
Toggle abstract
"For a given dense triplet set Τ there exist two natural questions [7]: Does there exist any phylogenetic network consistent with Τ? In case such networks exist, can we find an effective algorithm to construct one? For cases of networks of levels k = 0, 1 or 2, these questions were answered in [1,6,7,8,10] with effective polynomial algorithms. For higher levels k, partial answers were recently obtained in [11] with an O(/Τ/k+1)time algorithm for simple networks. In this paper, we give a complete answer to the general case, solving a problem proposed in [7]. The main idea of our proof is to use a special property of SNsets in a levelk network. As a consequence, for any fixed k, we can also find a levelk network with the minimum number of reticulations, if one exists, in polynomial time. © 2009 Springer Berlin Heidelberg."



Philippe Gambette,
Vincent Berry and
Christophe Paul. The structure of levelk phylogenetic networks. In CPM09, Vol. 5577:289300 of LNCS, springer, 2009. Keywords: coalescent, explicit network, galled tree, level k phylogenetic network, phylogenetic network, Program Recodon. Note: http://hallirmm.ccsd.cnrs.fr/lirmm00371485/en/.
Toggle abstract
"Evolution is usually described as a phylogenetic tree, but due to some exchange of genetic material, it can be represented as a phylogenetic network which has an underlying tree structure. The notion of level was recently introduced as a parameter on realistic kinds of phylogenetic networks to express their complexity and treelikeness. We study the structure of levelk networks, and how they can be decomposed into levelk generators. We also provide a polynomial time algorithm which takes as input the set of levelk generators and builds the set of level(k + 1) generators. Finally, with a simulation study, we evaluate the proportion of levelk phylogenetic networks among networks generated according to the coalescent model with recombination. © 2009 Springer Berlin Heidelberg."



Daniel H. Huson,
Regula Rupp,
Vincent Berry,
Philippe Gambette and
Christophe Paul. Computing Galled Networks from Real Data. In ISMBECCB09, Vol. 25(12):i85i93 of BIO, 2009. Keywords: abstract network, cluster containment, explicit network, FPT, from clusters, from rooted trees, galled network, NP complete, phylogenetic network, phylogeny, polynomial, Program Dendroscope, reconstruction. Note: http://hallirmm.ccsd.cnrs.fr/lirmm00368545/en/.
Toggle abstract
"Motivation: Developing methods for computing phylogenetic networks from biological data is an important problem posed by molecular evolution and much work is currently being undertaken in this area. Although promising approaches exist, there are no tools available that biologists could easily and routinely use to compute rooted phylogenetic networks on real datasets containing tens or hundreds of taxa. Biologists are interested in clades, i.e. groups of monophyletic taxa, and these are usually represented by clusters in a rooted phylogenetic tree. The problem of computing an optimal rooted phylogenetic network from a set of clusters, is hard, in general. Indeed, even the problem of just determining whether a given network contains a given cluster is hard. Hence, some researchers have focused on topologically restricted classes of networks, such as galled trees and levelk networks, that are more tractable, but have the practical drawback that a given set of clusters will usually not possess such a representation. Results: In this article, we argue that galled networks (a generalization of galled trees) provide a good tradeoff between level of generality and tractability. Any set of clusters can be represented by some galled network and the question whether a cluster is contained in such a network is easy to solve. Although the computation of an optimal galled network involves successively solving instances of two different NPcomplete problems, in practice our algorithm solves this problem exactly on large datasets containing hundreds of taxa and many reticulations in seconds, as illustrated by a dataset containing 279 prokaryotes. © 2009 The Author(s)."





Josh Voorkamp né Collins. Rekernelisation Algorithms in Hybrid Phylogenies. Master's thesis, University of Canterbury, New Zealand, 2009. Keywords: agreement forest, explicit network, FPT, from rooted trees, from unrooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, software. Note: http://hdl.handle.net/10092/2852.





Gabriel Valiente. Combinatorial Pattern Matching Algorithms in Computational Biology Using Perl and R. Pages 184208, Taylor & Francis/CRC Press, 2009. Keywords: counting, distance between networks, galled tree, generation, phylogenetic network, phylogeny, survey, time consistent network, tree child network, tree sibling network. Note: http://books.google.fr/books?id=F4YIIUWb7yMC.



Chris Whidden. A Unifying View on Approximation and FPT of Agreement Forests. Master's thesis, Dalhousie University, Canada, 2009. Keywords: agreement forest, approximation, explicit network, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, reconstruction, SPR distance. Note: http://web.cs.dal.ca/~whidden/MCSThesis09.pdf.



Chris Whidden and
Norbert Zeh. A Unifying View on Approximation and FPT of Agreement Forests. In WABI09, Vol. 5724:390402 of LNCS, Springer, 2009. Keywords: agreement forest, approximation, explicit network, FPT, minimum number, phylogenetic network, phylogeny, reconstruction. Note: https://www.cs.dal.ca/sites/default/files/technical_reports/CS200902.pdf.
Toggle abstract
"We provide a unifying view on the structure of maximum (acyclic) agreement forests of rooted and unrooted phylogenies. This enables us to obtain linear or O(n log n)time 3approximation and improved fixedparameter algorithms for the subtree prune and regraft distance between two rooted phylogenies, the tree bisection and reconnection distance between two unrooted phylogenies, and the hybridization number of two rooted phylogenies. © 2009 Springer Berlin Heidelberg."



Gabriel Cardona,
Francesc Rosselló and
Gabriel Valiente. Tripartitions do not always discriminate phylogenetic networks. In MBIO, Vol. 211(2):356370, 2008. Keywords: distance between networks, phylogenetic network, phylogeny, Program Bio PhyloNetwork, tree child network, tripartition distance. Note: http://arxiv.org/abs/0707.2376, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0904/valiente/.
Toggle abstract
"Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of nontreelike evolutionary events, like recombination, hybridization, or lateral gene transfer. In a recent series of papers devoted to the study of reconstructibility of phylogenetic networks, Moret, Nakhleh, Warnow and collaborators introduced the socalled tripartition metric for phylogenetic networks. In this paper we show that, in fact, this tripartition metric does not satisfy the separation axiom of distances (zero distance means isomorphism, or, in a more relaxed version, zero distance means indistinguishability in some specific sense) in any of the subclasses of phylogenetic networks where it is claimed to do so. We also present a subclass of phylogenetic networks whose members can be singled out by means of their sets of tripartitions (or even clusters), and hence where the latter can be used to define a meaningful metric. © 2007 Elsevier Inc. All rights reserved."



Leo van Iersel,
Judith Keijsper,
Steven Kelk,
Leen Stougie,
Ferry Hagen and
Teun Boekhout. Constructing level2 phylogenetic networks from triplets. In RECOMB08, Vol. 4955:450462 of LNCS, springer, 2008. Keywords: explicit network, from triplets, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, polynomial, Program Level2, reconstruction. Note: http://homepages.cwi.nl/~iersel/level2full.pdf. An appendix with proofs can be found here http://arxiv.org/abs/0707.2890.
Toggle abstract
"Jansson and Sung showed that, given a dense set of input triplets T (representing hypotheses about the local evolutionary relationships of triplets of taxa), it is possible to determine in polynomial time whether there exists a level1 network consistent with T, and if so, to construct such a network [24]. Here, we extend this work by showing that this problem is even polynomial time solvable for the construction of level2 networks. This shows that, assuming density, it is tractable to construct plausible evolutionary histories from input triplets even when such histories are heavily nontreelike. This further strengthens the case for the use of tripletbased methods in the construction of phylogenetic networks. We also implemented the algorithm and applied it to yeast data. © 2009 IEEE."



Gabriel Cardona,
Francesc Rosselló and
Gabriel Valiente. A Perl Package and an Alignment Tool for Phylogenetic Networks. In BMCB, Vol. 9:175, 2008. Keywords: distance between networks, phylogenetic network, phylogeny, Program Bio PhyloNetwork, tree child network, tree sibling network. Note: http://dx.doi.org/10.1186/147121059175.
Toggle abstract
"Background: Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of evolutionary events acting at the population level, like recombination between genes, hybridization between lineages, and lateral gene transfer. While most phylogenetics tools implement a wide range of algorithms on phylogenetic trees, there exist only a few applications to work with phylogenetic networks, none of which are opensource libraries, and they do not allow for the comparative analysis of phylogenetic networks by computing distances between them or aligning them. Results: In order to improve this situation, we have developed a Perl package that relies on the BioPerl bundle and implements many algorithms on phylogenetic networks. We have also developed a Java applet that makes use of the aforementioned Perl package and allows the user to make simple experiments with phylogenetic networks without having to develop a program or Perl script by him or herself. Conclusion: The Perl package is available as part of the BioPerl bundle, and can also be downloaded. A webbased application is also available (see availability and requirements). The Perl package includes full documentation of all its features. © 2008 Cardona et al; licensee BioMed Central Ltd."



Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. A Distance Metric for a Class of TreeSibling Phylogenetic Networks. In BIO, Vol. 24(13):14811488, 2008. Keywords: distance between networks, phylogenetic network, phylogeny, polynomial, tree sibling network. Note: http://dx.doi.org/10.1093/bioinformatics/btn231.
Toggle abstract
"Motivation: The presence of reticulate evolutionary events in phylogenies turn phylogenetic trees into phylogenetic networks. These events imply in particular that there may exist multiple evolutionary paths from a nonextant species to an extant one, and this multiplicity makes the comparison of phylogenetic networks much more difficult than the comparison of phylogenetic trees. In fact, all attempts to define a sound distance measure on the class of all phylogenetic networks have failed so far. Thus, the only practical solutions have been either the use of rough estimates of similarity (based on comparison of the trees embedded in the networks), or narrowing the class of phylogenetic networks to a certain class where such a distance is known and can be efficiently computed. The first approach has the problem that one may identify two networks as equivalent, when they are not; the second one has the drawback that there may not exist algorithms to reconstruct such networks from biological sequences. Results: We present in this articlea distance measure on the class of semibinary treesibling time consistent phylogenetic networks, which generalize treechild time consistent phylogenetic networks, and thus also galledtrees. The practical interest of this distance measure is 2fold: it can be computed in polynomial time by means of simple algorithms, and there also exist polynomialtime algorithms for reconstructing networks of this class from DNA sequence data. © 2008 The Author(s)."



Simone Linz. Reticulation in evolution. PhD thesis, HeinrichHeineUniversity, Düsseldorf, Germany, 2008. Keywords: agreement forest, FPT, from rooted trees, lateral gene transfer, phylogenetic network, phylogeny, SPR distance, statistical model. Note: http://docserv.uniduesseldorf.de/servlets/DocumentServlet?id=8505.



Leo van Iersel and
Steven Kelk. Constructing the Simplest Possible Phylogenetic Network from Triplets. In ISAAC08, Vol. 5369:472483 of LNCS, springer, 2008. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, minimum number, phylogenetic network, phylogeny, polynomial, Program Marlon, Program Simplistic. Note: http://arxiv.org/abs/0805.1859.





Cuong Than and
Luay Nakhleh. SPRbased Tree Reconciliation: Nonbinary Trees and Multiple Solutions. In APBC08, Pages 251260, 2008. Keywords: evaluation, from rooted trees, lateral gene transfer, phylogenetic network, phylogeny, Program LatTrans, Program PhyloNet, reconstruction, SPR distance. Note: http://www.cs.rice.edu/~nakhleh/Papers/apbc08.pdf.





Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. Phylogenetic Networks: Justification, Models, Distances and Algorithms. In VI Jornadas de Matemática Discreta y Algorítmica (JMDA'08), 2008. Keywords: distance between networks, mu distance, phylogenetic network, phylogeny, polynomial, survey, time consistent network, tree child network, tripartition distance, triplet distance. Note: http://bioinfo.uib.es/media/uploaded/jmda2008_submission_611.pdf.



Cuong Than,
Derek Ruths and
Luay Nakhleh. PhyloNet: A Software Package for Analyzing and Reconstructing Reticulate Evolutionary Relationships. In BMCB, Vol. 9(322), 2008. Keywords: Program PhyloNet, software. Note: http://dx.doi.org/10.1186/147121059322.
Toggle abstract
"Background: Phylogenies, i.e., the evolutionary histories of groups of taxa, play a major role in representing the interrelationships among biological entities. Many software tools for reconstructing and evaluating such phylogenies have been proposed, almost all of which assume the underlying evolutionary history to be a tree. While trees give a satisfactory firstorder approximation for many families of organisms, other families exhibit evolutionary mechanisms that cannot be represented by trees. Processes such as horizontal gene transfer (HGT), hybrid speciation, and interspecific recombination, collectively referred to as reticulate evolutionary events, result in networks, rather than trees, of relationships. Various software tools have been recently developed to analyze reticulate evolutionary relationships, which include SplitsTree4, LatTrans, EEEP, HorizStory, and TREX. Results: In this paper, we report on the PhyloNet software package, which is a suite of tools for analyzing reticulate evolutionary relationships, or evolutionary networks, which are rooted, directed, acyclic graphs, leaflabeled by a set of taxa. These tools can be classified into four categories: (1) evolutionary network representation: reading/writing evolutionary networks in a newly devised compact form; (2) evolutionary network characterization: analyzing evolutionary networks in terms of three basic building blocks  trees, clusters, and tripartitions; (3) evolutionary network comparison: comparing two evolutionary networks in terms of topological dissimilarities, as well as fitness to sequence evolution under a maximum parsimony criterion; and (4) evolutionary network reconstruction: reconstructing an evolutionary network from a species tree and a set of gene trees. Conclusion: The software package, PhyloNet, offers an array of utilities to allow for efficient and accurate analysis of evolutionary networks. The software package will help significantly in analyzing large data sets, as well as in studying the performance of evolutionary network reconstruction methods. Further, the software package supports the proposed eNewick format for compact representation of evolutionary networks, a feature that allows for efficient interoperability of evolutionary network software tools. Currently, all utilities in PhyloNet are invoked on the command line. © 2008 Than et al; licensee BioMed Central Ltd."



Ernst Althaus and
Rouven Naujoks. Reconstructing Phylogenetic Networks with One Recombination. In Proceedings of the seventh International Workshop on Experimental Algorithms (WEA'08), Vol. 5038:275288 of LNCS, springer, 2008. Keywords: enumeration, explicit network, exponential algorithm, from sequences, generation, parsimony, phylogenetic network, phylogeny, reconstruction, unicyclic network. Note: http://dx.doi.org/10.1007/9783540685524_21.
Toggle abstract
"In this paper we propose a new method for reconstructing phylogenetic networks under the assumption that recombination events have occurred rarely. For a fixed number of recombinations, we give a generalization of the maximum parsimony criterion. Furthermore, we describe an exact algorithm for one recombination event and show that in this case our method is not only able to identify the recombined sequence but also to reliably reconstruct the complete evolutionary history. © 2008 SpringerVerlag Berlin Heidelberg."



Miguel Arenas,
Gabriel Valiente and
David Posada. Characterization of reticulate networks based on the coalescent with recombination. In MBE, Vol. 25(12):25172520, 2008. Keywords: coalescent, evaluation, explicit network, galled tree, phylogenetic network, phylogeny, Program Recodon, regular network, simulation, tree child network, tree sibling network. Note: http://dx.doi.org/10.1093/molbev/msn219.
Toggle abstract
"Phylogenetic networks aim to represent the evolutionary history of taxa. Within these, reticulate networks are explicitly able to accommodate evolutionary events like recombination, hybridization, or lateral gene transfer. Although several metrics exist to compare phylogenetic networks, they make several assumptions regarding the nature of the networks that are not likely to be fulfilled by the evolutionary process. In order to characterize the potential disagreement between the algorithms and the biology, we have used the coalescent with recombination to build the type of networks produced by reticulate evolution and classified them as regular, tree sibling, tree child, or galled trees. We show that, as expected, the complexity of these reticulate networks is a function of the population recombination rate. At small recombination rates, most of the networks produced are already more complex than regular or tree sibling networks, whereas with moderate and large recombination rates, no network fit into any of the standard classes. We conclude that new metrics still need to be devised in order to properly compare two phylogenetic networks that have arisen from reticulating evolutionary process. © 2008 The Authors."



Gabriel Cardona,
Francesc Rosselló and
Gabriel Valiente. Extended Newick: It is Time for a Standard Representation. In BMCB, Vol. 9:532, 2008. Keywords: evaluation, explicit network, phylogenetic network, Program Bio PhyloNetwork, Program Dendroscope, Program NetGen, Program PhyloNet, Program SplitsTree, Program TCS, visualization. Note: http://bioinfo.uib.es/media/uploaded/bmc2008enewicksub.pdf.



Cuong Than,
Guohua Jin and
Luay Nakhleh. Integrating Sequence and Topology for Efficient and Accurate Detection of Horizontal Gene Transfer. In Proceedings of the Sixth RECOMB Comparative Genomics Satellite Workshop (RECOMBCG'08), Vol. 5267:113127 of LNCS, springer, 2008. Keywords: bootstrap, explicit network, from rooted trees, from sequences, lateral gene transfer, phylogenetic network, phylogeny, Program Nepal, Program PhyloNet, reconstruction. Note: http://www.cs.rice.edu/~nakhleh/Papers/recombcg08.pdf, slides available at http://igm.univmlv.fr/RCG08/RCG08slides/Cuong_Than_RCG08.pdf.



Magnus Bordewich,
Simone Linz,
Katherine St. John and
Charles Semple. A reduction algorithm for computing the hybridization number of two trees. In EBIO, Vol. 3:8698, 2007. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNumber. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BLSS07.pdf.



Magnus Bordewich and
Charles Semple. Computing the minimum number of hybridization events for a consistent evolutionary history. In DAM, Vol. 155:914918, 2007. Keywords: agreement forest, approximation, APX hard, explicit network, from rooted trees, hybridization, inapproximability, NP complete, phylogenetic network, phylogeny, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS06a.pdf.





Daniel H. Huson. Split networks and Reticulate Networks. In
Olivier Gascuel and
Mike Steel editors, Reconstructing Evolution, New Mathematical and Computational Advances, Pages 247276, Oxford University Press, 2007. Keywords: abstract network, consensus, from rooted trees, from sequences, from splits, from unrooted trees, galled tree, hybridization, phylogenetic network, phylogeny, Program Beagle, Program Spectronet, Program SplitsTree, Program SPNet, recombination, reconstruction, split network, survey. Note: similar to http://wwwab.informatik.unituebingen.de/research/phylonets/GCB2006.pdf.



Daniel H. Huson and
Tobias Kloepper. Beyond Galled Trees  Decomposition and Computation of Galled Networks. In RECOMB07, Vol. 4453:211225 of LNCS, springer, 2007. Keywords: FPT, from splits, from trees, galled network, phylogenetic network, phylogeny, Program SplitsTree, reconstruction. Note: http://dx.doi.org/10.1007/9783540716815_15, errata..







Cuong Than,
Derek Ruths,
Hideki Innan and
Luay Nakhleh. Confounding Factors in HGT Detection: Statistical Error, Coalescent Effects, and Multiple Solutions. In JCB, Vol. 14(4):517535, 2007. Keywords: enumeration, explicit network, from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, Program LatTrans, Program PhyloNet. Note: http://www.cs.rice.edu/~nakhleh/Papers/recombcg06jcb.pdf.



Jesper Jansson and
WingKin Sung. Inferring a level1 phylogenetic network from a dense set of rooted triplets. In TCS, Vol. 363(1):6068, 2006. 1 comment Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.df.lth.se/~jj/Publications/ipnrt8_TCS2006.pdf.
Toggle abstract
"We consider the following problem: Given a set T of rooted triplets with leaf set L, determine whether there exists a phylogenetic network consistent with T, and if so, construct one. We show that if no restrictions are placed on the hybrid nodes in the solution, the problem is trivially solved in polynomial time by a simple sorting networkbased construction. For the more interesting (and biologically more motivated) case where the solution is required to be a level1 phylogenetic network, we present an algorithm solving the problem in O ( T 2) time when T is dense, i.e., when T contains at least one rooted triplet for each cardinality three subset of L. We also give an O ( T 5 / 3)time algorithm for finding the set of all phylogenetic networks having a single hybrid node attached to exactly one leaf (and having no other hybrid nodes) that are consistent with a given dense set of rooted triplets. © 2006 Elsevier B.V. All rights reserved."





Cuong Than,
Derek Ruths,
Hideki Innan and
Luay Nakhleh. Identifiability Issues in PhylogenyBased Detection of Horizontal Gene Transfer. In Proceedings of the Fourth RECOMB Comparative Genomics Satellite Workshop (RECOMBCG'06), Vol. 4205:215229 of LNCS, springer, 2006. 1 comment Keywords: enumeration, explicit network, from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, Program LatTrans, Program PhyloNet. Note: http://www.cs.rice.edu/~nakhleh/Papers/recombcg06final.pdf.



Robert G. Beiko and
Nicholas Hamilton. Phylogenetic identification of lateral genetic transfer events. In BMCEB, Vol. 6(15), 2006. Keywords: evaluation, from rooted trees, from unrooted trees, lateral gene transfer, Program EEEP, Program HorizStory, Program LatTrans, reconstruction, software, SPR distance. Note: http://dx.doi.org/10.1186/14712148615.
Toggle abstract
"Background: Lateral genetic transfer can lead to disagreements among phylogenetic trees comprising sequences from the same set of taxa. Where topological discordance is thought to have arisen through genetic transfer events, tree comparisons can be used to identify the lineages that may have shared genetic information. An 'edit path' of one or more transfer events can be represented with a series of subtree prune and regraft (SPR) operations, but finding the optimal such set of operations is NPhard for comparisons between rooted trees, and may be so for unrooted trees as well. Results: Efficient Evaluation of Edit Paths (EEEP) is a new tree comparison algorithm that uses evolutionarily reasonable constraints to identify and eliminate many unproductive search avenues, reducing the time required to solve many edit path problems. The performance of EEEP compares favourably to that of other algorithms when applied to strictly bifurcating trees with specified numbers of SPR operations. We also used EEEP to recover edit paths from over 19 000 unrooted, incompletely resolved protein trees containing up to 144 taxa as part of a large phylogenomic study. While inferred protein trees were far more similar to a reference supertree than random trees were to each other, the phylogenetic distance spanned by random versus inferred transfer events was similar, suggesting that real transfer events occur most frequently between closely related organisms, but can span large phylogenetic distances as well. While most of the protein trees examined here were very similar to the reference supertree, requiring zero or one edit operations for reconciliation, some trees implied up to 40 transfer events within a single orthologous set of proteins. Conclusion: Since sequence trees typically have no implied root and may contain unresolved or multifurcating nodes, the strategy implemented in EEEP is the most appropriate for phylogenomic analyses. The high degree of consistency among inferred protein trees shows that vertical inheritance is the dominant pattern of evolution, at least for the set of organisms considered here. However, the edit paths inferred using EEEP suggest an important role for genetic transfer in the evolution of microbial genomes as well. © 2006Beiko and Hamilton; licensee BioMed Central Ltd."



Patricia Buendia and
Giri Narasimhan. Serial NetEvolve: A flexible utility for generating seriallysampled sequences along a tree or recombinant network. In BIO, Vol. 18(22):23132314, 2006. Keywords: generation, phylogenetic network, phylogeny, Program Serial NetEvolve, Program Treevolve, recombination, software. Note: http://dx.doi.org/10.1093/bioinformatics/btl387.
Toggle abstract
"Summary: Serial NetEvolve is a flexible simulation program that generates DNA sequences evolved along a tree or recombinant network. It offers a userfriendly Windows graphical interface and a Windows or Linux simulator with a diverse selection of parameters to control the evolutionary model. Serial NetEvolve is a modification of the Treevolve program with the following additional features: simulation of seriallysampled data, the choice of either a clocklike or a variable rate model of sequence evolution, sampling from the internal nodes and the output of the randomly generated tree or network in our newly proposed NeTwick format. © 2006 Oxford University Press."



Mihaela Baroni and
Mike Steel. Accumulation Phylogenies. In ACOM, Vol. 10(1):1930, 2006. Keywords: abstract network, from clusters, from distances, phylogenetic network, phylogeny, polynomial, reconstruction, regular network. Note: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.137.1960.
Toggle abstract
"We investigate the computational complexity of a new combinatorial problem of inferring a smallest possible multilabeled phylogenetic tree (MUL tree) which is consistent with each of the rooted triplets in a given set. We prove that even the restricted case of determining if there exists a MUL tree consistent with the input and having just one leaf duplication is NPhard. Furthermore, we show that the general minimization problem is NPhard to approximate within a ratio of n 1ε for any constant 0<ε≤1, where n denotes the number of distinct leaf labels in the input set, although a simple polynomialtime approximation algorithm achieves the approximation ratio n. We also provide an exact algorithm for the problem running in O *(7 n ) time and O *(3 n ) space. © 2009 SpringerVerlag Berlin Heidelberg."



Mihaela Baroni,
Stefan Grünewald,
Vincent Moulton and
Charles Semple. Bounding the number of hybridization events for a consistent evolutionary history. In JOMB, Vol. 51(2):171182, 2005. Keywords: agreement forest, bound, explicit network, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BGMS05.pdf.
Toggle abstract
"Evolutionary processes such as hybridisation, lateral gene transfer, and recombination are all key factors in shaping the structure of genes and genomes. However, since such processes are not always best represented by trees, there is now considerable interest in using more general networks instead. For example, in recent studies it has been shown that networks can be used to provide lower bounds on the number of recombination events and also for the number of lateral gene transfers that took place in the evolutionary history of a set of molecular sequences. In this paper we describe the theoretical performance of some related bounds that result when merging pairs of trees into networks. © SpringerVerlag 2005."



Magnus Bordewich and
Charles Semple. On the computational complexity of the rooted subtree prune and regraft distance. In ACOM, Vol. 8:409423, 2005. Keywords: agreement forest, from rooted trees, NP complete, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS04.pdf.
Toggle abstract
"The graphtheoretic operation of rooted subtree prune and regraft is increasingly being used as a tool for understanding and modelling reticulation events in evolutionary biology. In this paper, we show that computing the rooted subtree prune and regraft distance between two rooted binary phylogenetic trees on the same label set is NPhard. This resolves a longstanding open problem. Furthermore, we show that this distance is fixed parameter tractable when parameterised by the distance between the two trees."



Charles Choy,
Jesper Jansson,
Kunihiko Sadakane and
WingKin Sung. Computing the maximum agreement of phylogenetic networks. In TCS, Vol. 335(1):93107, 2005. Keywords: dynamic programming, FPT, level k phylogenetic network, MASN, NP complete, phylogenetic network, phylogeny. Note: http://www.df.lth.se/~jj/Publications/masn8_TCS2005.pdf.
Toggle abstract
"We introduce the maximum agreement phylogenetic subnetwork problem (MASN) for finding branching structure shared by a set of phylogenetic networks. We prove that the problem is NPhard even if restricted to three phylogenetic networks and give an O(n2)time algorithm for the special case of two level1 phylogenetic networks, where n is the number of leaves in the input networks and where N is called a levelf phylogenetic network if every biconnected component in the underlying undirected graph induces a subgraph of N containing at most f nodes with indegree 2. We also show how to extend our technique to yield a polynomialtime algorithm for any two levelf phylogenetic networks N1,N2 satisfying f=O(logn); more precisely, its running time is O(V(N1)·V(N2)·2f1+f2), where V(Ni) and fi denote the set of nodes in Ni and the level of Ni, respectively, for i∈{1,2}. © 2005 Elsevier B.V. All rights reserved."





David A. Morrison. Networks in phylogenetic analysis: new tools for population biology. In IJP, Vol. 35:567582, 2005. Keywords: median network, NeighborNet, phylogenetic network, phylogeny, population genetics, Program Network, Program Spectronet, Program SplitsTree, Program T REX, Program TCS, reconstruction, reticulogram, split decomposition, survey. Note: http://hem.fyristorg.com/acacia/papers/networks.pdf.



Luay Nakhleh and
LiSan Wang. Phylogenetic Networks, Trees, and Clusters. In IWBRA05, Vol. 3515:919926 of LNCS, springer, 2005. Keywords: cluster containment, evaluation, from clusters, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, tree child network, tree containment. Note: http://www.cs.rice.edu/~nakhleh/Papers/NakhlehWang.pdf.



Luay Nakhleh and
LiSan Wang. Phylogenetic Networks: Properties and Relationship to Trees and Clusters. In TCSB2, Vol. 3680:8299 of LNCS, springer, 2005. Keywords: cluster containment, evaluation, from clusters, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, tree child network, tree containment. Note: http://www.cs.rice.edu/~nakhleh/Papers/LNCS_TCSB05.pdf.





Mihaela Baroni,
Charles Semple and
Mike Steel. A framework for representing reticulate evolution. In ACOM, Vol. 8:398401, 2004. Keywords: explicit network, from clusters, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, regular network, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSS04.pdf.
Toggle abstract
"Acyclic directed graphs (ADGs) are increasingly being viewed as more appropriate for representing certain evolutionary relationships, particularly in biology, than rooted trees. In this paper, we develop a framework for the analysis of these graphs which we call hybrid phylogenies. We are particularly interested in the problem whereby one is given a set of phylogenetic trees and wishes to determine a hybrid phylogeny that 'embeds' each of these trees and which requires the smallest number of hybridisation events. We show that this quantity can be greatly reduced if additional species are involved, and investigate other combinatorial aspects of this and related questions."



Charles Choy,
Jesper Jansson,
Kunihiko Sadakane and
WingKin Sung. Computing the maximum agreement of phylogenetic networks. In Proceedings of Computing: the Tenth Australasian Theory Symposium (CATS'04), Vol. 91:134147 of Electronic Notes in Theoretical Computer Science, 2004. Keywords: dynamic programming, FPT, level k phylogenetic network, MASN, NP complete, phylogenetic network, phylogeny. Note: http://www.df.lth.se/~jj/Publications/masn6_CATS2004.pdf.
Toggle abstract
"We introduce the maximum agreement phylogenetic subnetwork problem (MASN) of finding a branching structure shared by a set of phylogenetic networks. We prove that the problem is NPhard even if restricted to three phylogenetic networks and give an O(n2)time algorithm for the special case of two level1 phylogenetic networks, where n is the number of leaves in the input networks and where N is called a levelf phylogenetic network if every biconnected component in the underlying undirected graph contains at most f nodes having indegree 2 in N. Our algorithm can be extended to yield a polynomialtime algorithm for two levelf phylogenetic networks N 1,N2 for any f which is upperbounded by a constant; more precisely, its running time is O(V(N1)·V(N 2)·4f), where V(Ni) denotes the set of nodes of Ni. © 2004 Published by Elsevier B.V."



Katharina Huber,
Michael Langton,
David Penny,
Vincent Moulton and
Mike Hendy. Spectronet: A package for computing spectra and median networks. In ABIO, Vol. 1(3):159161, 2004. Keywords: from splits, median network, phylogenetic network, phylogeny, Program Spectronet, software, split, visualization. Note: http://citeseer.ist.psu.edu/631776.html.
Toggle abstract
Spectronet is a package that uses various methods for exploring and visualising complex evolutionary signals. Given an alignment in NEXUS format, the package works by computing a collection of weighted splits or bipartitions of the taxa and then allows the user to interactively analyse the resulting collection using tools such as Lentoplots and median networks. The package is highly interactive and available for PCs.



Jesper Jansson and
WingKin Sung. Inferring a level1 phylogenetic network from a dense set of rooted triplets. In COCOON04, Vol. 3106:462471 of LNCS, springer, 2004. 1 comment Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.df.lth.se/~jj/Publications/ipnrt6_COCOON2004.pdf.



Luay Nakhleh. Phylogenetic Networks. PhD thesis, University of Texas at Austin, U.S.A., 2004. Keywords: distance between networks, evaluation, generation, phylogenetic network, phylogeny, Program SPNet, reconstruction, split, statistical model, tree sibling network. Note: http://www.library.utexas.edu/etd/d/2004/nakhlehl042/nakhlehl042.pdf.



Mike Hallett,
Jens Lagergren and
Ali Tofigh. Simultaneous Identification of Duplications and Lateral Transfers. In RECOMB04, Pages 347356, 2004. Keywords: duplication, explicit network, FPT, from rooted trees, from species tree, lateral gene transfer, loss, NP complete, parsimony, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.nada.kth.se/~jensl/p164hallett.pdf.



Mihaela Baroni. Hybrid phylogenies : a graphbased approach to represent reticulate evolution. PhD thesis, University of Canterbury, New Zealand, 2004. Keywords: explicit network, from rooted trees, galled tree, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, regular network. Note: http://ir.canterbury.ac.nz/bitstream/10092/4803/1/baroni_thesis.pdf.





HansJürgen Bandelt,
Vincent Macaulay and
Martin Richards. Median networks: speedy construction and greedy reduction, one simulation, and two case studies from human mtDNA. In MPE, Vol. 16:828, 2000. Keywords: from sequences, from splits, median network, phylogenetic network, phylogeny, reconstruction. Note: http://www.stats.gla.ac.uk/~vincent/papers/speedy.pdf.
Toggle abstract
"Molecular data sets characterized by few phylogenetically informative characters with a broad spectrum of mutation rates, such as intraspecific controlregion sequence variation of human mitochondrial DNA (mtDNA), can be usefully visualized in the form of median networks. Here we provide a stepbystep guide to the construction of such networks by hand. We improve upon a previously implemented algorithm by outlining an efficient parametrized strategy amenable to large data sets, greedy reduction, which makes it possible to reconstruct some of the confounding recurrent mutations. This entails some postprocessing as well, which assists in capturing more parsimonious solutions. To simplify the creation of the resulting network by hand, we describe a speedy approach to network construction, based on a careful planning of the processing order. A coalescent simulation tailored to human mtDNA variation in Eurasia testifies to the usefulness of reduced median networks, while highlighting notorious problems faced by all phylogenetic methods in this context. Finally, we discuss two case studies involving the comparison of characters in the two hypervariable segments of the human mtDNA control region in the light of the worldwide controlregion sequence database, as well as additional restriction fragment length polymorphism information. We conclude that only a minority of the mutations that hit the second segment occur at sites that would have a mutation rate comparable to those at most sites in the first segment. Discarding the known 'noisy' sites of the second segment enhances the analysis. (C) 2000 Academic Press."







Ingo Althöfer. On optimal realizations of finite metric spaces by graphs. In Discrete and Computational Geometry, Vol. 3(1):103122, 1986. Keywords: NP complete, optimal realization, realization. Note: http://dx.doi.org/10.1007/BF02187901.
Toggle abstract
"Graph realizations of finite metric spaces have widespread applications, for example, in biology, economics, and information theory. The main results of this paper are: 1. Finding optimal realizations of integral metrics (which means all distances are integral) is NPcomplete. 2. There exist metric spaces with a continuum of optimal realizations. Furthermore, two conditions necessary for a weighted graph to be an optimal realization are given and an extremal problem arising in connection with the realization problem is investigated. © 1988 SpringerVerlag New York Inc."




