
Gabriel Cardona and
Louxin Zhang. Counting and Enumerating Tree-Child Networks and Their Subclasses. In JCSS, Vol. 114:84-104, 2020. Keywords: counting, enumeration, explicit network, galled network, galled tree, normal network, phylogenetic network, phylogeny, tree-child network.







Leo van Iersel,
Remie Janssen,
Mark Jones,
Yukihiro Murakami and
Norbert Zeh. Polynomial-Time Algorithms for Phylogenetic Inference Problems Involving Duplication and Reticulation. In TCBB, Vol. 17(1):14-26, 2020. Keywords: hybridization, minimum number, parental hybridization, phylogenetic network, phylogeny, reconstruction, weakly displaying. Note: http://pure.tudelft.nl/ws/portalfiles/portal/71270795/08798653.pdf.









Joan Carles Pons,
Charles Semple and
Mike Steel. Tree-based networks: characterisations, metrics, and support trees. In JOMB, Vol. 78(4):899-918, 2019. Keywords: characterization, explicit network, from network, phylogenetic network, phylogeny, time consistent network, tree-based network. Note: https://arxiv.org/abs/1710.07836.





Janosch Döcker,
Leo van Iersel,
Steven Kelk and
Simone Linz. Deciding the existence of a cherry-picking sequence is hard on two trees. In DAM, Vol. 260:131-143, 2019. Keywords: cherry-picking, explicit network, hybridization, minimum number, NP complete, phylogenetic network, phylogeny, reconstruction, temporal-hybridization number, time consistent network, tree-child network. Note: https://arxiv.org/abs/1712.02965.







Andreas Gunawan,
Hongwei Yan and
Louxin Zhang. Compression of Phylogenetic Networks and Algorithm for the Tree Containment Problem. In JCB, Vol. 25(3), 2019. Keywords: explicit network, phylogenetic network, phylogeny, polynomial, quasi-reticulation-visible network, reticulation-visible network, tree containment, tree-child network. Note: https://arxiv.org/abs/1806.07625.





Yukihiro Murakami,
Leo van Iersel,
Remie Janssen,
Mark Jones and
Vincent Moulton. Reconstructing Tree-Child Networks from Reticulate-Edge-Deleted Subnetworks. In BMB, Vol. 81:3823-3863, 2019. Keywords: from subnetworks, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction, tree-child network, uniqueness, valid network. Note: https://doi.org/10.1007/s1153801900641w.









Magnus Bordewich,
Charles Semple and
Nihan Tokac. Constructing tree-child networks from distance matrices. In Algorithmica, Vol. 80(8):2240-2259, 2018. Keywords: compressed network, explicit network, from distances, phylogenetic network, phylogeny, polynomial, reconstruction, tree-child network, uniqueness. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSN17.pdf.



Philippe Gambette,
Andreas Gunawan,
Anthony Labarre,
Stéphane Vialette and
Louxin Zhang. Solving the Tree Containment Problem in Linear Time for Nearly Stable Phylogenetic Networks. In DAM, Vol. 246:62-79, 2018. Keywords: explicit network, from network, from rooted trees, nearly-stable network, phylogenetic network, phylogeny, polynomial, tree containment. Note: https://halupecupem.archivesouvertes.fr/hal01575001/en/.



Sarah Bastkowski,
Daniel Mapleson,
Andreas Spillner,
Taoyang Wu,
Monika Balvociute and
Vincent Moulton. SPECTRE: a Suite of PhylogEnetiC Tools for Reticulate Evolution. In BIO, Vol. 34(6):1057-1058, 2018. Keywords: abstract network, NeighborNet, phylogenetic network, phylogeny, Program FlatNJ, Program QNet, Program SplitsTree, reconstruction, software, split network. Note: https://doi.org/10.1101/169177.



Katharina Huber,
Vincent Moulton,
Charles Semple and
Taoyang Wu. Quarnet inference rules for level-1 networks. In BMB, Vol. 80:2137-2153, 2018. Keywords: explicit network, from quarnets, from subnetworks, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: https://arxiv.org/abs/1711.06720.









Janosch Döcker and
Simone Linz. On the existence of a cherry-picking sequence. In TCS, Vol. 714:36-50, 2018. Keywords: cherry-picking, explicit network, from rooted trees, NP complete, phylogenetic network, phylogeny, reconstruction, temporal-hybridization number, time consistent network, tree-child network. Note: https://arxiv.org/abs/1712.04127.









Magnus Bordewich,
Katharina Huber,
Vincent Moulton and
Charles Semple. Recovering normal networks from shortest inter-taxa distance information. In JOMB, Vol. 77(3):571-594, 2018. Keywords: explicit network, from distances, normal network, phylogenetic network, phylogeny, polynomial, reconstruction, uniqueness. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BHMS18.pdf.













Sha Zhu and
James H. Degnan. Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent. In SB, Vol. 66(2):283-298, 2017. Keywords: branch length, coalescent, explicit network, from network, likelihood, phylogenetic network, phylogeny, Program Hybridcoal, Program HybridLambda, Program PhyloNet, software, uniqueness. Note: presentation available at https://www.youtube.com/watch?v=JLYGTfEZG7g.



Andreas Gunawan,
Bhaskar DasGupta and
Louxin Zhang. A decomposition theorem and two algorithms for reticulation-visible networks. In Information and Computation, Vol. 252:161-175, 2017. Keywords: cluster containment, explicit network, from clusters, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, reticulation-visible network, tree containment. Note: https://www.cs.uic.edu/~dasgupta/resume/publ/papers/Infor_Comput_IC4848_final.pdf.





Magnus Bordewich,
Simone Linz and
Charles Semple. Lost in space? Generalising subtree prune and regraft to spaces of phylogenetic networks. In JTB, Vol. 423:1-12, 2017. Keywords: distance between networks, explicit network, phylogenetic network, phylogeny, reticulation-visible network, SPR distance, tree-based network, tree-child network. Note: https://simonelinz.files.wordpress.com/2017/04/bls171.pdf.



Edwin Jacox,
Mathias Weller,
Eric Tannier and
Celine Scornavacca. Resolution and reconciliation of non-binary gene trees with transfers, duplications and losses. In BIO, Vol. 33(7):980-987, 2017. Keywords: duplication, explicit network, FPT, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/btw778.



Katharina Huber,
Vincent Moulton,
Mike Steel and
Taoyang Wu. Folding and unfolding phylogenetic trees and networks. In JOMB, Vol. 73(6):1761-1780, 2016. Keywords: compressed network, explicit network, FU-stable network, NP complete, phylogenetic network, phylogeny, tree containment, tree sibling network. Note: http://arxiv.org/abs/1506.04438.





Steven Kelk,
Leo van Iersel,
Celine Scornavacca and
Mathias Weller. Phylogenetic incongruence through the lens of Monadic Second Order logic. In JGAA, Vol. 20(2):189-215, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, MSOL, phylogenetic network, phylogeny, reconstruction. Note: http://jgaa.info/accepted/2016/KelkIerselScornavaccaWeller2016.20.2.pdf.













Mareike Fischer,
Leo van Iersel,
Steven Kelk and
Celine Scornavacca. On Computing The Maximum Parsimony Score Of A Phylogenetic Network. In SIDMA, Vol. 29(1):559-585, 2015. Keywords: APX hard, cluster containment, explicit network, FPT, from network, from sequences, integer linear programming, level k phylogenetic network, NP complete, parsimony, phylogenetic network, phylogeny, polynomial, Program MPNet, reconstruction, software. Note: http://arxiv.org/abs/1302.2430.









Sha Zhu,
James H. Degnan,
Sharyn J. Goldstein and
Bjarki Eldon. HybridLambda: simulation of multiple merger and Kingman gene genealogies in species networks and species trees. In BMCB, Vol. 16(292):1-7, 2015. Keywords: explicit network, from network, phylogenetic network, phylogeny, Program HybridLambda, simulation, software. Note: http://dx.doi.org/10.1186/s128590150721y.



Jessica W. Leigh and
David Bryant. PopART: full-feature software for haplotype network construction. In Methods in Ecology and Evolution, Vol. 6(9):1110-1116, 2015. Keywords: abstract network, from sequences, haplotype network, MedianJoining, phylogenetic network, phylogeny, population genetics, Program PopART, Program TCS, software. Note: http://dx.doi.org/10.1111/2041210X.12410.





Monika Balvociute,
Andreas Spillner and
Vincent Moulton. FlatNJ: A Novel Network-Based Approach to Visualize Evolutionary and Biogeographical Relationships. In Systematic Biology, Vol. 63(3):383-396, 2014. Keywords: abstract network, flat, phylogenetic network, phylogeny, Program FlatNJ, Program SplitsTree, split network. Note: http://dx.doi.org/10.1093/sysbio/syu001.
"Split networks are a type of phylogenetic network that allow visualization of conflict in evolutionary data. We present a new method for constructing such networks called FlatNetJoining (FlatNJ). A key feature of FlatNJ is that it produces networks that can be drawn in the plane in which labels may appear inside of the network. For complex data sets that involve, for example, non-neutral molecular markers, this can allow additional detail to be visualized as compared to previous methods such as split decomposition and NeighborNet. We illustrate the application of FlatNJ by applying it to whole HIV genome sequences, where recombination has taken place, fluorescent proteins in corals, where ancestral sequences are present, and mitochondrial DNA sequences from gall wasps, where biogeographical relationships are of interest. We find that the networks generated by FlatNJ can facilitate the study of genetic variation in the underlying molecular sequence data and, in particular, may help to investigate processes such as intralocus recombination. FlatNJ has been implemented in Java and is freely available at www.uea.ac.uk/computing/software/flatnj. [flat split system; NeighborNet; Phylogenetic network; QNet; split; split network.] © The Author(s) 2014."



Paul Cordue,
Simone Linz and
Charles Semple. Phylogenetic Networks that Display a Tree Twice. In BMB, Vol. 76(10):2664-2679, 2014. Keywords: from rooted trees, normal network, phylogenetic network, phylogeny, reconstruction, tree-child network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/CLS14.pdf.
"In the last decade, the use of phylogenetic networks to analyze the evolution of species whose past is likely to include reticulation events, such as horizontal gene transfer or hybridization, has gained popularity among evolutionary biologists. Nevertheless, the evolution of a particular gene can generally be described without reticulation events and therefore be represented by a phylogenetic tree. While this is not in contrast to each other, it places emphasis on the necessity of algorithms that analyze and summarize the treelike information that is contained in a phylogenetic network. We contribute to the toolbox of such algorithms by investigating the question of whether or not a phylogenetic network embeds a tree twice and give a quadratic-time algorithm to solve this problem for a class of networks that is more general than tree-child networks. © 2014, Society for Mathematical Biology."



Leo van Iersel and
Simone Linz. A quadratic kernel for computing the hybridization number of multiple trees. In IPL, Vol. 113:318-323, 2013. Keywords: explicit network, FPT, from rooted trees, kernelization, minimum number, phylogenetic network, phylogeny, Program Clustistic, Program MaafB, Program PIRN, reconstruction. Note: http://arxiv.org/abs/1203.4067, poster.
"It has recently been shown that the NP-hard problem of calculating the minimum number of hybridization events that is needed to explain a set of rooted binary phylogenetic trees by means of a hybridization network is fixed-parameter tractable if an instance of the problem consists of precisely two such trees. In this paper, we show that this problem remains fixed-parameter tractable for an arbitrarily large set of rooted binary phylogenetic trees. In particular, we present a quadratic kernel. © 2013 Elsevier B.V."



Chris Whidden,
Robert G. Beiko and
Norbert Zeh. Fixed-Parameter Algorithms for Maximum Agreement Forests. In SICOMP, Vol. 42(4):1431-1466, 2013. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: http://arxiv.org/abs/1108.2664, slides.
"We present new and improved fixed-parameter algorithms for computing maximum agreement forests of pairs of rooted binary phylogenetic trees. The size of such a forest for two trees corresponds to their subtree prune-and-regraft distance and, if the agreement forest is acyclic, to their hybridization number. These distance measures are essential tools for understanding reticulate evolution. Our algorithm for computing maximum acyclic agreement forests is the first depth-bounded search algorithm for this problem. Our algorithms substantially outperform the best previous algorithms for these problems. © 2013 Society for Industrial and Applied Mathematics."



Peter J. Humphries,
Simone Linz and
Charles Semple. On the complexity of computing the temporal hybridization number for two phylogenies. In DAM, Vol. 161:871-880, 2013. Keywords: agreement forest, APX hard, characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/TAFapx.pdf.
"Phylogenetic networks are now frequently used to explain the evolutionary history of a set of species for which a collection of gene trees, reconstructed from genetic material of different parts of the species' genomes, reveal inconsistencies. However, in the context of hybridization, the reconstructed networks are often not temporal. If a hybridization network is temporal, then it satisfies the time constraint of instantaneously occurring hybridization events; i.e. all species that are involved in such an event coexist in time. Furthermore, although a collection of phylogenetic trees can often be merged into a hybridization network that is temporal, many algorithms do not necessarily find such a network since their primary optimization objective is to minimize the number of hybridization events. In this paper, we present a characterization for when two rooted binary phylogenetic trees admit a temporal hybridization network. Furthermore, we show that the underlying optimization problem is APX-hard and, therefore, NP-hard. Thus, unless P=NP, it is unlikely that there are efficient algorithms for either computing an exact solution or approximating it within a ratio arbitrarily close to one. © 2012 Elsevier B.V. All rights reserved."





Eric Bapteste,
Leo van Iersel,
Axel Janke,
Scott Kelchner,
Steven Kelk,
James O. McInerney,
David A. Morrison,
Luay Nakhleh,
Mike Steel,
Leen Stougie and
James B. Whitfield. Networks: expanding evolutionary thinking. In Trends in Genetics, Vol. 29(8):439-441, 2013. Keywords: abstract network, explicit network, phylogenetic network, phylogeny, reconstruction. Note: http://bioinf.nuim.ie/wpcontent/uploads/2013/06/BaptesteTiG2013.pdf.
"Networks allow the investigation of evolutionary relationships that do not fit a tree model. They are becoming a leading tool for describing the evolutionary relationships between organisms, given the comparative complexities among genomes. © 2013 Elsevier Ltd."



Peter J. Humphries,
Simone Linz and
Charles Semple. Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies. In BMB, Vol. 75(10):1879-1890, 2013. Keywords: characterization, cherry-picking, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/CPSpaper.pdf.
"Recently, we have shown that calculating the minimum-temporal-hybridization number for a set P of rooted binary phylogenetic trees is NP-hard and have characterized this minimum number when P consists of exactly two trees. In this paper, we give the first characterization of the problem for P being arbitrarily large. The characterization is in terms of cherries and the existence of a particular type of sequence. Furthermore, in an online appendix to the paper, we show that this new characterization can be used to show that computing the minimum-temporal hybridization number for two trees is fixed-parameter tractable. © 2013 Society for Mathematical Biology."



Jeremy G. Sumner,
Barbara R. Holland and
Peter D. Jarvis. The algebra of the general Markov model on phylogenetic trees and networks. In BMB, Vol. 74(4):858-880, 2012. Keywords: abstract network, phylogenetic network, phylogeny, split, split network, statistical model. Note: http://arxiv.org/abs/1012.5165.
"It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and providing an analogous augmentation of the general Markov model has thus far been elusive. In this paper, we rectify this shortcoming by showing how to extend the general Markov model on trees to include incompatible edges; and even further to more general network models. This is achieved by exploring the algebra of the generators of the continuous-time Markov chain together with the "splitting" operator that generates the branching process on phylogenetic trees. For simplicity, we proceed by discussing the two state case and then show that our results are easily extended to more states with little complication. Intriguingly, upon restriction of the two state general Markov model to the parameter space of the binary symmetric model, our extension is indistinguishable from the Hadamard approach only on trees; as soon as any incompatible splits are introduced the two approaches give rise to differing probability distributions with disparate structure. Through exploration of a simple example, we give an argument that our extension to more general networks has desirable properties that the previous approaches do not share. In particular, our construction allows for convergent evolution of previously divergent lineages; a property that is of significant interest for biological applications. © 2011 Society for Mathematical Biology."



Andreas Spillner and
Vincent Moulton. Optimal algorithms for computing edge weights in planar split-networks. In Journal of Applied Mathematics and Computing, Vol. 39(1-2):1-13, 2012. Keywords: abstract network, from distances, phylogenetic network, phylogeny, reconstruction, split, split network. Note: http://dx.doi.org/10.1007/s121900110506z.
"In phylogenetics, biologists commonly compute split networks when trying to better understand evolutionary data. These graph-theoretical structures represent collections of weighted bipartitions or splits of a finite set, and provide a means to display conflicting evolutionary signals. The weights associated to the splits are used to scale the edges in the network and are often computed using some distance matrix associated with the data. In this paper we present optimal polynomial time algorithms for three basic problems that arise in this context when computing split weights for planar split-networks. These generalize algorithms that have been developed for special classes of split networks (namely, trees and outer-labeled planar networks). As part of our analysis, we also derive a Crofton formula for full flat split systems, structures that naturally arise when constructing planar split-networks. © 2011 Korean Society for Computational and Applied Mathematics."



Magnus Bordewich and
Charles Semple. Budgeted Nature Reserve Selection with diversity feature loss and arbitrary split systems. In JOMB, Vol. 64(1):69-85, 2012. Keywords: abstract network, approximation, diversity, phylogenetic network, polynomial, split network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS11.pdf.
"Arising in the context of biodiversity conservation, the Budgeted Nature Reserve Selection (BNRS) problem is to select, subject to budgetary constraints, a set of regions to conserve so that the phylogenetic diversity (PD) of the set of species contained within those regions is maximized. Here PD is measured across either a single rooted tree or a single unrooted tree. Nevertheless, in both settings, this problem is NP-hard. However, it was recently shown that, for each setting, there is a polynomial-time (1-1/e)-approximation algorithm for it and that this algorithm is tight. In the first part of the paper, we consider two extensions of BNRS. In the rooted setting we additionally allow for the disappearance of features, for varying survival probabilities across species, and for PD to be measured across multiple trees. In the unrooted setting, we extend to arbitrary split systems. We show that, despite these additional allowances, there remains a polynomial-time (1-1/e)-approximation algorithm for each extension. In the second part of the paper, we resolve a complexity problem on computing PD across an arbitrary split system left open by Spillner et al. © 2011 Springer-Verlag."



Celine Scornavacca,
Simone Linz and
Benjamin Albrecht. A first step towards computing all hybridization networks for two rooted binary phylogenetic trees. In JCB, Vol. 19:1227-1242, 2012. Keywords: agreement forest, explicit network, FPT, from rooted trees, phylogenetic network, phylogeny, Program Dendroscope, Program Hybroscale, reconstruction. Note: http://arxiv.org/abs/1109.3268.
"Recently, considerable effort has been put into developing fast algorithms to reconstruct a rooted phylogenetic network that explains two rooted phylogenetic trees and has a minimum number of hybridization vertices. With the standard approach to tackle this problem being combinatorial, the reconstructed network is rarely unique. From a biological point of view, it is therefore of importance to not only compute one network, but all possible networks. In this article, we make a first step toward approaching this goal by presenting the first algorithm, called allMAAFs, that calculates all maximum-acyclic-agreement forests for two rooted binary phylogenetic trees on the same set of taxa. © Copyright 2012, Mary Ann Liebert, Inc. 2012."



Steven Kelk,
Leo van Iersel,
Nela Lekic,
Simone Linz,
Celine Scornavacca and
Leen Stougie. Cycle killer... qu'est-ce que c'est? On the comparative approximability of hybridization number and directed feedback vertex set. In SIDMA, Vol. 26(4):1635-1656, 2012. Keywords: agreement forest, approximation, explicit network, from rooted trees, minimum number, phylogenetic network, phylogeny, Program CycleKiller, reconstruction. Note: http://arxiv.org/abs/1112.5359, about the title.
"We show that the problem of computing the hybridization number of two rooted binary phylogenetic trees on the same set of taxa X has a constant factor polynomial-time approximation if and only if the problem of computing a minimum-size feedback vertex set in a directed graph (DFVS) has a constant factor polynomial-time approximation. The latter problem, which asks for a minimum number of vertices to be removed from a directed graph to transform it into a directed acyclic graph, is one of the problems in Karp's seminal 1972 list of 21 NP-complete problems. Despite considerable attention from the combinatorial optimization community, it remains to this day unknown whether a constant factor polynomial-time approximation exists for DFVS. Our result thus places the (in)approximability of hybridization number in a much broader complexity context, and as a consequence we obtain that it inherits inapproximability results from the problem Vertex Cover. On the positive side, we use results from the DFVS literature to give an O(log r log log r) approximation for the hybridization number where r is the correct value. Copyright © by SIAM."



Yun Yu,
James H. Degnan and
Luay Nakhleh. The probability of a gene tree topology within a phylogenetic network with applications to hybridization detection. In PLoS Genetics, Vol. 8(4):e1002660, 2012. Keywords: AIC, BIC, explicit network, hybridization, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1371/journal.pgen.1002660.
"Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa. © 2012 Yu et al."



Dan Levy and
Lior Pachter. The Neighbor-Net Algorithm. In Advances in Applied Mathematics, Vol. 47(2):240-258, 2011. Keywords: abstract network, circular split system, evaluation, from distances, NeighborNet, phylogenetic network, phylogeny, split network. Note: http://arxiv.org/abs/math/0702515.
"The neighbor-joining algorithm is a popular phylogenetics method for constructing trees from dissimilarity maps. The neighbor-net algorithm is an extension of the neighbor-joining algorithm and is used for constructing split networks. We begin by describing the output of neighbor-net in terms of the tessellation of M̄_0^n(R) by associahedra. This highlights the fact that neighbor-net outputs a tree in addition to a circular ordering and we explain when the neighbor-net tree is the neighbor-joining tree. A key observation is that the tree constructed in existing implementations of neighbor-net is not a neighbor-joining tree. Next, we show that neighbor-net is a greedy algorithm for finding circular split systems of minimal balanced length. This leads to an interpretation of neighbor-net as a greedy algorithm for the traveling salesman problem. The algorithm is optimal for Kalmanson matrices, from which it follows that neighbor-net is consistent and has optimal radius 1/2. We also provide a statistical interpretation for the balanced length for a circular split system as the length based on weighted least squares estimates of the splits. We conclude with applications of these results and demonstrate the implications of our theorems for a recently published comparison of Papuan and Austronesian languages. © 2010 Elsevier Inc. All rights reserved."



Shlomo Moran,
Sagi Snir and
WingKin Sung. Partial Convex Recolorings of Trees and Galled Networks: Tight Upper and Lower bounds. In ACM Transactions on Algorithms, Vol. 7(4), 2011. Keywords: evaluation, galled tree, phylogenetic network. Note: http://www.cs.technion.ac.il/~moran/r/PS/gnetsTOA7Feb2007.pdf.
"A coloring of a graph is convex if the vertices that pertain to any color induce a connected subgraph; a partial coloring (which assigns colors to a subset of the vertices) is convex if it can be completed to a convex (total) coloring. Convex coloring has applications in fields such as phylogenetics, communication or transportation networks, etc. When a coloring of a graph is not convex, a natural question is how far it is from a convex one. This problem is denoted as convex recoloring (CR). While the initial works on CR defined and studied the problem on trees, recent efforts aim at either generalizing the underlying graphs or specializing the input colorings. In this work, we extend the underlying graph and the input coloring to partially colored galled networks. We show that although determining whether a coloring is convex on an arbitrary network is hard, it can be found efficiently on galled networks. We present a fixed parameter tractable algorithm that finds the recoloring distance of such a network whose running time is quadratic in the network size and exponential in that distance. This complexity is achieved by amortized analysis that uses a novel technique for contracting colored graphs that seems to be of independent interest. © 2011 ACM."



Josh Voorkamp né Collins,
Simone Linz and
Charles Semple. Quantifying hybridization in realistic time. In JCB, Vol. 18(10):1305-1318, 2011. Keywords: explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, software. Note: http://wwwcsif.cs.ucdavis.edu/~linzs/CLS10_interleave.pdf, software available at http://www.math.canterbury.ac.nz/~c.semple/software.shtml.
"Recently, numerous practical and theoretical studies in evolutionary biology aim at calculating the extent to which reticulation (for example, horizontal gene transfer, hybridization, or recombination) has influenced the evolution for a set of present-day species. It has been shown that inferring the minimum number of hybridization events that is needed to simultaneously explain the evolutionary history for a set of trees is an NP-hard and also fixed-parameter tractable problem. In this article, we give a new fixed-parameter algorithm for computing the minimum number of hybridization events for when two rooted binary phylogenetic trees are given. This newly developed algorithm is based on interleaving, a technique using repeated kernelization steps that are applied throughout the exhaustive search part of a fixed-parameter algorithm. To show that our algorithm runs efficiently to be applicable to a wide range of practical problem instances, we apply it to a grass data set and highlight the significant improvements in terms of running times in comparison to an algorithm that has previously been implemented. © 2011, Mary Ann Liebert, Inc."



Yun Yu,
Cuong Than,
James H. Degnan and
Luay Nakhleh. Coalescent Histories on Phylogenetic Networks and Detection of Hybridization Despite Incomplete Lineage Sorting. In Systematic Biology, Vol. 60(2):138-149, 2011. Keywords: coalescent, hybridization, lineage sorting, reconstruction, statistical model. Note: http://www.cs.rice.edu/~nakhleh/Papers/YuEtAlSB11.pdf.
"Analyses of the increasingly available genomic data continue to reveal the extent of hybridization and its role in the evolutionary diversification of various groups of species. We show, through extensive coalescent-based simulations of multilocus data sets on phylogenetic networks, how divergence times before and after hybridization events can result in incomplete lineage sorting with gene tree incongruence signatures identical to those exhibited by hybridization. Evolutionary analysis of such data under the assumption of a species tree model can miss all hybridization events, whereas analysis under the assumption of a species network model would grossly overestimate hybridization events. These issues necessitate a paradigm shift in evolutionary analysis under these scenarios, from a model that assumes a priori a single source of gene tree incongruence to one that integrates multiple sources in a unifying framework. We propose a framework of coalescence within the branches of a phylogenetic network and show how this framework can be used to detect hybridization despite incomplete lineage sorting. We apply the model to simulated data and show that the signature of hybridization can be revealed as long as the interval between the divergence times of the species involved in hybridization is not too small. We reanalyze a data set of 106 loci from 7 ingroup Saccharomyces species for which a species tree with no hybridization has been reported in the literature. Our analysis supports the hypothesis that hybridization occurred during the evolution of this group, explaining a large amount of the incongruence in the data. Our findings show that an integrative approach to gene tree incongruence and its reconciliation is needed. Our framework will help in systematically analyzing genomic data for the occurrence of hybridization and elucidating its evolutionary role. [Coalescent history; incomplete lineage sorting; hybridization; phylogenetic network.]. © 2011 The Author(s)."



Leo van Iersel,
Charles Semple and
Mike Steel. Quantifying the Extent of Lateral Gene Transfer Required to Avert a 'Genome of Eden'. In BMB, Vol. 72:1783–1798, 2010. Note: http://www.win.tue.nl/~liersel/LGT.pdf.
"The complex pattern of presence and absence of many genes across different species provides tantalising clues as to how genes evolved through the processes of gene genesis, gene loss, and lateral gene transfer (LGT). The extent of LGT, particularly in prokaryotes, and its implications for creating a 'network of life' rather than a 'tree of life' is controversial. In this paper, we formally model the problem of quantifying LGT, and provide exact mathematical bounds, and new computational results. In particular, we investigate the computational complexity of quantifying the extent of LGT under the simple models of gene genesis, loss, and transfer on which a recent heuristic analysis of biological data relied. Our approach takes advantage of a relationship between LGT optimization and graph-theoretical concepts such as tree width and network flow. © 2010 Society for Mathematical Biology."



Simone Linz,
Charles Semple and
Tanja Stadler. Analyzing and reconstructing reticulation networks under timing constraints. In JOMB, Vol. 61(5):715-737, 2010. Keywords: explicit network, from rooted trees, hybridization, lateral gene transfer, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://dx.doi.org/10.1007/s002850090319y.
"Reticulation networks are now frequently used to model the history of life for various groups of species whose evolutionary past is likely to include reticulation events such as horizontal gene transfer or hybridization. However, the reconstructed networks are rarely guaranteed to be temporal. If a reticulation network is temporal, then it satisfies the two biologically motivated timing constraints of instantaneously occurring reticulation events and successively occurring speciation events. On the other hand, if a reticulation network is not temporal, it is always possible to make it temporal by adding a number of additional unsampled or extinct taxa. In the first half of the paper, we show that deciding whether a given number of additional taxa is sufficient to transform a nontemporal reticulation network into a temporal one is an NP-complete problem. As one is often given a set of gene trees instead of a network in the context of hybridization, this motivates the second half of the paper which provides an algorithm, called TemporalHybrid, for reconstructing a temporal hybridization network that simultaneously explains the ancestral history of two trees or indicates that no such network exists. We further derive two methods to decide whether or not a temporal hybridization network exists for two given trees and illustrate one of the methods on a grass data set. © 2009 The Author(s)."





Leo van Iersel,
Charles Semple and
Mike Steel. Locating a tree in a phylogenetic network. In IPL, Vol. 110(23), 2010. Keywords: cluster containment, explicit network, from network, level k phylogenetic network, normal network, NP complete, phylogenetic network, polynomial, regular network, time consistent network, tree containment, tree sibling network, tree-child network. Note: http://arxiv.org/abs/1006.3122.
"Phylogenetic trees and networks are leaf-labelled graphs that are used to describe evolutionary histories of species. The Tree Containment problem asks whether a given phylogenetic tree is embedded in a given phylogenetic network. Given a phylogenetic network and a cluster of species, the Cluster Containment problem asks whether the given cluster is a cluster of some phylogenetic tree embedded in the network. Both problems are known to be NP-complete in general. In this article, we consider the restriction of these problems to several well-studied classes of phylogenetic networks. We show that Tree Containment is polynomial-time solvable for normal networks, for binary tree-child networks, and for level-k networks. On the other hand, we show that, even for tree-sibling, time-consistent, regular networks, both Tree Containment and Cluster Containment remain NP-complete. © 2010 Elsevier B.V. All rights reserved."



Sagi Snir and
Edward Trifonov. A Novel Technique for Detecting Putative Horizontal Gene Transfer in the Sequence Space. In JCB, Vol. 17(11):1535-1548, 2010. Keywords: from sequences, phylogenetic network, phylogeny, reconstruction. Note: http://research.haifa.ac.il/~ssagi/published%20papers/JCBHGT.pdf.
"Horizontal transfer (HT) is the event of a DNA sequence being transferred between species not by inheritance. This phenomenon violates the treelike evolution of the species under study turning the trees into networks. At the sequence level, HT offers basic characteristics that enable not only clear identification and distinguishing from other sequence similarity cases but also the possibility of dating the events. We developed a novel, self-contained technique to identify relatively recent horizontal transfer elements (HTEs) in the sequences. Appropriate formalism allows one to obtain confidence values for the events detected. The technique does not rely on such problematic prerequisites as reliable phylogeny and/or statistically justified pairwise sequence alignment. In conjunction with the unique properties of HT, it gives rise to a two-level sequence similarity algorithm that, to the best of our knowledge, has not been explored. From evolutionary perspective, the novelty of the work is in the combination of small scale and large scale mutational events. The technique is employed on both simulated and real biological data. The simulation results show high capability of discriminating between HT and conserved regions. On the biological data, the method detected documented HTEs along with their exact locations in the recipient genomes. Supplementary Material is available online at www.libertonline.com/cmb. Copyright 2010, Mary Ann Liebert, Inc."





Sagi Snir and
Tamir Tuller. The NET-HMM approach: Phylogenetic Network Inference by Combining Maximum Likelihood and Hidden Markov Models. In JBCB, Vol. 7(4):625-644, 2009. Keywords: explicit network, from sequences, HMM, lateral gene transfer, likelihood, phylogenetic network, phylogeny, statistical model. Note: http://research.haifa.ac.il/~ssagi/published%20papers/SnirNETHMMJBCB2009.pdf.
"Horizontal gene transfer (HGT) is the event of transferring genetic material from one lineage in the evolutionary tree to a different lineage. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Although the prevailing assumption is of complete HGT, cases of partial HGT (which are also named chimeric HGT) where only part of a gene is horizontally transferred, have also been reported, albeit less frequently. In this work we suggest a new probabilistic model, the NET-HMM, for analyzing and modeling phylogenetic networks. This new model captures the biologically realistic assumption that neighboring sites of DNA or amino acid sequences are not independent, which increases the accuracy of the inference. The model describes the phylogenetic network as a Hidden Markov Model (HMM), where each hidden state is related to one of the network's trees. One of the advantages of the NET-HMM is its ability to infer partial HGT as well as complete HGT. We describe the properties of the NET-HMM, devise efficient algorithms for solving a set of problems related to it, and implement them in software. We also provide a novel complementary significance test for evaluating the fitness of a model (NET-HMM) to a given dataset. Using NET-HMM, we are able to answer interesting biological questions, such as inferring the length of partial HGT's and the affected nucleotides in the genomic sequences, as well as inferring the exact location of HGT events along the tree branches. These advantages are demonstrated through the analysis of synthetical inputs and three different biological inputs. © 2009 Imperial College Press."



Stefan Grünewald,
Katharina Huber,
Vincent Moulton,
Charles Semple and
Andreas Spillner. Characterizing weak compatibility in terms of weighted quartets. In Advances in Applied Mathematics, Vol. 42(3):329-341, 2009. Keywords: abstract network, characterization, from quartets, split network, weak hierarchy. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/GHMSS08.pdf, slides at http://www.lirmm.fr/miep08/slides/12_02_huber.pdf.







Stefan Grünewald,
Jacobus Koolen and
Woo-Sun Lee. Quartets in maximal weakly compatible split systems. In Applied Mathematics Letters, Vol. 22(6):1604-1608, 2009. Note: http://dx.doi.org/10.1016/j.aml.2009.05.006.
"Weakly compatible split systems are a generalization of unrooted evolutionary trees and are commonly used to display reticulate evolution or ambiguity in biological data. They are collections of bipartitions of a finite set X of taxa (e.g. species) with the property that, for every four taxa, at least one of the three bipartitions into two pairs (quartets) is not induced by any of the X-splits. We characterize all split systems where exactly two quartets from every quadruple are induced by some split. On the other hand, we construct maximal weakly compatible split systems where the number of induced quartets per quadruple tends to 0 with the number of taxa going to infinity. © 2009."



Andreas W. M. Dress,
Katharina Huber,
Jacobus Koolen and
Vincent Moulton. Compatible decompositions and block realizations of finite metrics. In EJC, Vol. 29(7):1617-1633, 2008. Keywords: abstract network, block realization, from distances, phylogenetic network, phylogeny, realization, reconstruction. Note: http://www.ims.nus.edu.sg/preprints/200721.pdf.
"Given a metric D defined on a finite set X, we define a finite collection D of metrics on X to be a compatible decomposition of D if any two distinct metrics in D are linearly independent (considered as vectors in R^(X × X)), D = ∑_{d ∈ D} d holds, and there exist points x, x′ ∈ X for any two distinct metrics d, d′ in D such that d(x, y) d′(x′, y) = 0 holds for every y ∈ X. In this paper, we show that such decompositions are in one-to-one correspondence with (isomorphism classes of) block realizations of D, that is, graph realizations G of D for which G is a block graph and for which every vertex in G not labelled by X has degree at least 3 and is a cut point of G. This generalizes a fundamental result in phylogenetic combinatorics that states that a metric D defined on X can be realized by a tree if and only if there exists a compatible decomposition D of D such that all metrics d ∈ D are split metrics, and lays the foundation for a more general theory of metric decompositions that will be explored in future papers. © 2007 Elsevier Ltd. All rights reserved."



James B. Whitfield,
Sydney A. Cameron,
Daniel H. Huson and
Mike Steel. Filtered Z-Closure Supernetworks for Extracting and Visualizing Recurrent Signal from Incongruent Gene Trees. In Systematic Biology, Vol. 57(6):939-947, 2008. Keywords: abstract network, from unrooted trees, phylogenetic network, phylogeny, Program SplitsTree, split, split network, supernetwork. Note: http://www.life.uiuc.edu/scameron/pdfs/Filtered%20Zclosure%20SystBiol.pdf.



Barbara R. Holland,
Steffi Benthin,
Peter J. Lockhart,
Vincent Moulton and
Katharina Huber. Using supernetworks to distinguish hybridization from lineage-sorting. In BMCEB, Vol. 8(202), 2008. Keywords: explicit network, from unrooted trees, hybridization, lineage sorting, phylogenetic network, phylogeny, reconstruction, supernetwork. Note: http://dx.doi.org/10.1186/147121488202.
"Background. A simple and widely used approach for detecting hybridization in phylogenies is to reconstruct gene trees from independent gene loci, and to look for gene tree incongruence. However, this approach may be confounded by factors such as poor taxon-sampling and/or incomplete lineage-sorting. Results. Using coalescent simulations, we investigated the potential of supernetwork methods to differentiate between gene tree incongruence arising from taxon sampling and incomplete lineage-sorting as opposed to hybridization. For few hybridization events, a large number of independent loci, and well-sampled taxa across these loci, we found that it was possible to distinguish incomplete lineage-sorting from hybridization using the filtered Z-closure and Q-imputation supernetwork methods. Moreover, we found that the choice of supernetwork method was less important than the choice of filtering, and that count-based filtering was the most effective filtering technique. Conclusion. Filtered supernetworks provide a tool for detecting and identifying hybridization events in phylogenies, a tool that should become increasingly useful in light of current genome sequencing initiatives and the ease with which large numbers of independent gene loci can be determined using new generation sequencing technologies. © 2008 Holland et al; licensee BioMed Central Ltd."



Magnus Bordewich,
Simone Linz,
Katherine St. John and
Charles Semple. A reduction algorithm for computing the hybridization number of two trees. In EBIO, Vol. 3:86-98, 2007. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNumber. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BLSS07.pdf.



Magnus Bordewich and
Charles Semple. Computing the minimum number of hybridization events for a consistent evolutionary history. In DAM, Vol. 155:914-918, 2007. Keywords: agreement forest, approximation, APX hard, explicit network, from rooted trees, hybridization, inapproximability, NP complete, phylogenetic network, phylogeny, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS06a.pdf.





Dan Gusfield,
Dean Hickerson and
Satish Eddhu. An efficiently computed lower bound on the number of recombinations in phylogenetic networks: Theory and empirical study. In DAM, Vol. 155(6-7):806-830, 2007. Note: http://wwwcsif.cs.ucdavis.edu/~gusfield/cclowerbound.pdf.
"Phylogenetic networks are models of sequence evolution that go beyond trees, allowing biological operations that are not treelike. One of the most important biological operations is recombination between two sequences. An established problem [J. Hein, Reconstructing evolution of sequences subject to recombination using parsimony, Math. Biosci. 98 (1990) 185-200; J. Hein, A heuristic method to reconstruct the history of sequences subject to recombination, J. Molecular Evolution 36 (1993) 396-405; Y. Song, J. Hein, Parsimonious reconstruction of sequence evolution and haplotype blocks: finding the minimum number of recombination events, in: Proceedings of 2003 Workshop on Algorithms in Bioinformatics, Berlin, Germany, 2003, Lecture Notes in Computer Science, Springer, Berlin; Y. Song, J. Hein, On the minimum number of recombination events in the evolutionary history of DNA sequences, J. Math. Biol. 48 (2003) 160-186; L. Wang, K. Zhang, L. Zhang, Perfect phylogenetic networks with recombination, J. Comput. Biol. 8 (2001) 69-78; S.R. Myers, R.C. Griffiths, Bounds on the minimum number of recombination events in a sample history, Genetics 163 (2003) 375-394; V. Bafna, V. Bansal, Improved recombination lower bounds for haplotype data, in: Proceedings of RECOMB, 2005; Y. Song, Y. Wu, D. Gusfield, Efficient computation of close lower and upper bounds on the minimum number of needed recombinations in the evolution of biological sequences, Bioinformatics 21 (2005) i413-i422. Bioinformatics (Suppl. 1), Proceedings of ISMB, 2005, D. Gusfield, S. Eddhu, C. Langley, Optimal, efficient reconstruction of phylogenetic networks with constrained recombination, J. Bioinform. Comput. Biol. 2(1) (2004) 173-213; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination, J. Comput. Systems Sci. 70 (2005) 381-398] is to find a phylogenetic network that derives an input set of sequences, minimizing the number of recombinations used. No efficient, general algorithm is known for this problem. Several papers consider the problem of computing a lower bound on the number of recombinations needed. In this paper we establish a new, efficiently computed lower bound. This result is useful in methods to estimate the number of needed recombinations, and also to prove the optimality of algorithms for constructing phylogenetic networks under certain conditions [D. Gusfield, S. Eddhu, C. Langley, Optimal, efficient reconstruction of phylogenetic networks with constrained recombination, J. Bioinform. Comput. Biol. 2(1) (2004) 173-213; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination, J. Comput. Systems Sci. 70 (2005) 381-398; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained recombination, Technical Report, Department of Computer Science, University of California, Davis, CA, 2004]. The lower bound is based on a structural, combinatorial insight, using only the site conflicts and incompatibilities, and hence it is fundamental and applicable to many biological phenomena other than recombination, for example, when gene conversions or recurrent or back mutations or cross-species hybridizations cause the phylogenetic history to deviate from a tree structure. In addition to establishing the bound, we examine its use in more complex lower bound methods, and compare the bounds obtained to those obtained by other established lower bound methods. © 2006 Elsevier B.V. All rights reserved."



Barbara R. Holland,
Glenn Conner,
Katharina Huber and
Vincent Moulton. Imputing Supertrees and Supernetworks from Quartets. In Systematic Biology, Vol. 56(1):57-67, 2007. Keywords: abstract network, from unrooted trees, phylogenetic network, phylogeny, Program Quartet, reconstruction, split network, supernetwork. Note: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.99.3215.
"Inferring species phylogenies is an important part of understanding molecular evolution. Even so, it is well known that an accurate phylogenetic tree reconstruction for a single gene does not always necessarily correspond to the species phylogeny. One commonly accepted strategy to cope with this problem is to sequence many genes; the way in which to analyze the resulting collection of genes is somewhat more contentious. Supermatrix and supertree methods can be used, although these can suppress conflicts arising from true differences in the gene trees caused by processes such as lineage sorting, horizontal gene transfer, or gene duplication and loss. In 2004, Huson et al. (IEEE/ACM Trans. Comput. Biol. Bioinformatics 1:151-158) presented the Z-closure method that can circumvent this problem by generating a supernetwork as opposed to a supertree. Here we present an alternative way for generating supernetworks called Q-imputation. In particular, we describe a method that uses quartet information to add missing taxa into gene trees. The resulting trees are subsequently used to generate consensus networks, networks that generalize strict and majority-rule consensus trees. Through simulations and application to real data sets, we compare Q-imputation to the matrix representation with parsimony (MRP) supertree method and Z-closure, and demonstrate that it provides a useful complementary tool. Copyright © Society of Systematic Biologists."





Cam Thach Nguyen,
Nguyen Bao Nguyen,
Wing-Kin Sung and
Louxin Zhang. Reconstructing Recombination Network from Sequence Data: The Small Parsimony Problem. In TCBB, Vol. 4(3):394-402, 2007. Keywords: explicit network, from sequences, labeling, NP complete, parsimony, phylogenetic network, phylogeny. Note: http://www.cs.washington.edu/homes/ncthach/Papers/TCBB2007.pdf.



David Bryant,
Vincent Moulton and
Andreas Spillner. Consistency of the Neighbor-Net Algorithm. In AMB, Vol. 2(8), 2007. Keywords: abstract network, consistency, from distances, NeighborNet. Note: http://dx.doi.org/10.1186/1748718828.
"Background: Neighbor-Net is a novel method for phylogenetic analysis that is currently being widely used in areas such as virology, bacteriology, and plant evolution. Given an input distance matrix, Neighbor-Net produces a phylogenetic network, a generalization of an evolutionary or phylogenetic tree which allows the graphical representation of conflicting phylogenetic signals. Results: In general, any network construction method should not depict more conflict than is found in the data, and, when the data is fitted well by a tree, the method should return a network that is close to this tree. In this paper we provide a formal proof that Neighbor-Net satisfies both of these requirements so that, in particular, Neighbor-Net is statistically consistent on circular distances. © 2007 Bryant et al; licensee BioMed Central Ltd."



Hans-Jürgen Bandelt and
Arne Dür. Translating DNA data tables into quasi-median networks for parsimony analysis and error detection. In MPE, Vol. 42(1):256-271, 2007. Keywords: abstract network, from sequences, parsimony, phylogenetic network, phylogeny, quasi-median network, reconstruction. Note: http://dx.doi.org/10.1016/j.ympev.2006.07.013.
"Every DNA data table can be turned into a quasi-median network that faithfully represents the data. We show that for (weighted) condensed data tables the associated network harbors all most parsimonious reconstructions for any tree that connects the sampled haplotypes. Structural features of this network can be computed directly from the data table. The key principle repeatedly used is that the quasi-median network is uniquely determined by the subtables for pairs of characters. The translation of a table into a network enhances the understanding of the properties of the data in regard to homoplasy and potential artifacts. The total number of nodes of such a network measures the complexity of the data. In particular, networks that display the results of filter analyses by which hotspot mutations are removed help to detect data idiosyncrasies and thus pinpoint sequencing problems. A pertinent example drawn from human mtDNA illustrates these points. © 2006 Elsevier Inc. All rights reserved."



Mihaela Baroni,
Charles Semple and
Mike Steel. Hybrids in Real Time. In Systematic Biology, Vol. 55(1):46-56, 2006. Keywords: agreement forest, from rooted trees, phylogenetic network, phylogeny, polynomial, reconstruction, time consistent network. Note: http://www.math.canterbury.ac.nz/~m.steel/Non_UC/files/research/hybrids.pdf.
"We describe some new and recent results that allow for the analysis and representation of reticulate evolution by non-tree networks. In particular, we (1) present a simple result to show that, despite the presence of reticulation, there is always a well-defined underlying tree that corresponds to those parts of life that do not have a history of reticulation; (2) describe and apply new theory for determining the smallest number of hybridization events required to explain conflicting gene trees; and (3) present a new algorithm to determine whether an arbitrary rooted network can be realized by contemporaneous reticulation events. We illustrate these results with examples. Copyright © Society of Systematic Biologists."



Daniel H. Huson and
David Bryant. Application of Phylogenetic Networks in Evolutionary Studies. In MBE, Vol. 23(2):254-267, 2006. Keywords: abstract network, phylogenetic network, phylogeny, Program SplitsTree, software, survey. Note: http://dx.doi.org/10.1093/molbev/msj030, software available from www.splitstree.org.
"The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a treelike manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees. © The Author 2005. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved."



Jesper Jansson and
Wing-Kin Sung. Inferring a level-1 phylogenetic network from a dense set of rooted triplets. In TCS, Vol. 363(1):60-68, 2006. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.df.lth.se/~jj/Publications/ipnrt8_TCS2006.pdf.
"We consider the following problem: Given a set T of rooted triplets with leaf set L, determine whether there exists a phylogenetic network consistent with T, and if so, construct one. We show that if no restrictions are placed on the hybrid nodes in the solution, the problem is trivially solved in polynomial time by a simple sorting network-based construction. For the more interesting (and biologically more motivated) case where the solution is required to be a level-1 phylogenetic network, we present an algorithm solving the problem in O(|T|^2) time when T is dense, i.e., when T contains at least one rooted triplet for each cardinality three subset of L. We also give an O(|T|^(5/3))-time algorithm for finding the set of all phylogenetic networks having a single hybrid node attached to exactly one leaf (and having no other hybrid nodes) that are consistent with a given dense set of rooted triplets. © 2006 Elsevier B.V. All rights reserved."



Jesper Jansson,
Nguyen Bao Nguyen and
Wing-Kin Sung. Algorithms for Combining Rooted Triplets into a Galled Phylogenetic Network. In SICOMP, Vol. 35(5):1098-1121, 2006. Keywords: approximation, explicit network, from triplets, galled tree, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.df.lth.se/~jj/Publications/triplets_to_gn7_SICOMP2006.pdf.
"This paper considers the problem of determining whether a given set Τ of rooted triplets can be merged without conflicts into a galled phylogenetic network and, if so, constructing such a network. When the input Τ is dense, we solve the problem in O(|Τ|) time, which is optimal since the size of the input is Θ(|Τ|). In comparison, the previously fastest algorithm for this problem runs in O(|Τ|^2) time. We also develop an optimal O(|Τ|)-time algorithm for enumerating all simple phylogenetic networks leaf-labeled by L that are consistent with Τ, where L is the set of leaf labels in Τ, which is used by our main algorithm. Next, we prove that the problem becomes NP-hard if extended to non-dense inputs, even for the special case of simple phylogenetic networks. We also show that for every positive integer n, there exists some set Τ of rooted triplets on n leaves such that any galled network can be consistent with at most 0.4883 · |Τ| of the rooted triplets in Τ. On the other hand, we provide a polynomial-time approximation algorithm that always outputs a galled network consistent with at least a factor of 5/12 (> 0.4166) of the rooted triplets in Τ. © 2006 Society for Industrial and Applied Mathematics."



Guohua Jin,
Luay Nakhleh,
Sagi Snir and
Tamir Tuller. Maximum Likelihood of Phylogenetic Networks. In BIO, Vol. 22(21):2604-2611, 2006. Keywords: explicit network, likelihood, phylogenetic network, phylogeny, Program Nepal, reconstruction. Note: http://www.cs.rice.edu/~nakhleh/Papers/NetworksML06.pdf, supplementary material: http://www.cs.rice.edu/~nakhleh/Papers/SuppML.pdf.





Robert G. Beiko and
Nicholas Hamilton. Phylogenetic identification of lateral genetic transfer events. In BMCEB, Vol. 6(15), 2006. Keywords: evaluation, from rooted trees, from unrooted trees, lateral gene transfer, Program EEEP, Program HorizStory, Program LatTrans, reconstruction, software, SPR distance. Note: http://dx.doi.org/10.1186/14712148615.
"Background: Lateral genetic transfer can lead to disagreements among phylogenetic trees comprising sequences from the same set of taxa. Where topological discordance is thought to have arisen through genetic transfer events, tree comparisons can be used to identify the lineages that may have shared genetic information. An 'edit path' of one or more transfer events can be represented with a series of subtree prune and regraft (SPR) operations, but finding the optimal such set of operations is NP-hard for comparisons between rooted trees, and may be so for unrooted trees as well. Results: Efficient Evaluation of Edit Paths (EEEP) is a new tree comparison algorithm that uses evolutionarily reasonable constraints to identify and eliminate many unproductive search avenues, reducing the time required to solve many edit path problems. The performance of EEEP compares favourably to that of other algorithms when applied to strictly bifurcating trees with specified numbers of SPR operations. We also used EEEP to recover edit paths from over 19 000 unrooted, incompletely resolved protein trees containing up to 144 taxa as part of a large phylogenomic study. While inferred protein trees were far more similar to a reference supertree than random trees were to each other, the phylogenetic distance spanned by random versus inferred transfer events was similar, suggesting that real transfer events occur most frequently between closely related organisms, but can span large phylogenetic distances as well. While most of the protein trees examined here were very similar to the reference supertree, requiring zero or one edit operations for reconciliation, some trees implied up to 40 transfer events within a single orthologous set of proteins. Conclusion: Since sequence trees typically have no implied root and may contain unresolved or multifurcating nodes, the strategy implemented in EEEP is the most appropriate for phylogenomic analyses. The high degree of consistency among inferred protein trees shows that vertical inheritance is the dominant pattern of evolution, at least for the set of organisms considered here. However, the edit paths inferred using EEEP suggest an important role for genetic transfer in the evolution of microbial genomes as well. © 2006 Beiko and Hamilton; licensee BioMed Central Ltd."





Mihaela Baroni and
Mike Steel. Accumulation Phylogenies. In ACOM, Vol. 10(1):19-30, 2006. Keywords: abstract network, from clusters, from distances, phylogenetic network, phylogeny, polynomial, reconstruction, regular network. Note: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.137.1960.
"We investigate the computational complexity of a new combinatorial problem of inferring a smallest possible multilabeled phylogenetic tree (MUL tree) which is consistent with each of the rooted triplets in a given set. We prove that even the restricted case of determining if there exists a MUL tree consistent with the input and having just one leaf duplication is NP-hard. Furthermore, we show that the general minimization problem is NP-hard to approximate within a ratio of n^(1-ε) for any constant 0<ε≤1, where n denotes the number of distinct leaf labels in the input set, although a simple polynomial-time approximation algorithm achieves the approximation ratio n. We also provide an exact algorithm for the problem running in O*(7^n) time and O*(3^n) space. © 2009 Springer-Verlag Berlin Heidelberg."



Mihaela Baroni,
Stefan Grünewald,
Vincent Moulton and
Charles Semple. Bounding the number of hybridization events for a consistent evolutionary history. In JOMB, Vol. 51(2):171-182, 2005. Keywords: agreement forest, bound, explicit network, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BGMS05.pdf.
"Evolutionary processes such as hybridisation, lateral gene transfer, and recombination are all key factors in shaping the structure of genes and genomes. However, since such processes are not always best represented by trees, there is now considerable interest in using more general networks instead. For example, in recent studies it has been shown that networks can be used to provide lower bounds on the number of recombination events and also for the number of lateral gene transfers that took place in the evolutionary history of a set of molecular sequences. In this paper we describe the theoretical performance of some related bounds that result when merging pairs of trees into networks. © Springer-Verlag 2005."



Magnus Bordewich and
Charles Semple. On the computational complexity of the rooted subtree prune and regraft distance. In ACOM, Vol. 8:409-423, 2005. Keywords: agreement forest, from rooted trees, NP complete, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS04.pdf.
"The graph-theoretic operation of rooted subtree prune and regraft is increasingly being used as a tool for understanding and modelling reticulation events in evolutionary biology. In this paper, we show that computing the rooted subtree prune and regraft distance between two rooted binary phylogenetic trees on the same label set is NP-hard. This resolves a longstanding open problem. Furthermore, we show that this distance is fixed parameter tractable when parameterised by the distance between the two trees."



Barbara R. Holland,
Frédéric Delsuc and
Vincent Moulton. Visualizing Conflicting Evolutionary Hypotheses in Large Collections of Trees: Using Consensus Networks to Study the Origins of Placentals and Hexapods. In Systematic Biology, Vol. 54(1):66-76, 2005. Keywords: consensus. Note: http://halsde.archivesouvertes.fr/halsde00193050/fr/.
"Many phylogenetic methods produce large collections of trees as opposed to a single tree, which allows the exploration of support for various evolutionary hypotheses. However, to be useful, the information contained in large collections of trees should be summarized; frequently this is achieved by constructing a consensus tree. Consensus trees display only those signals that are present in a large proportion of the trees. However, by their very nature consensus trees require that any conflicts between the trees are necessarily disregarded. We present a method that extends the notion of consensus trees to allow the visualization of conflicting hypotheses in a consensus network. We demonstrate the utility of this method in highlighting differences amongst maximum likelihood bootstrap values and Bayesian posterior probabilities in the placental mammal phylogeny, and also in comparing the phylogenetic signal contained in amino acid versus nucleotide characters for hexapod monophyly. Copyright © Society of Systematic Biologists."



Martyn Kennedy,
Barbara R. Holland,
Russel D. Gray and
Hamish G. Spencer. Untangling Long Branches: Identifying Conflicting Phylogenetic Signals Using Spectral Analysis, NeighborNet, and Consensus Networks. In Systematic Biology, Vol. 54(4):620-633, 2005. Keywords: abstract network, consensus, NeighborNet, phylogenetic network, phylogeny. Note: http://awcmee.massey.ac.nz/people/bholland/pdf/Kennedy_etal_2005.pdf.



Richard C. Winkworth,
David Bryant,
Peter J. Lockhart,
David Havell and
Vincent Moulton. Biogeographic Interpretation of Splits Graphs: Least Squares Optimization of Branch Lengths. In Systematic Biology, Vol. 54(1):56-65, 2005. Keywords: abstract network, from distances, from network, phylogenetic network, phylogeny, reconstruction, split, split network. Note: http://www.math.auckland.ac.nz/~bryant/Papers/05Biogeographic.pdf.





David Bryant and
Vincent Moulton. NeighborNet: An Agglomerative Method for the Construction of Phylogenetic Networks. In MBE, Vol. 21(2):255-265, 2004. Keywords: phylogenetic network, phylogeny, Program SplitsTree, reconstruction, split network. Note: http://www.math.auckland.ac.nz/~bryant/Papers/04NeighborNet.pdf.
"We present NeighborNet, a distance based method for constructing phylogenetic networks that is based on the Neighbor-Joining (NJ) algorithm of Saitou and Nei. NeighborNet provides a snapshot of the data that can guide more detailed analysis. Unlike split decomposition, NeighborNet scales well and can quickly produce detailed and informative networks for several hundred taxa. We illustrate the method by reanalyzing three published data sets: a collection of 110 highly recombinant Salmonella multilocus sequence typing sequences, the 135 "African Eve" human mitochondrial sequences published by Vigilant et al., and a collection of 12 Archaeal chaperonin sequences demonstrating strong evidence for gene conversion. NeighborNet is available as part of the SplitsTree4 software package."



Mihaela Baroni,
Charles Semple and
Mike Steel. A framework for representing reticulate evolution. In ACOM, Vol. 8:398-401, 2004. Keywords: explicit network, from clusters, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, regular network, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSS04.pdf.
"Acyclic directed graphs (ADGs) are increasingly being viewed as more appropriate for representing certain evolutionary relationships, particularly in biology, than rooted trees. In this paper, we develop a framework for the analysis of these graphs which we call hybrid phylogenies. We are particularly interested in the problem whereby one is given a set of phylogenetic trees and wishes to determine a hybrid phylogeny that 'embeds' each of these trees and which requires the smallest number of hybridisation events. We show that this quantity can be greatly reduced if additional species are involved, and investigate other combinatorial aspects of this and related questions."



Daniel H. Huson,
Tobias Dezulian,
Tobias Kloepper and
Mike Steel. Phylogenetic SuperNetworks from Partial Trees. In TCBB, Vol. 1(4):151-158, 2004. Keywords: abstract network, from unrooted trees, phylogenetic network, phylogeny, Program SplitsTree, reconstruction, supernetwork. Note: http://hdl.handle.net/10092/3177.
"In practice, one is often faced with incomplete phylogenetic data, such as a collection of partial trees or partial splits. This paper poses the problem of inferring a phylogenetic supernetwork from such data and provides an efficient algorithm for doing so, called the Z-closure method. Additionally, the questions of assigning lengths to the edges of the network and how to restrict the "dimensionality" of the network are addressed. Applications to a set of five published partial gene trees relating different fungal species and to six published partial gene trees relating different grasses illustrate the usefulness of the method and an experimental study confirms its potential. The method is implemented as a plugin for the program SplitsTree4. © 2004 IEEE."





Katharina Huber,
Michael Langton,
David Penny,
Vincent Moulton and
Mike Hendy. Spectronet: A package for computing spectra and median networks. In ABIO, Vol. 1(3):159-161, 2004. Keywords: from splits, median network, phylogenetic network, phylogeny, Program Spectronet, software, split, visualization. Note: http://citeseer.ist.psu.edu/631776.html.
Spectronet is a package that uses various methods for exploring and visualising complex evolutionary signals. Given an alignment in NEXUS format, the package works by computing a collection of weighted splits or bipartitions of the taxa and then allows the user to interactively analyse the resulting collection using tools such as Lentoplots and median networks. The package is highly interactive and available for PCs.



Katharina Huber,
Vincent Moulton and
Charles Semple. Replacing cliques by stars in quasimedian graphs. In DAM, Vol. 143(1-3), 2004. Note: http://dx.doi.org/10.1016/j.dam.2004.03.002.
"For a multiset Σ of splits (bipartitions) of a finite set X, we introduce the multisplit graph G(Σ). This graph is a natural extension of the Buneman graph. Indeed, it is shown that several results pertaining to the Buneman graph extend to the multisplit graph. In addition, in case Σ is derived from a set ℛ of partitions of X by taking parts together with their complements, we show that the extremal instances where ℛ is either strongly compatible or strongly incompatible are equivalent to G(Σ) being either a tree or a Cartesian product of star trees, respectively. © 2004 Elsevier B.V. All rights reserved."



Hans-Jürgen Bandelt,
Vincent Macaulay and
Martin Richards. Median networks: speedy construction and greedy reduction, one simulation, and two case studies from human mtDNA. In MPE, Vol. 16:8-28, 2000. Keywords: from sequences, from splits, median network, phylogenetic network, phylogeny, reconstruction. Note: http://www.stats.gla.ac.uk/~vincent/papers/speedy.pdf.
"Molecular data sets characterized by few phylogenetically informative characters with a broad spectrum of mutation rates, such as intraspecific control-region sequence variation of human mitochondrial DNA (mtDNA), can be usefully visualized in the form of median networks. Here we provide a step-by-step guide to the construction of such networks by hand. We improve upon a previously implemented algorithm by outlining an efficient parametrized strategy amenable to large data sets, greedy reduction, which makes it possible to reconstruct some of the confounding recurrent mutations. This entails some postprocessing as well, which assists in capturing more parsimonious solutions. To simplify the creation of the resulting network by hand, we describe a speedy approach to network construction, based on a careful planning of the processing order. A coalescent simulation tailored to human mtDNA variation in Eurasia testifies to the usefulness of reduced median networks, while highlighting notorious problems faced by all phylogenetic methods in this context. Finally, we discuss two case studies involving the comparison of characters in the two hypervariable segments of the human mtDNA control region in the light of the worldwide control-region sequence database, as well as additional restriction fragment length polymorphism information. We conclude that only a minority of the mutations that hit the second segment occur at sites that would have a mutation rate comparable to those at most sites in the first segment. Discarding the known 'noisy' sites of the second segment enhances the analysis. © 2000 Academic Press."



Katharina Huber,
Elizabeth E. Watson and
Mike Hendy. An Algorithm for Constructing Local Regions in a Phylogenetic Network. In MPE, Vol. 19(1):1-8, 2000. Keywords: abstract network, median network, phylogenetic network, phylogeny, reconstruction, split. Note: http://dx.doi.org/10.1006/mpev.2000.0891.
"The groupings of taxa in a phylogenetic tree cannot represent all the conflicting signals that usually occur among site patterns in aligned homologous genetic sequences. Hence a tree-building program must compromise by reporting a subset of the patterns, using some discriminatory criterion. Thus, in the worst case, out of possibly a large number of equally good trees, only an arbitrarily chosen tree might be reported by the tree-building program as "The Tree." This tree might then be used as a basis for phylogenetic conclusions. One strategy to represent conflicting patterns in the data is to construct a network. The Buneman graph is a theoretically very attractive example of such a network. In particular, a characterization for when this network will be a tree is known. Also the Buneman graph contains each of the most parsimonious trees indicated by the data. In this paper we describe a new method for constructing the Buneman graph that can be used for a generalization of Hadamard conjugation to networks. This new method differs from previous methods by allowing us to focus on local regions of the graph without having to first construct the full graph. The construction is illustrated by an example. © 2001 Academic Press."













Hans-Jürgen Bandelt and
Andreas W. M. Dress. An order theoretic framework for overlapping clustering. In DM, Vol. 136(1-3):21-37, 1994.
"Cluster analysis deals with procedures which, given a finite collection X of objects together with some kind of local dissimilarity information, identify those subcollections C of objects from X, called clusters, which exhibit a comparatively low degree of internal dissimilarity. In this note we study arbitrary mappings φ which assign to each subcollection A ⊆ X of objects its internal degree of dissimilarity φ(A), subject only to the natural condition that A ⊆ B ⊆ X implies φ(A) ≤ φ(B), and we analyse on a rather abstract, purely order theoretic level how assumptions concerning the way such a mapping φ might be constructed from local data (that is, data involving only a few objects at a time) influence the degree of overlapping observed within the resulting family of clusters, and vice versa. Hence, unlike previous order theoretic approaches to cluster analysis, we do not restrict our attention to nonoverlapping, hierarchical clustering. Instead, we regard a dissimilarity function φ as an arbitrary isotone mapping from a finite partially ordered set I (e.g. the set P(X) of all subsets A of a finite set X) into a (partially) ordered set R (e.g. the nonnegative real numbers), and we study the correspondence between the two subsets C(φ) and D(φ) of I, formed by the elements whose images are inaccessible from above and from below, respectively. While D(φ) constitutes the local data structure from which φ can be built up, C(φ) embodies the family of clusters associated with φ. Our results imply that in case I := P(X) and R := R≥0 one has #D ≤ n for all D ∈ D(φ) and some fixed n ∈ N if and only if {A figure is presented} for all C0, ..., Cn ∈ C(φ) if and only if this holds for all subsets C0, ..., Cn ⊆ X, generalizing a well-known criterion for n-conformity of hypergraphs as well as corresponding results due to Batbedat, dealing with the case n = 2. © 1994."



Hans-Jürgen Bandelt and
Andreas W. M. Dress. A canonical decomposition theory for metrics on a finite set. In Advances in Mathematics, Vol. 92(1):47-105, 1992. Keywords: abstract network, circular split system, from distances, split, split decomposition, split network, weak hierarchy, weakly compatible.
"We consider specific additive decompositions d = d1 + ... + dn of metrics, defined on a finite set X (where a metric may give distance zero to pairs of distinct points). The simplest building stones are the split metrics, associated to splits (i.e., bipartitions) of the given set X. While an additive decomposition of a Hamming metric into split metrics is in no way unique, we achieve uniqueness by restricting ourselves to coherent decompositions, that is, decompositions d = d1 + ... + dn such that for every map f: X → R with f(x) + f(y) ≥ d(x, y) for all x, y ∈ X there exist maps f1, ..., fn: X → R with f = f1 + ... + fn and fi(x) + fi(y) ≥ di(x, y) for all i = 1, ..., n and all x, y ∈ X. These coherent decompositions are closely related to a geometric decomposition of the injective hull of the given metric. A metric with a coherent decomposition into a (weighted) sum of split metrics will be called totally split-decomposable. Tree metrics (and more generally, the sum of two tree metrics) are particular instances of totally split-decomposable metrics. Our main result confirms that every metric admits a coherent decomposition into a totally split-decomposable metric and a split-prime residue, where all the split summands and hence the decomposition can be determined in polynomial time, and that a family of splits can occur this way if and only if it does not induce on any four-point subset all three splits with block size two. © 1992."





Hans-Jürgen Bandelt and
Andreas W. M. Dress. Weak hierarchies associated with similarity measures: an additive clustering technique. In BMB, Vol. 51:113-166, 1989. Keywords: abstract network, clustering, from distances, from trees, phylogenetic network, phylogeny, Program WeakHierarchies, reconstruction, weak hierarchy. Note: http://dx.doi.org/10.1007/BF02458841.
"A new and apparently rather useful and natural concept in cluster analysis is studied: given a similarity measure on a set of objects, a subset is regarded as a cluster if any two objects a, b inside this subset have greater similarity than any third object outside has to at least one of a, b. These clusters then form a closure system which can be described as a hypergraph without triangles. Conversely, given such a system, one may attach some weight to each cluster and then compose a similarity measure additively, by letting the similarity of a pair be the sum of weights of the clusters containing that particular pair. The original clusters can be reconstructed from the obtained similarity measure. This clustering model is thus located between the general additive clustering model of Shepard and Arabie (1979) and the standard hierarchical model. Potential applications include fitting dendrograms with few additional non-nested clusters and simultaneous representation of some families of multiple dendrograms (in particular, two-dendrogram solutions), as well as assisting the search for phylogenetic relationships by proposing a somewhat larger system of possibly relevant "family groups", from which an appropriate choice (based on additional insight or individual preferences) remains to be made. © 1989 Society for Mathematical Biology."


