
Mihaela Baroni,
Stefan Grünewald,
Vincent Moulton and
Charles Semple. Bounding the number of hybridization events for a consistent evolutionary history. In JOMB, Vol. 51(2):171182, 2005. Keywords: agreement forest, bound, explicit network, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BGMS05.pdf.
Toggle abstract
"Evolutionary processes such as hybridisation, lateral gene transfer, and recombination are all key factors in shaping the structure of genes and genomes. However, since such processes are not always best represented by trees, there is now considerable interest in using more general networks instead. For example, in recent studies it has been shown that networks can be used to provide lower bounds on the number of recombination events and also for the number of lateral gene transfers that took place in the evolutionary history of a set of molecular sequences. In this paper we describe the theoretical performance of some related bounds that result when merging pairs of trees into networks. © SpringerVerlag 2005."



Magnus Bordewich,
Simone Linz,
Katherine St. John and
Charles Semple. A reduction algorithm for computing the hybridization number of two trees. In EBIO, Vol. 3:8698, 2007. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNumber. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BLSS07.pdf.



Magnus Bordewich and
Charles Semple. Computing the minimum number of hybridization events for a consistent evolutionary history. In DAM, Vol. 155:914918, 2007. Keywords: agreement forest, approximation, APX hard, explicit network, from rooted trees, hybridization, inapproximability, NP complete, phylogenetic network, phylogeny, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS06a.pdf.





Mihaela Baroni,
Charles Semple and
Mike Steel. A framework for representing reticulate evolution. In ACOM, Vol. 8:398401, 2004. Keywords: explicit network, from clusters, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, regular network, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSS04.pdf.
Toggle abstract
"Acyclic directed graphs (ADGs) are increasingly being viewed as more appropriate for representing certain evolutionary relationships, particularly in biology, than rooted trees. In this paper, we develop a framework for the analysis of these graphs which we call hybrid phylogenies. We are particularly interested in the problem whereby one is given a set of phylogenetic trees and wishes to determine a hybrid phylogeny that 'embeds' each of these trees and which requires the smallest number of hybridisation events. We show that this quantity can be greatly reduced if additional species are involved, and investigate other combinatorial aspects of this and related questions."



Olivier Gauthier and
FrançoisJoseph Lapointe. Hybrids and Phylogenetics Revisited: A Statistical Test of Hybridization Using Quartets. In Systematic Botany, Vol. 32(1):815, 2007. Keywords: explicit network, from quartets, hybridization, phylogenetic network, phylogeny, reconstruction, reticulogram, split decomposition. Note: http://dx.doi.org/10.1600/036364407780360238.
Toggle abstract
"The occurrence of reticulations in the evolutionary history of species poses serious challenges for all modern practitioners of phylogenetic analysis. Such events, including hybridization, introgression, and lateral gene transfer, lead to evolutionary histories that cannot be adequately represented in the form of phylogenetic trees. Although numerous methods that allow for the reconstruction of phylogenetic networks have been proposed in recent years, the detection of reticulations still remains problematic. In this paper we present a Hybrid Detection Criterion (HDC) along with a statistical procedure that allows for the identification of hybrid taxa. The test assesses whether a putative hybrid is consistently intermediate between its postulated parents, with respect to the other taxa. The performance of the statistical method is evaluated using known hybrids of the genus Aphelandra (Acanthaceae) using two network methods: reticulograms and split decomposition graphs. Our results indicate that the HDC test is reliable when used jointly with split decomposition. On the other hand, the test lacks power and provides misleading results when using reticulograms. We then show how the procedure can be used as a tool to identify putative hybrids. © Copyright 2007 by the American Society of Plant Taxonomists."





Monique M. Morin and
Bernard M. E. Moret. NetGen: generating phylogenetic networks with diploid hybrids. In BIO, Vol. 22(15):19211923, 2006. Keywords: generation, hybridization, Program NetGen, software. Note: http://dx.doi.org/10.1093/bioinformatics/btl191.
Toggle abstract
"Summary: NetGen is an eventdriven simulator that creates phylogenetic networks by extending the birthdeath model to include diploid hybridizations. DNA sequences are evolved in conjunction with the topology, enabling hybridization decisions to be based on contemporary evolutionary distances. NetGen supports variable rate lineages, root sequence specification, outgroup generation and many other options. This note describes the NetGen application and proposes an extension of the Newick format to accommodate phylogenetic networks. © 2006 Oxford University Press."









Barbara R. Holland,
Steffi Benthin,
Peter J. Lockhart,
Vincent Moulton and
Katharina Huber. Using supernetworks to distinguish hybridization from lineagesorting. In BMCEB, Vol. 8(202), 2008. Keywords: explicit network, from unrooted trees, hybridization, lineage sorting, phylogenetic network, phylogeny, reconstruction, supernetwork. Note: http://dx.doi.org/10.1186/147121488202.
Toggle abstract
"Background. A simple and widely used approach for detecting hybridization in phylogenies is to reconstruct gene trees from independent gene loci, and to look for gene tree incongruence. However, this approach may be confounded by factors such as poor taxonsampling and/or incomplete lineagesorting. Results. Using coalescent simulations, we investigated the potential of supernetwork methods to differentiate between gene tree incongruence arising from taxon sampling and incomplete lineagesorting as opposed to hybridization. For few hybridization events, a large number of independent loci, and wellsampled taxa across these loci, we found that it was possible to distinguish incomplete lineagesorting from hybridization using the filtered Zclosure and Qimputation supernetwork methods. Moreover, we found that the choice of supernetwork method was less important than the choice of filtering, and that countbased filtering was the most effective filtering technique. Conclusion. Filtered supernetworks provide a tool for detecting and identifying hybridization events in phylogenies, a tool that should become increasingly useful in light of current genome sequencing initiatives and the ease with which large numbers of independent gene loci can be determined using new generation sequencing technologies. © 2008 Holland et al; licensee BioMed Central Ltd."



Josh Voorkamp né Collins,
Simone Linz and
Charles Semple. Quantifying hybridization in realistic time. In JCB, Vol. 18(10):13051318, 2011. Keywords: explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, software. Note: http://wwwcsif.cs.ucdavis.edu/~linzs/CLS10_interleave.pdf, software available at http://www.math.canterbury.ac.nz/~c.semple/software.shtml.
Toggle abstract
"Recently, numerous practical and theoretical studies in evolutionary biology aim at calculating the extent to which reticulationfor example, horizontal gene transfer, hybridization, or recombinationhas influenced the evolution for a set of presentday species. It has been shown that inferring the minimum number of hybridization events that is needed to simultaneously explain the evolutionary history for a set of trees is an NPhard and also fixedparameter tractable problem. In this article, we give a new fixedparameter algorithm for computing the minimum number of hybridization events for when two rooted binary phylogenetic trees are given. This newly developed algorithm is based on interleavinga technique using repeated kernelization steps that are applied throughout the exhaustive search part of a fixedparameter algorithm. To show that our algorithm runs efficiently to be applicable to a wide range of practical problem instances, we apply it to a grass data set and highlight the significant improvements in terms of running times in comparison to an algorithm that has previously been implemented. © 2011, Mary Ann Liebert, Inc."









David A. Morrison. Using datadisplay networks for exploratory data analysis in phylogenetic studies. In MBE, Vol. 27(5):10441057, 2010. Keywords: abstract network, hybridization, NeighborNet, Program SplitsTree, recombination, split decomposition. Note: http://dx.doi.org/10.1093/molbev/msp309.
Toggle abstract
"Exploratory data analysis (EDA) is a frequently undervalued part of data analysis in biology. It involves evaluating the characteristics of the data "before" proceeding to the definitive analysis in relation to the scientific question at hand. For phylogenetic analyses, a useful tool for EDA is a datadisplay network. This type of network is designed to display any character (or tree) conflict in a data set, without prior assumptions about the causes of those conflicts. The conflicts might be caused by 1) methodological issues in data collection or analysis, 2) homoplasy, or 3) horizontal gene flow of some sort. Here, I explore 13 published data sets using splits networks, as examples of using datadisplay networks for EDA. In each case, I performed an original EDA on the data provided, to highlight the aspects of the resulting network that will be important for an interpretation of the phylogeny. In each case, there is at least one important point (possibly missed by the original authors) that might affect the phylogenetic analysis. I conclude that EDA should play a greater role in phylogenetic analyses than it has done. © 2010 The Author. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved."



Simone Linz,
Charles Semple and
Tanja Stadler. Analyzing and reconstructing reticulation networks under timing constraints. In JOMB, Vol. 61(5):715737, 2010. Keywords: explicit network, from rooted trees, hybridization, lateral gene transfer, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://dx.doi.org/10.1007/s002850090319y..
Toggle abstract
"Reticulation networks are now frequently used to model the history of life for various groups of species whose evolutionary past is likely to include reticulation events such as horizontal gene transfer or hybridization. However, the reconstructed networks are rarely guaranteed to be temporal. If a reticulation network is temporal, then it satisfies the two biologically motivated timing constraints of instantaneously occurring reticulation events and successively occurring speciation events. On the other hand, if a reticulation network is not temporal, it is always possible to make it temporal by adding a number of additional unsampled or extinct taxa. In the first half of the paper, we show that deciding whether a given number of additional taxa is sufficient to transform a nontemporal reticulation network into a temporal one is an NPcomplete problem. As one is often given a set of gene trees instead of a network in the context of hybridization, this motivates the second half of the paper which provides an algorithm, called TemporalHybrid, for reconstructing a temporal hybridization network that simultaneously explains the ancestral history of two trees or indicates that no such network exists. We further derive two methods to decide whether or not a temporal hybridization network exists for two given trees and illustrate one of the methods on a grass data set. © 2009 The Author(s)."



Laura S. Kubatko. Identifying Hybridization Events in the Presence of Coalescence via Model Selection. In Systematic Biology, Vol. 58(5):478488, 2009. Keywords: AIC, BIC, branch length, coalescent, explicit network, from rooted trees, from species tree, hybridization, lineage sorting, model selection, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1093/sysbio/syp055.



Tao Sang and
Yang Zhong. Testing Hybridization Hypotheses Based on Incongruent Gene Trees. In Systematic Biology, Vol. 49(3):422434, 2000. Keywords: bootstrap, from rooted trees, hybridization, lateral gene transfer, lineage sorting, phylogenetic network, phylogeny, reconstruction, statistical model. Note: http://dx.doi.org/10.1080/10635159950127321.



Chen Meng and
Laura S. Kubatko. Detecting hybrid speciation in the presence of incomplete lineage sorting using gene tree incongruence: A model. In Theoretical Population Biology, Vol. 75(1):3545, 2009. Keywords: bayesian, coalescent, from network, from rooted trees, hybridization, likelihood, lineage sorting, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1016/j.tpb.2008.10.004.
Toggle abstract
"The application of phylogenetic inference methods, to data for a set of independent genes sampled randomly throughout the genome, often results in substantial incongruence in the singlegene phylogenetic estimates. Among the processes known to produce discord between singlegene phylogenies, two of the best studied in a phylogenetic context are hybridization and incomplete lineage sorting. Much recent attention has focused on the development of methods for estimating species phylogenies in the presence of incomplete lineage sorting, but phylogenetic models that allow for hybridization have been more limited. Here we propose a model that allows incongruence in singlegene phylogenies to be due to both hybridization and incomplete lineage sorting, with the goal of determining the contribution of hybridization to observed gene tree incongruence in the presence of incomplete lineage sorting. Using our model, we propose methods for estimating the extent of the role of hybridization in both a likelihood and a Bayesian framework. The performance of our methods is examined using both simulated and empirical data. © 2008 Elsevier Inc. All rights reserved."



Mark T. Holder,
Jennifer A. Anderson and
Alisha K. Holloway. Difficulties in Detecting Hybridization. In Systematic Biology, Vol. 50(6):978982, 2001. Keywords: bootstrap, from rooted trees, hybridization, lateral gene transfer, lineage sorting, phylogenetic network, phylogeny, reconstruction, statistical model. Note: http://dx.doi.org/10.1080/106351501753462911.
Toggle abstract
[No abstract available]



ZhiZhong Chen and
Lusheng Wang. HybridNET: a tool for constructing hybridization networks. In BIO, Vol. 26(22):29122913, 2010. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNET, software. Note: http://rnc.r.dendai.ac.jp/~chen/papers/note2.pdf.
Toggle abstract
"Motivations: When reticulation events occur, the evolutionary history of a set of existing species can be represented by a hybridization network instead of an evolutionary tree. When studying the evolutionary history of a set of existing species, one can obtain a phylogenetic tree of the set of species with high confidence by looking at a segment of sequences or a set of genes. When looking at another segment of sequences, a different phylogenetic tree can be obtained with high confidence too. This indicates that reticulation events may occur. Thus, we have the following problem: given two rooted phylogenetic trees on a set of species that correctly represent the treelike evolution of different parts of their genomes, what is the hybridization network with the smallest number of reticulation events to explain the evolution of the set of species under consideration? Results: We develop a program, named HybridNet, for constructing a hybridization network with the minimum number of reticulate vertices from two input trees. We first implement the O(3dn)time algorithm by Whidden et al. for computing a maximum (acyclic) agreement forest. Our program can output all the maximum (acyclic) agreement forests. We then augment the program so that it can construct an optimal hybridization network for each given maximum acyclic agreement forest. To our knowledge, this is the first time that optimal hybridization networks can be rapidly constructed. © The Author 2010. Published by Oxford University Press. All rights reserved."



Yun Yu,
Cuong Than,
James H. Degnan and
Luay Nakhleh. Coalescent Histories on Phylogenetic Networks and Detection of Hybridization Despite Incomplete Lineage Sorting. In Systematic Biology, Vol. 60(2):138149, 2011. Keywords: coalescent, hybridization, lineage sorting, reconstruction, statistical model. Note: http://www.cs.rice.edu/~nakhleh/Papers/YuEtAlSB11.pdf.
Toggle abstract
"Analyses of the increasingly available genomic data continue to reveal the extent of hybridization and its role in the evolutionary diversification of various groups of species. We show, through extensive coalescentbased simulations of multilocus data sets on phylogenetic networks, how divergence times before and after hybridization events can result in incomplete lineage sorting with gene tree incongruence signatures identical to those exhibited by hybridization. Evolutionary analysis of such data under the assumption of a species tree model can miss all hybridization events, whereas analysis under the assumption of a species network model would grossly overestimate hybridization events. These issues necessitate a paradigm shift in evolutionary analysis under these scenarios, from a model that assumes a priori a single source of gene tree incongruence to one that integrates multiple sources in a unifying framework. We propose a framework of coalescence within the branches of a phylogenetic network and show how this framework can be used to detect hybridization despite incomplete lineage sorting. We apply the model to simulated data and show that the signature of hybridization can be revealed as long as the interval between the divergence times of the species involved in hybridization is not too small. We reanalyze a data set of 106 loci from 7 ingroup Saccharomyces species for which a species tree with no hybridization has been reported in the literature. Our analysis supports the hypothesis that hybridization occurred during the evolution of this group, explaining a large amount of the incongruence in the data. Our findings show that an integrative approach to gene tree incongruence and its reconciliation is needed. Our framework will help in systematically analyzing genomic data for the occurrence of hybridization and elucidating its evolutionary role. [Coalescent history; incomplete lineage sorting; hybridization; phylogenetic network.]. © 2011 The Author(s)."



Simon Joly,
Patricia A. McLenachan and
Peter J. Lockhart. A Statistical Approach for Distinguishing Hybridization and Incomplete Lineage Sorting. In The American Naturalist, Vol. 174(2):E54E70, 2009. Keywords: hybridization, lineage sorting, phylogenetic network, phylogeny, reconstruction, statistical model. Note: http://www.plantevolution.org/pdf/Joly&al_2009_AmNat.pdf.
Toggle abstract
"The extent and evolutionary significance of hybridization is difficult to evaluate because of the difficulty in distinguishing hybridization from incomplete lineage sorting. Here we present a novel parametric approach for statistically distinguishing hybridization from incomplete lineage sorting based on minimum genetic distances of a nonrecombining locus. It is based on the idea that the expected minimum genetic distance between sequences from two species is smaller for some hybridization events than for incomplete lineage sorting scenarios. When applied to empirical data sets, distributions can be generated for the minimum interspecies distances expected under incomplete lineage sorting using coalescent simulations. If the observed distance between sequences from two species is smaller than its predicted distribution, incomplete lineage sorting can be rejected and hybridization inferred. We demonstrate the power of the method using simulations and illustrate its application on New Zealand alpine buttercups (Ranunculus). The method is robust and complements existing approaches. Thus it should allow biologists to assess with greater accuracy the importance of hybridization in evolution. © 2009 by The University of Chicago."



Simon Joly. JML: Testing hybridization from species trees. In Molecular Ecology Ressources, Vol. 12(1):179184, 2012. Keywords: from species tree, hybridization, lineage sorting, phylogenetic network, phylogeny, Program JML, statistical model. Note: http://www.plantevolution.org/pdf/JMLpaper_accepted.pdf.
Toggle abstract
"I introduce the software jml that tests for the presence of hybridization in multispecies sequence data sets by posterior predictive checking following Joly, McLenachan and Lockhart (2009, American Naturalist e54). Although their method could potentially be applied on any data set, the lack of appropriate software made its application difficult. The software jml thus fills a need for an easy application of the method but also includes improvements such as the possibility to incorporate uncertainty in the species tree topology. The jml software uses a posterior distribution of species trees, population sizes and branch lengths to simulate replicate sequence data sets using the coalescent with no migration. A test quantity, defined as the minimum pairwise sequence distance between sequences of two species, is then evaluated on the simulated data sets and compared to the one estimated from the original data. Because the test quantity is a good predictor of hybridization events, departure from the bifurcating species tree model could be interpreted as evidence of hybridization. Software performance in terms of computing time is evaluated for several parameters. I also show an application example of the software for detecting hybridization among native diploid North American roses. © 2011 Blackwell Publishing Ltd."



Yun Yu,
James H. Degnan and
Luay Nakhleh. The probability of a gene tree topology within a phylogenetic network with applications to hybridization detection. In PLoS Genetics, Vol. 8(4):e1002660, 2012. Keywords: AIC, BIC, explicit network, hybridization, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1371/journal.pgen.1002660.
Toggle abstract
"Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa. © 2012 Yu et al."



Chris Whidden,
Robert G. Beiko and
Norbert Zeh. FixedParameter Algorithms for Maximum Agreement Forests. In SICOMP, Vol. 42(4):14311466, 2013. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: http://arxiv.org/abs/1108.2664, slides.
Toggle abstract
"We present new and improved fixedparameter algorithms for computing maximum agreement forests of pairs of rooted binary phylogenetic trees. The size of such a forest for two trees corresponds to their subtree pruneandregraft distance and, if the agreement forest is acyclic, to their hybridization number. These distance measures are essential tools for understanding reticulate evolution. Our algorithm for computing maximum acyclic agreement forests is the first depthbounded search algorithm for this problem. Our algorithms substantially outperform the best previous algorithms for these problems. © 2013 Society for Industrial and Applied Mathematics."



Peter J. Humphries,
Simone Linz and
Charles Semple. On the complexity of computing the temporal hybridization number for two phylogenies. In DAM, Vol. 161:871880, 2013. Keywords: agreement forest, APX hard, characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/TAFapx.pdf.
Toggle abstract
"Phylogenetic networks are now frequently used to explain the evolutionary history of a set of species for which a collection of gene trees, reconstructed from genetic material of different parts of the species' genomes, reveal inconsistencies. However, in the context of hybridization, the reconstructed networks are often not temporal. If a hybridization network is temporal, then it satisfies the time constraint of instantaneously occurring hybridization events; i.e. all species that are involved in such an event coexist in time. Furthermore, although a collection of phylogenetic trees can often be merged into a hybridization network that is temporal, many algorithms do not necessarily find such a network since their primary optimization objective is to minimize the number of hybridization events. In this paper, we present a characterization for when two rooted binary phylogenetic trees admit a temporal hybridization network. Furthermore, we show that the underlying optimization problem is APXhard and, therefore, NPhard. Thus, unless P=NP, it is unlikely that there are efficient algorithms for either computing an exact solution or approximating it within a ratio arbitrarily close to one. © 2012 Elsevier B.V. All rights reserved."



Leo van Iersel,
Steven Kelk,
Nela Lekic and
Leen Stougie. Approximation algorithms for nonbinary agreement forests. In SIDMA, Vol. 28(1):4966, 2014. Keywords: agreement forest, approximation, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1210.3211.
Toggle abstract
"Given two rooted phylogenetic trees on the same set of taxa X, the Maximum Agreement Forest (maf) problem asks to find a forest that is, in a certain sense, common to both trees and has a minimum number of components. The Maximum Acyclic Agreement Forest (maaf) problem has the additional restriction that the components of the forest cannot have conflicting ancestral relations in the input trees. There has been considerable interest in the special cases of these problems in which the input trees are required to be binary. However, in practice, phylogenetic trees are rarely binary, due to uncertainty about the precise order of speciation events. Here, we show that the general, nonbinary version of maf has a polynomialtime 4approximation and a fixedparameter tractable (exact) algorithm that runs in O(4opoly(n)) time, where n = X and k is the number of components of the agreement forest minus one. Moreover, we show that a capproximation algorithm for nonbinary maf and a dapproximation algorithm for the classical problem Directed Feedback Vertex Set (dfvs) can be combined to yield a d(c+3)approximation for nonbinary maaf. The algorithms for maf have been implemented and made publicly available. © 2014 Society for Industrial and Applied Mathematics."



Yun Yu,
R. Matthew Barnett and
Luay Nakhleh. Parsimonious Inference of Hybridization in the Presence of Incomplete Lineage Sorting. In Systematic Biology, Vol. 62(5):738751, 2013. Keywords: from network, from rooted trees, hybridization, lineage sorting, parsimony, phylogenetic network, phylogeny, Program PhyloNet, reconstruction.
Toggle abstract
"Hybridization plays an important evolutionary role in several groups of organisms. A phylogenetic approach to detect hybridization entails sequencing multiple loci across the genomes of a group of species of interest, reconstructing their gene trees, and taking their differences as indicators of hybridization. However, methods that follow this approach mostly ignore population effects, such as incomplete lineage sorting (ILS). Given that hybridization occurs between closely related organisms, ILS may very well be at play and, hence, must be accounted for in the analysis framework. To address this issue, we present a parsimony criterion for reconciling gene trees within the branches of a phylogenetic network, and a local search heuristic for inferring phylogenetic networks from collections of genetree topologies under this criterion. This framework enables phylogenetic analyses while accounting for both hybridization and ILS. Further, we propose two techniques for incorporating information about uncertainty in genetree estimates. Our simulation studies demonstrate the good performance of our framework in terms of identifying the location of hybridization events, as well as estimating the proportions of genes that underwent hybridization. Also, our framework shows good performance in terms of efficiency on handling large data sets in our experiments. Further, in analysing a yeast data set, we demonstrate issues that arise when analysing real data sets. Although a probabilistic approach was recently introduced for this problem, and although parsimonious reconciliations have accuracy issues under certain settings, our parsimony framework provides a much more computationally efficient technique for this type of analysis. Our framework now allows for genomewide scans for hybridization, while also accounting for ILS. [Phylogenetic networks; hybridization; incomplete lineage sorting; coalescent; multilabeled trees.] © 2013 The Author(s). All rights reserved."



Peter J. Humphries,
Simone Linz and
Charles Semple. Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies. In BMB, Vol. 75(10):18791890, 2013. Keywords: characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/CPSpaper.pdf.
Toggle abstract
"Recently, we have shown that calculating the minimumtemporalhybridization number for a set P of rooted binary phylogenetic trees is NPhard and have characterized this minimum number when P consists of exactly two trees. In this paper, we give the first characterization of the problem for P being arbitrarily large. The characterization is in terms of cherries and the existence of a particular type of sequence. Furthermore, in an online appendix to the paper, we show that this new characterization can be used to show that computing the minimumtemporal hybridization number for two trees is fixedparameter tractable. © 2013 Society for Mathematical Biology."



Leo van Iersel,
Steven Kelk,
Nela Lekic,
Chris Whidden and
Norbert Zeh. Hybridization Number on Three Rooted Binary Trees is EPT. In SIDMA, Vol. 30(3):16071631, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1402.2136.



Steven Kelk,
Leo van Iersel,
Celine Scornavacca and
Mathias Weller. Phylogenetic incongruence through the lens of Monadic Second Order logic. In JGAA, Vol. 20(2):189215, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, MSOL, phylogenetic network, phylogeny, reconstruction. Note: http://jgaa.info/accepted/2016/KelkIerselScornavaccaWeller2016.20.2.pdf.















Yufeng Wu. Close Lower and Upper Bounds for the Minimum Reticulate Network of Multiple Phylogenetic Trees. In ISMB10, Vol. 26(12):i140i148 of BIO, 2010. Keywords: explicit network, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program PIRN, software. Note: http://dx.doi.org/10.1093/bioinformatics/btq198.
Toggle abstract
"Motivation: Reticulate network is a model for displaying and quantifying the effects of complex reticulate processes on the evolutionary history of species undergoing reticulate evolution. A central computational problem on reticulate networks is: given a set of phylogenetic trees (each for some region of the genomes), reconstruct the most parsimonious reticulate network (called the minimum reticulate network) that combines the topological information contained in the given trees. This problem is wellknown to be NPhard. Thus, existing approaches for this problem either work with only two input trees or make simplifying topological assumptions. Results: We present novel results on the minimum reticulate network problem. Unlike existing approaches, we address the fully general problem: there is no restriction on the number of trees that are input, and there is no restriction on the form of the allowed reticulate network. We present lower and upper bounds on the minimum number of reticulation events in the minimum reticulate network (and infer an approximately parsimonious reticulate network). A program called PIRN implements these methods, which also outputs a graphical representation of the inferred network. Empirical results on simulated and biological data show that our methods are practical for a wide range of data. More importantly, the lower and upper bounds match for many datasets (especially when the number of trees is small or reticulation level is low), and this allows us to solve the minimum reticulate network problem exactly for these datasets. Availability: A software tool, PIRN, is available for download from the web page: http://www.engr.uconn.edu/ywu. Contact: ywu@engr.uconn.edu. Supplementary information: Supplementary data is available at Bioinformatics online. © The Author(s) 2010. Published by Oxford University Press."



Yufeng Wu and
Jiayin Wang. Fast Computation of the Exact Hybridization Number of Two Phylogenetic Trees. In ISBRA10, Vol. 6053:203214 of LNCS, springer, 2010. Keywords: agreement forest, explicit network, from rooted trees, hybridization, integer linear programming, minimum number, phylogenetic network, phylogeny, Program HybridNumber, Program SPRDist, SPR distance. Note: http://www.engr.uconn.edu/~ywu/Papers/ISBRA10WuWang.pdf.
Toggle abstract
"Hybridization is a reticulate evolutionary process. An established problem on hybridization is computing the minimum number of hybridization events, called the hybridization number, needed in the evolutionary history of two phylogenetic trees. This problem is known to be NPhard. In this paper, we present a new practical method to compute the exact hybridization number. Our approach is based on an integer linear programming formulation. Simulation results on biological and simulated datasets show that our method (as implemented in program SPRDist) is more efficient and robust than an existing method. © 2010 SpringerVerlag Berlin Heidelberg."



Chris Whidden,
Robert G. Beiko and
Norbert Zeh. Fast FPT Algorithms for Computing Rooted Agreement Forests: Theory and Experiments. In Proceedings of the ninth International Symposium on Experimental Algorithms (SEA'10), Vol. 6049:141153 of LNCS, springer, 2010. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: https://www.cs.dal.ca/sites/default/files/technical_reports/CS201003.pdf.
Toggle abstract
"We improve on earlier FPT algorithms for computing a rooted maximum agreement forest (MAF) or a maximum acyclic agreement forest (MAAF) of a pair of phylogenetic trees. Their sizes give the subtreepruneandregraft (SPR) distance and the hybridization number of the trees, respectively. We introduce new branching rules that reduce the running time of the algorithms from O(3 kn) and O(3 kn log n) to O(2.42 kn) and O(2.42 kn log n), respectively. In practice, the speed up may be much more than predicted by the worstcase analysis.We confirm this intuition experimentally by computing MAFs for simulated trees and trees inferred from protein sequence data. We show that our algorithm is orders of magnitude faster and can handle much larger trees and SPR distances than the best previous methods, treeSAT and sprdist. © SpringerVerlag Berlin Heidelberg 2010."



Leo van Iersel,
Steven Kelk,
Nela Lekic and
Celine Scornavacca. A practical approximation algorithm for solving massive instances of hybridization number. In WABI12, Vol. 7534(430440) of LNCS, springer, 2012. Keywords: agreement forest, approximation, explicit network, from rooted trees, hybridization, phylogenetic network, phylogeny, Program CycleKiller, Program Dendroscope, Program HybridNET, reconstruction, software. Note: http://arxiv.org/abs/1205.3417.
Toggle abstract
"Reticulate events play an important role in determining evolutionary relationships. The problem of computing the minimum number of such events to explain discordance between two phylogenetic trees is a hard computational problem. In practice, exact solvers struggle to solve instances with reticulation number larger than 40. For such instances, one has to resort to heuristics and approximation algorithms. Here we present the algorithm CycleKiller which is the first approximation algorithm that can produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations. Theoretically, the algorithm is an exponentialtime 2approximation (or 4approximation in its fastest mode). However, using simulations we demonstrate that in practice the algorithm runs quickly for large and difficult instances, producing solutions within one percent of optimality. An implementation of this algorithm, which extends the theoretical work of [14], has been made publicly available. © 2012 SpringerVerlag."



Yun Yu and
Luay Nakhleh. A maximum pseudolikelihood approach for phylogenetic networks. In RECOMBCG15, Vol. 16(Suppl 10)(S10):110 of BMC Genomics, BioMed Central, 2015. Keywords: explicit network, from rooted trees, hybridization, incomplete lineage sorting, likelihood, phylogenetic network, phylogeny, Program PhyloNet, reconstruction, tripartition distance. Note: http://dx.doi.org/10.1186/1471216416S10S10.



