
David A. Morrison. Using data-display networks for exploratory data analysis in phylogenetic studies. In MBE, Vol. 27(5):1044-1057, 2010. Keywords: abstract network, hybridization, NeighborNet, Program SplitsTree, recombination, split decomposition. Note: http://dx.doi.org/10.1093/molbev/msp309.
"Exploratory data analysis (EDA) is a frequently undervalued part of data analysis in biology. It involves evaluating the characteristics of the data "before" proceeding to the definitive analysis in relation to the scientific question at hand. For phylogenetic analyses, a useful tool for EDA is a data-display network. This type of network is designed to display any character (or tree) conflict in a data set, without prior assumptions about the causes of those conflicts. The conflicts might be caused by 1) methodological issues in data collection or analysis, 2) homoplasy, or 3) horizontal gene flow of some sort. Here, I explore 13 published data sets using splits networks, as examples of using data-display networks for EDA. In each case, I performed an original EDA on the data provided, to highlight the aspects of the resulting network that will be important for an interpretation of the phylogeny. In each case, there is at least one important point (possibly missed by the original authors) that might affect the phylogenetic analysis. I conclude that EDA should play a greater role in phylogenetic analyses than it has done. © 2010 The Author. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved."



Simone Linz,
Charles Semple and
Tanja Stadler. Analyzing and reconstructing reticulation networks under timing constraints. In JOMB, Vol. 61(5):715-737, 2010. Keywords: explicit network, from rooted trees, hybridization, lateral gene transfer, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://dx.doi.org/10.1007/s00285-009-0319-y.
"Reticulation networks are now frequently used to model the history of life for various groups of species whose evolutionary past is likely to include reticulation events such as horizontal gene transfer or hybridization. However, the reconstructed networks are rarely guaranteed to be temporal. If a reticulation network is temporal, then it satisfies the two biologically motivated timing constraints of instantaneously occurring reticulation events and successively occurring speciation events. On the other hand, if a reticulation network is not temporal, it is always possible to make it temporal by adding a number of additional unsampled or extinct taxa. In the first half of the paper, we show that deciding whether a given number of additional taxa is sufficient to transform a non-temporal reticulation network into a temporal one is an NP-complete problem. As one is often given a set of gene trees instead of a network in the context of hybridization, this motivates the second half of the paper which provides an algorithm, called TemporalHybrid, for reconstructing a temporal hybridization network that simultaneously explains the ancestral history of two trees or indicates that no such network exists. We further derive two methods to decide whether or not a temporal hybridization network exists for two given trees and illustrate one of the methods on a grass data set. © 2009 The Author(s)."

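The two timing constraints named in the abstract above can be made concrete with a small check. The sketch below is not the TemporalHybrid algorithm from the paper; it assumes the usual formalization of a temporal network (a time stamp t on the vertices with t(u) = t(v) along every edge entering a reticulation and t(u) < t(v) along every other edge), and the function name `is_temporal` and the edge-list input format are hypothetical choices made for illustration.

```python
from collections import defaultdict, deque

def is_temporal(edges):
    """Return True if a time stamp exists for the rooted network `edges`.

    `edges` is a list of (parent, child) pairs (assumed input format).
    Assumed constraints: equal times along edges entering a reticulation
    (in-degree >= 2), strictly increasing times along all other edges.
    """
    indeg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        indeg[v] += 1
        nodes.update((u, v))

    # Union-find: vertices forced to share a time stamp form one class.
    parent = {x: x for x in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Reticulation edges are instantaneous: merge both endpoints.
    for u, v in edges:
        if indeg[v] >= 2:
            parent[find(u)] = find(v)

    # Tree edges must go strictly forward in time between classes.
    succ = defaultdict(set)
    for u, v in edges:
        if indeg[v] < 2:
            a, b = find(u), find(v)
            if a == b:
                return False  # t(u) < t(v) is impossible inside one class
            succ[a].add(b)

    # Kahn's algorithm: a valid time stamp exists iff the class graph is acyclic.
    classes = {find(x) for x in nodes}
    deg = {c: 0 for c in classes}
    for a in list(succ):
        for b in succ[a]:
            deg[b] += 1
    queue = deque(c for c in classes if deg[c] == 0)
    seen = 0
    while queue:
        c = queue.popleft()
        seen += 1
        for b in succ.get(c, ()):
            deg[b] -= 1
            if deg[b] == 0:
                queue.append(b)
    return seen == len(classes)

# Hybrid h has contemporaneous parents a and b: a time stamp exists.
ok = [("r", "a"), ("r", "b"), ("a", "x"), ("a", "h"), ("b", "h"), ("b", "y"), ("h", "z")]
# Parent a is also an ancestor of the other parent b: no time stamp exists.
bad = [("r", "a"), ("a", "b"), ("a", "h"), ("b", "h"), ("b", "y"), ("h", "z")]
print(is_temporal(ok), is_temporal(bad))
```

Merging each reticulation with its parents and then testing the remaining tree edges for cycles is one straightforward way to decide whether such a time stamp exists; it says nothing about how many extra taxa would repair a non-temporal network, which is the NP-complete question the abstract refers to.
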


Yufeng Wu. Close Lower and Upper Bounds for the Minimum Reticulate Network of Multiple Phylogenetic Trees. In ISMB10, Vol. 26(12):i140-i148 of BIO, 2010. Keywords: explicit network, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program PIRN, software. Note: http://dx.doi.org/10.1093/bioinformatics/btq198.
"Motivation: Reticulate network is a model for displaying and quantifying the effects of complex reticulate processes on the evolutionary history of species undergoing reticulate evolution. A central computational problem on reticulate networks is: given a set of phylogenetic trees (each for some region of the genomes), reconstruct the most parsimonious reticulate network (called the minimum reticulate network) that combines the topological information contained in the given trees. This problem is well-known to be NP-hard. Thus, existing approaches for this problem either work with only two input trees or make simplifying topological assumptions. Results: We present novel results on the minimum reticulate network problem. Unlike existing approaches, we address the fully general problem: there is no restriction on the number of trees that are input, and there is no restriction on the form of the allowed reticulate network. We present lower and upper bounds on the minimum number of reticulation events in the minimum reticulate network (and infer an approximately parsimonious reticulate network). A program called PIRN implements these methods, which also outputs a graphical representation of the inferred network. Empirical results on simulated and biological data show that our methods are practical for a wide range of data. More importantly, the lower and upper bounds match for many datasets (especially when the number of trees is small or reticulation level is low), and this allows us to solve the minimum reticulate network problem exactly for these datasets. Availability: A software tool, PIRN, is available for download from the web page: http://www.engr.uconn.edu/ywu. Contact: ywu@engr.uconn.edu. Supplementary information: Supplementary data is available at Bioinformatics online. © The Author(s) 2010. Published by Oxford University Press."



Yufeng Wu and
Jiayin Wang. Fast Computation of the Exact Hybridization Number of Two Phylogenetic Trees. In ISBRA10, Vol. 6053:203-214 of LNCS, springer, 2010. Keywords: agreement forest, explicit network, from rooted trees, hybridization, integer linear programming, minimum number, phylogenetic network, phylogeny, Program HybridNumber, Program SPRDist, SPR distance. Note: http://www.engr.uconn.edu/~ywu/Papers/ISBRA10WuWang.pdf.
"Hybridization is a reticulate evolutionary process. An established problem on hybridization is computing the minimum number of hybridization events, called the hybridization number, needed in the evolutionary history of two phylogenetic trees. This problem is known to be NP-hard. In this paper, we present a new practical method to compute the exact hybridization number. Our approach is based on an integer linear programming formulation. Simulation results on biological and simulated datasets show that our method (as implemented in program SPRDist) is more efficient and robust than an existing method. © 2010 Springer-Verlag Berlin Heidelberg."



Chris Whidden,
Robert G. Beiko and
Norbert Zeh. Fast FPT Algorithms for Computing Rooted Agreement Forests: Theory and Experiments. In Proceedings of the ninth International Symposium on Experimental Algorithms (SEA'10), Vol. 6049:141-153 of LNCS, springer, 2010. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: https://www.cs.dal.ca/sites/default/files/technical_reports/CS201003.pdf.
"We improve on earlier FPT algorithms for computing a rooted maximum agreement forest (MAF) or a maximum acyclic agreement forest (MAAF) of a pair of phylogenetic trees. Their sizes give the subtree-prune-and-regraft (SPR) distance and the hybridization number of the trees, respectively. We introduce new branching rules that reduce the running time of the algorithms from O(3^k n) and O(3^k n log n) to O(2.42^k n) and O(2.42^k n log n), respectively. In practice, the speed-up may be much more than predicted by the worst-case analysis. We confirm this intuition experimentally by computing MAFs for simulated trees and trees inferred from protein sequence data. We show that our algorithm is orders of magnitude faster and can handle much larger trees and SPR distances than the best previous methods, treeSAT and sprdist. © Springer-Verlag Berlin Heidelberg 2010."

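To give a feel for what the improved branching factor in the abstract above means, the sketch below compares only the exponential parts of the two worst-case bounds, read as 3^k and 2.42^k with k the distance parameter. It is plain arithmetic on the stated bounds, not a measurement of the authors' software, and the function name `branching_ratio` is a hypothetical label used here for illustration.

```python
def branching_ratio(k, old_base=3.0, new_base=2.42):
    """Ratio old_base**k / new_base**k of the exponential factors.

    The polynomial factors (n, or n log n) are the same in the old and
    new bounds, so they cancel in this ratio.
    """
    return (old_base / new_base) ** k

# Worst-case ratio for a few values of the parameter k.
for k in (5, 10, 20, 40):
    print(f"k={k:2d}  3^k / 2.42^k = {branching_ratio(k):.1f}")
```

The ratio grows exponentially in k, so the gain predicted by the worst-case analysis is modest for small distances and large for big ones; the abstract notes that the speed-up observed in practice may be much larger than this prediction.
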


Zhi-Zhong Chen and
Lusheng Wang. HybridNET: a tool for constructing hybridization networks. In BIO, Vol. 26(22):2912-2913, 2010. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNET, software. Note: http://rnc.r.dendai.ac.jp/~chen/papers/note2.pdf.
"Motivations: When reticulation events occur, the evolutionary history of a set of existing species can be represented by a hybridization network instead of an evolutionary tree. When studying the evolutionary history of a set of existing species, one can obtain a phylogenetic tree of the set of species with high confidence by looking at a segment of sequences or a set of genes. When looking at another segment of sequences, a different phylogenetic tree can be obtained with high confidence too. This indicates that reticulation events may occur. Thus, we have the following problem: given two rooted phylogenetic trees on a set of species that correctly represent the tree-like evolution of different parts of their genomes, what is the hybridization network with the smallest number of reticulation events to explain the evolution of the set of species under consideration? Results: We develop a program, named HybridNet, for constructing a hybridization network with the minimum number of reticulate vertices from two input trees. We first implement the O(3^d n)-time algorithm by Whidden et al. for computing a maximum (acyclic) agreement forest. Our program can output all the maximum (acyclic) agreement forests. We then augment the program so that it can construct an optimal hybridization network for each given maximum acyclic agreement forest. To our knowledge, this is the first time that optimal hybridization networks can be rapidly constructed. © The Author 2010. Published by Oxford University Press. All rights reserved."


