
Philippe Gambette,
Katharina Huber and
Guillaume Scholz. Uprooted Phylogenetic Networks. In BMB, Vol. 79(9):2022–2048, 2017. Keywords: circular split system, explicit network, from splits, galled tree, phylogenetic network, phylogeny, polynomial, reconstruction, split network, uniqueness. Note: http://arxiv.org/abs/1511.08387.



Leo van Iersel,
Vincent Moulton,
Eveline De Swart and
Taoyang Wu. Binets: fundamental building blocks for phylogenetic networks. In BMB, Vol. 79(5):1135–1154, 2017. Keywords: approximation, explicit network, from binets, galled tree, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1007/s11538-017-0275-4.



Philippe Gambette,
Leo van Iersel,
Steven Kelk,
Fabio Pardi and
Celine Scornavacca. Do branch lengths help to locate a tree in a phylogenetic network? In BMB, Vol. 78(9):1773–1795, 2016. Keywords: branch length, explicit network, FPT, from network, from rooted trees, NP complete, phylogenetic network, phylogeny, pseudopolynomial, time consistent network, tree containment, tree sibling network. Note: http://arxiv.org/abs/1607.06285.





Maria Anaya,
Olga Anipchenko-Ulaj,
Aisha Ashfaq,
Joyce Chiu,
Mahedi Kaiser,
Max Shoji Ohsawa,
Megan Owen,
Ella Pavlechko,
Katherine St. John,
Shivam Suleria,
Keith Thompson and
Corrine Yap. On Determining if Tree-based Networks Contain Fixed Trees. In BMB, Vol. 78(5):961–969, 2016. Keywords: explicit network, FPT, NP complete, phylogenetic network, phylogeny, tree-based network. Note: http://arxiv.org/abs/1602.02739.





Paul Cordue,
Simone Linz and
Charles Semple. Phylogenetic Networks that Display a Tree Twice. In BMB, Vol. 76(10):2664–2679, 2014. Keywords: from rooted trees, normal network, phylogenetic network, phylogeny, reconstruction, tree child network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/CLS14.pdf.
"In the last decade, the use of phylogenetic networks to analyze the evolution of species whose past is likely to include reticulation events, such as horizontal gene transfer or hybridization, has gained popularity among evolutionary biologists. Nevertheless, the evolution of a particular gene can generally be described without reticulation events and therefore be represented by a phylogenetic tree. While this is not in contrast to each other, it places emphasis on the necessity of algorithms that analyze and summarize the treelike information that is contained in a phylogenetic network. We contribute to the toolbox of such algorithms by investigating the question of whether or not a phylogenetic network embeds a tree twice and give a quadratic-time algorithm to solve this problem for a class of networks that is more general than tree-child networks. © 2014, Society for Mathematical Biology."



Stephen J. Willson. Reconstruction of certain phylogenetic networks from their tree-average distances. In BMB, Vol. 75(10):1840–1878, 2013. Keywords: explicit network, from distances, galled tree, normal network, phylogenetic network, phylogeny, unicyclic network. Note: http://www.public.iastate.edu/~swillson/TreeAverageReconPaper9.pdf.
"Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle. © 2013 Society for Mathematical Biology."



Peter J. Humphries,
Simone Linz and
Charles Semple. Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies. In BMB, Vol. 75(10):1879–1890, 2013. Keywords: characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.uni-tuebingen.de/people/linz/publications/CPSpaper.pdf.
"Recently, we have shown that calculating the minimum-temporal-hybridization number for a set P of rooted binary phylogenetic trees is NP-hard and have characterized this minimum number when P consists of exactly two trees. In this paper, we give the first characterization of the problem for P being arbitrarily large. The characterization is in terms of cherries and the existence of a particular type of sequence. Furthermore, in an online appendix to the paper, we show that this new characterization can be used to show that computing the minimum-temporal-hybridization number for two trees is fixed-parameter tractable. © 2013 Society for Mathematical Biology."



Jeremy G. Sumner,
Barbara R. Holland and
Peter D. Jarvis. The algebra of the general Markov model on phylogenetic trees and networks. In BMB, Vol. 74(4):858–880, 2012. Keywords: abstract network, phylogenetic network, phylogeny, split, split network, statistical model. Note: http://arxiv.org/abs/1012.5165.
"It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and providing an analogous augmentation of the general Markov model has thus far been elusive. In this paper, we rectify this shortcoming by showing how to extend the general Markov model on trees to include incompatible edges; and even further to more general network models. This is achieved by exploring the algebra of the generators of the continuous-time Markov chain together with the "splitting" operator that generates the branching process on phylogenetic trees. For simplicity, we proceed by discussing the two state case and then show that our results are easily extended to more states with little complication. Intriguingly, upon restriction of the two state general Markov model to the parameter space of the binary symmetric model, our extension is indistinguishable from the Hadamard approach only on trees; as soon as any incompatible splits are introduced the two approaches give rise to differing probability distributions with disparate structure. Through exploration of a simple example, we give an argument that our extension to more general networks has desirable properties that the previous approaches do not share. In particular, our construction allows for convergent evolution of previously divergent lineages; a property that is of significant interest for biological applications. © 2011 Society for Mathematical Biology."



Fenglou Mao,
David Williams,
Olga Zhaxybayeva,
Maria S. Poptsova,
Pascal Lapierre,
J. Peter Gogarten and
Ying Xu. Quartet decomposition server: a platform for analyzing phylogenetic trees. In BMCB, Vol. 13:123, 2012. Keywords: abstract network, from quartets, phylogenetic network, phylogeny, Program Quartet Decomposition, reconstruction, software, split network.
"Background: The frequent exchange of genetic material among prokaryotes means that extracting a majority or plurality phylogenetic signal from many gene families, and the identification of gene families that are in significant conflict with the plurality signal is a frequent task in comparative genomics, and especially in phylogenomic analyses. Decomposition of gene trees into embedded quartets (unrooted trees each with four taxa) is a convenient and statistically powerful technique to address this challenging problem. This approach was shown to be useful in several studies of completely sequenced microbial genomes. Results: We present here a web server that takes a collection of gene phylogenies, decomposes them into quartets, generates a Quartet Spectrum, and draws a split network. Users are also provided with various data download options for further analyses. Each gene phylogeny is to be represented by an assessment of phylogenetic information content, such as sets of trees reconstructed from bootstrap replicates or sampled from a posterior distribution. The Quartet Decomposition server is accessible at http://quartets.uga.edu. Conclusions: The Quartet Decomposition server presented here provides a convenient means to perform Quartet Decomposition analyses and will empower users to find statistically supported phylogenetic conflicts. © 2012 Mao et al.; licensee BioMed Central Ltd."



Stephen J. Willson. Restricted trees: simplifying networks with bottlenecks. In BMB, Vol. 73(10):2322–2338, 2011. Keywords: from network, phylogenetic network. Note: http://arxiv.org/abs/1005.4956.
"Suppose N is a phylogenetic network indicating a complicated relationship among individuals and taxa. Often of interest is a much simpler network, for example, a species tree T, that summarizes the most fundamental relationships. The meaning of a species tree is made more complicated by the recent discovery of the importance of hybridizations and lateral gene transfers. Hence, it is desirable to describe uniform well-defined procedures that yield a tree given a network N. A useful tool toward this end is a connected surjective digraph (CSD) map φ:N→N′ where N′ is generally a much simpler network than N. A set W of vertices in N is "restricted" if there is at most one vertex u∉W from which there is an arc into W, thus yielding a bottleneck in N. A CSD map φ:N→N′ is "restricted" if the inverse image of each vertex in N′ is restricted in N. This paper describes a uniform procedure that, given a network N, yields a well-defined tree called the "restricted tree" of N. There is a restricted CSD map from N to the restricted tree. Many relationships in the tree can be proved to appear also in N. © 2011 The Author(s)."



Stephen J. Willson. Properties of normal phylogenetic networks. In BMB, Vol. 72(2):340–358, 2010. Keywords: normal network, phylogenetic network, phylogeny, regular network. Note: http://www.public.iastate.edu/~swillson/RestrictionsOnNetworkspap9.pdf, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0904/willson/.
"A phylogenetic network is a rooted acyclic digraph with vertices corresponding to taxa. Let X denote a set of vertices containing the root, the leaves, and all vertices of outdegree 1. Regard X as the set of vertices on which measurements such as DNA can be made. A vertex is called normal if it has one parent, and hybrid if it has more than one parent. The network is called normal if it has no redundant arcs and also from every vertex there is a directed path to a member of X such that all vertices after the first are normal. This paper studies properties of normal networks. Under a simple model of inheritance that allows homoplasies only at hybrid vertices, there is essentially unique determination of the genomes at all vertices by the genomes at members of X if and only if the network is normal. This model is a limiting case of more standard models of inheritance when the substitution rate is sufficiently low. Various mathematical properties of normal networks are described. These properties include that the number of vertices grows at most quadratically with the number of leaves and that the number of hybrid vertices grows at most linearly with the number of leaves. © 2009 Society for Mathematical Biology."



Leo van Iersel,
Charles Semple and
Mike Steel. Quantifying the Extent of Lateral Gene Transfer Required to Avert a 'Genome of Eden'. In BMB, Vol. 72:1783–1798, 2010. Note: http://www.win.tue.nl/~liersel/LGT.pdf.
"The complex pattern of presence and absence of many genes across different species provides tantalising clues as to how genes evolved through the processes of gene genesis, gene loss, and lateral gene transfer (LGT). The extent of LGT, particularly in prokaryotes, and its implications for creating a 'network of life' rather than a 'tree of life' is controversial. In this paper, we formally model the problem of quantifying LGT, and provide exact mathematical bounds, and new computational results. In particular, we investigate the computational complexity of quantifying the extent of LGT under the simple models of gene genesis, loss, and transfer on which a recent heuristic analysis of biological data relied. Our approach takes advantage of a relationship between LGT optimization and graph-theoretical concepts such as tree width and network flow. © 2010 Society for Mathematical Biology."



Maria S. Poptsova. Testing Phylogenetic Methods to Identify Horizontal Gene Transfer. In Horizontal Gene Transfer, Pages 227–240, 2009. Note: http://dx.doi.org/10.1007/978-1-60327-853-9_13.
"The subject of this chapter is to describe the methodology for assessing the power of phylogenetic HGT detection methods. Detection power is defined in the framework of hypothesis testing. Rates of false positives and false negatives can be estimated by testing HGT detection methods on HGT-free orthologous sets, and on the same sets with in silico simulated HGT events. The whole process can be divided into three steps: obtaining HGT-free orthologous sets, in silico simulation of HGT events in the same set, and submitting both sets for evaluation by any of the tested methods. Phylogenetic methods of HGT detection can be roughly divided into three types: likelihood-based tests of topologies (Kishino-Hasegawa (KH), Shimodaira-Hasegawa (SH), and Approximately Unbiased (AU) tests), tree distance methods (symmetrical difference of Robinson and Foulds (RF), and Subtree Pruning and Regrafting (SPR) distances), and genome spectral approaches (bipartition and quartet decomposition analysis). Restrictions that are inherent to phylogenetic methods of HGT detection in general and the power and precision of each method are discussed and comparative analyses of different approaches are provided, as well as some examples of assessing the power of phylogenetic HGT detection methods from a case study of orthologous sets from gammaproteobacteria (Poptsova and Gogarten, BMC Evol Biol 7, 45, 2007) and cyanobacteria (Zhaxybayeva et al., Genome Res 16, 1099–108, 2006)."



Stefan Grünewald,
Katharina Huber and
Qiong Wu. Two novel closure rules for constructing phylogenetic supernetworks. In BMB, Vol. 70(7):1906–1924, 2008. Keywords: abstract network, from splits, from unrooted trees, phylogenetic network, phylogeny, Program MY CLOSURE, reconstruction, supernetwork. Note: http://arxiv.org/abs/0709.0283, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0904/huber/.
"A contemporary and fundamental problem faced by many evolutionary biologists is how to puzzle together a collection P of partial trees (leaf-labeled trees whose leaves are bijectively labeled by species or, more generally, taxa, each supported by, e.g., a gene) into an overall parental structure that displays all trees in P. This already difficult problem is complicated by the fact that the trees in P regularly support conflicting phylogenetic relationships and are not on the same but only overlapping taxa sets. A desirable requirement on the sought after parental structure, therefore, is that it can accommodate the observed conflicts. Phylogenetic networks are a popular tool capable of doing precisely this. However, not much is known about how to construct such networks from partial trees, a notable exception being the Z-closure supernetwork approach, which is based on the Z-closure rule, and the Q-imputation approach. Although attractive approaches, they both suffer from the fact that the generated networks tend to be multidimensional making it necessary to apply some kind of filter to reduce their complexity. To avoid having to resort to a filter, we follow a different line of attack in this paper and develop closure rules for generating circular phylogenetic networks which have the attractive property that they can be represented in the plane. In particular, we introduce the novel Y(closure) rule and show that this rule on its own or in combination with one of Meacham's closure rules (which we call the M-rule) has some very desirable theoretical properties. In addition, we present a case study based on Rivera et al. "ring of life" to explore the reconstructive power of the M- and Y-rule and also reanalyze an Arabidopsis thaliana data set. © 2008 Society for Mathematical Biology."







Maria S. Poptsova and
J. Peter Gogarten. The power of phylogenetic approaches to detect horizontally transferred genes. In BMCEB, Vol. 7(45), 2007. Keywords: evaluation, from rooted trees, lateral gene transfer, Program EEEP. Note: http://dx.doi.org/10.1186/1471-2148-7-45.
"Background. Horizontal gene transfer plays an important role in evolution because it sometimes allows recipient lineages to adapt to new ecological niches. High gene transfer frequencies were inferred for prokaryotic and early eukaryotic evolution. Does horizontal gene transfer also impact phylogenetic reconstruction of the evolutionary history of genomes and organisms? The answer to this question depends at least in part on the actual gene transfer frequencies and on the ability to weed out transferred genes from further analyses. Are the detected transfers mainly false positives, or are they the tip of an iceberg of many transfer events most of which go undetected by current methods? Results. Phylogenetic detection methods appear to be the method of choice to infer gene transfers, especially for ancient transfers and those followed by orthologous replacement. Here we explore how well some of these methods perform using in silico transfers between the terminal branches of a gamma proteobacterial, genome based phylogeny. For the experiments performed here on average the AU test at a 5% significance level detects 90.3% of the transfers and 91% of the exchanges as significant. Using the Robinson-Foulds distance only 57.7% of the exchanges and 60% of the donations were identified as significant. Analyses using bipartition spectra appeared most successful in our test case. The power of detection was on average 97% using a 70% cutoff and 94.2% with 90% cutoff for identifying conflicting bipartitions, while the rate of false positives was below 4.2% and 2.1% for the two cutoffs, respectively. For all methods the detection rates improved when more intervening branches separated donor and recipient. Conclusion. Rates of detected transfers should not be mistaken for the actual transfer rates; most analyses of gene transfers remain anecdotal. The method and significance level to identify potential gene transfer events represent a tradeoff between the frequency of erroneous identification (false positives) and the power to detect actual transfer events. © 2007 Poptsova and Gogarten; licensee BioMed Central Ltd."





Hans-Jürgen Bandelt and
Andreas W. M. Dress. Weak hierarchies associated with similarity measures: an additive clustering technique. In BMB, Vol. 51:113–166, 1989. Keywords: abstract network, clustering, from distances, from trees, phylogenetic network, phylogeny, Program WeakHierarchies, reconstruction, weak hierarchy. Note: http://dx.doi.org/10.1007/BF02458841.
"A new and apparently rather useful and natural concept in cluster analysis is studied: given a similarity measure on a set of objects, a subset is regarded as a cluster if any two objects a, b inside this subset have greater similarity than any third object outside has to at least one of a, b. These clusters then form a closure system which can be described as a hypergraph without triangles. Conversely, given such a system, one may attach some weight to each cluster and then compose a similarity measure additively, by letting the similarity of a pair be the sum of weights of the clusters containing that particular pair. The original clusters can be reconstructed from the obtained similarity measure. This clustering model is thus located between the general additive clustering model of Shepard and Arabie (1979) and the standard hierarchical model. Potential applications include fitting dendrograms with few additional non-nested clusters and simultaneous representation of some families of multiple dendrograms (in particular, two-dendrogram solutions), as well as assisting the search for phylogenetic relationships by proposing a somewhat larger system of possibly relevant "family groups", from which an appropriate choice (based on additional insight or individual preferences) remains to be made. © 1989 Society for Mathematical Biology."


