






Magnus Bordewich,
Charles Semple and
Nihan Tokac. Constructing treechild networks from distance matrices. In Algorithmica, 2017. Keywords: compressed network, explicit network, from distances, phylogenetic network, phylogeny, polynomial, reconstruction, tree child network, uniqueness. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSN17.pdf, to appear.











Paul Cordue,
Simone Linz and
Charles Semple. Phylogenetic Networks that Display a Tree Twice. In BMB, Vol. 76(10):26642679, 2014. Keywords: from rooted trees, normal network, phylogenetic network, phylogeny, reconstruction, tree child network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/CLS14.pdf.
Toggle abstract
"In the last decade, the use of phylogenetic networks to analyze the evolution of species whose past is likely to include reticulation events, such as horizontal gene transfer or hybridization, has gained popularity among evolutionary biologists. Nevertheless, the evolution of a particular gene can generally be described without reticulation events and therefore be represented by a phylogenetic tree. While this is not in contrast to each other, it places emphasis on the necessity of algorithms that analyze and summarize the treelike information that is contained in a phylogenetic network. We contribute to the toolbox of such algorithms by investigating the question of whether or not a phylogenetic network embeds a tree twice and give a quadratictime algorithm to solve this problem for a class of networks that is more general than treechild networks. © 2014, Society for Mathematical Biology."



Peter J. Humphries,
Simone Linz and
Charles Semple. On the complexity of computing the temporal hybridization number for two phylogenies. In DAM, Vol. 161:871880, 2013. Keywords: agreement forest, APX hard, characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/TAFapx.pdf.
Toggle abstract
"Phylogenetic networks are now frequently used to explain the evolutionary history of a set of species for which a collection of gene trees, reconstructed from genetic material of different parts of the species' genomes, reveal inconsistencies. However, in the context of hybridization, the reconstructed networks are often not temporal. If a hybridization network is temporal, then it satisfies the time constraint of instantaneously occurring hybridization events; i.e. all species that are involved in such an event coexist in time. Furthermore, although a collection of phylogenetic trees can often be merged into a hybridization network that is temporal, many algorithms do not necessarily find such a network since their primary optimization objective is to minimize the number of hybridization events. In this paper, we present a characterization for when two rooted binary phylogenetic trees admit a temporal hybridization network. Furthermore, we show that the underlying optimization problem is APXhard and, therefore, NPhard. Thus, unless P=NP, it is unlikely that there are efficient algorithms for either computing an exact solution or approximating it within a ratio arbitrarily close to one. © 2012 Elsevier B.V. All rights reserved."





Peter J. Humphries,
Simone Linz and
Charles Semple. Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies. In BMB, Vol. 75(10):18791890, 2013. Keywords: characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.unituebingen.de/people/linz/publications/CPSpaper.pdf.
Toggle abstract
"Recently, we have shown that calculating the minimumtemporalhybridization number for a set P of rooted binary phylogenetic trees is NPhard and have characterized this minimum number when P consists of exactly two trees. In this paper, we give the first characterization of the problem for P being arbitrarily large. The characterization is in terms of cherries and the existence of a particular type of sequence. Furthermore, in an online appendix to the paper, we show that this new characterization can be used to show that computing the minimumtemporal hybridization number for two trees is fixedparameter tractable. © 2013 Society for Mathematical Biology."



Magnus Bordewich and
Charles Semple. Budgeted Nature Reserve Selection with diversity feature loss and arbitrary split systems. In JOMB, Vol. 64(1):6985, 2012. Keywords: abstract network, approximation, diversity, phylogenetic network, polynomial, split network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS11.pdf.
Toggle abstract
"Arising in the context of biodiversity conservation, the Budgeted Nature Reserve Selection (BNRS) problem is to select, subject to budgetary constraints, a set of regions to conserve so that the phylogenetic diversity (PD) of the set of species contained within those regions is maximized. Here PD is measured across either a single rooted tree or a single unrooted tree. Nevertheless, in both settings, this problem is NPhard. However, it was recently shown that, for each setting, there is a polynomialtime (11/e)approximation algorithm for it and that this algorithm is tight. In the first part of the paper, we consider two extensions of BNRS. In the rooted setting we additionally allow for the disappearance of features, for varying survival probabilities across species, and for PD to be measured across multiple trees. In the unrooted setting, we extend to arbitrary split systems. We show that, despite these additional allowances, there remains a polynomialtime (11/e)approximation algorithm for each extension. In the second part of the paper, we resolve a complexity problem on computing PD across an arbitrary split system left open by Spillner et al. © 2011 SpringerVerlag."



Josh Voorkamp né Collins,
Simone Linz and
Charles Semple. Quantifying hybridization in realistic time. In JCB, Vol. 18(10):13051318, 2011. Keywords: explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, software. Note: http://wwwcsif.cs.ucdavis.edu/~linzs/CLS10_interleave.pdf, software available at http://www.math.canterbury.ac.nz/~c.semple/software.shtml.
Toggle abstract
"Recently, numerous practical and theoretical studies in evolutionary biology aim at calculating the extent to which reticulationfor example, horizontal gene transfer, hybridization, or recombinationhas influenced the evolution for a set of presentday species. It has been shown that inferring the minimum number of hybridization events that is needed to simultaneously explain the evolutionary history for a set of trees is an NPhard and also fixedparameter tractable problem. In this article, we give a new fixedparameter algorithm for computing the minimum number of hybridization events for when two rooted binary phylogenetic trees are given. This newly developed algorithm is based on interleavinga technique using repeated kernelization steps that are applied throughout the exhaustive search part of a fixedparameter algorithm. To show that our algorithm runs efficiently to be applicable to a wide range of practical problem instances, we apply it to a grass data set and highlight the significant improvements in terms of running times in comparison to an algorithm that has previously been implemented. © 2011, Mary Ann Liebert, Inc."



Leo van Iersel,
Charles Semple and
Mike Steel. Quantifying the Extent of Lateral Gene Transfer Required to Avert a 'Genome of Eden'. In BMB, Vol. 72:1783–1798, 2010. Note: http://www.win.tue.nl/~liersel/LGT.pdf.
Toggle abstract
"The complex pattern of presence and absence of many genes across different species provides tantalising clues as to how genes evolved through the processes of gene genesis, gene loss, and lateral gene transfer (LGT). The extent of LGT, particularly in prokaryotes, and its implications for creating a 'network of life' rather than a 'tree of life' is controversial. In this paper, we formally model the problem of quantifying LGT, and provide exact mathematical bounds, and new computational results. In particular, we investigate the computational complexity of quantifying the extent of LGT under the simple models of gene genesis, loss, and transfer on which a recent heuristic analysis of biological data relied. Our approach takes advantage of a relationship between LGT optimization and graphtheoretical concepts such as tree width and network flow. © 2010 Society for Mathematical Biology."



Simone Linz,
Charles Semple and
Tanja Stadler. Analyzing and reconstructing reticulation networks under timing constraints. In JOMB, Vol. 61(5):715737, 2010. Keywords: explicit network, from rooted trees, hybridization, lateral gene transfer, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://dx.doi.org/10.1007/s002850090319y..
Toggle abstract
"Reticulation networks are now frequently used to model the history of life for various groups of species whose evolutionary past is likely to include reticulation events such as horizontal gene transfer or hybridization. However, the reconstructed networks are rarely guaranteed to be temporal. If a reticulation network is temporal, then it satisfies the two biologically motivated timing constraints of instantaneously occurring reticulation events and successively occurring speciation events. On the other hand, if a reticulation network is not temporal, it is always possible to make it temporal by adding a number of additional unsampled or extinct taxa. In the first half of the paper, we show that deciding whether a given number of additional taxa is sufficient to transform a nontemporal reticulation network into a temporal one is an NPcomplete problem. As one is often given a set of gene trees instead of a network in the context of hybridization, this motivates the second half of the paper which provides an algorithm, called TemporalHybrid, for reconstructing a temporal hybridization network that simultaneously explains the ancestral history of two trees or indicates that no such network exists. We further derive two methods to decide whether or not a temporal hybridization network exists for two given trees and illustrate one of the methods on a grass data set. © 2009 The Author(s)."



Leo van Iersel,
Charles Semple and
Mike Steel. Locating a tree in a phylogenetic network. In IPL, Vol. 110(23), 2010. Keywords: cluster containment, explicit network, from network, level k phylogenetic network, normal network, NP complete, phylogenetic network, polynomial, regular network, time consistent network, tree child network, tree containment, tree sibling network. Note: http://arxiv.org/abs/1006.3122.
Toggle abstract
"Phylogenetic trees and networks are leaflabelled graphs that are used to describe evolutionary histories of species. The Tree Containment problem asks whether a given phylogenetic tree is embedded in a given phylogenetic network. Given a phylogenetic network and a cluster of species, the Cluster Containment problem asks whether the given cluster is a cluster of some phylogenetic tree embedded in the network. Both problems are known to be NPcomplete in general. In this article, we consider the restriction of these problems to several wellstudied classes of phylogenetic networks. We show that Tree Containment is polynomialtime solvable for normal networks, for binary treechild networks, and for levelk networks. On the other hand, we show that, even for treesibling, timeconsistent, regular networks, both Tree Containment and Cluster Containment remain NPcomplete. © 2010 Elsevier B.V. All rights reserved."





Stefan Grünewald,
Katharina Huber,
Vincent Moulton,
Charles Semple and
Andreas Spillner. Characterizing weak compatibility in terms of weighted quartets. In Advances in Applied Mathematics, Vol. 42(3):329341, 2009. Keywords: abstract network, characterization, from quartets, split network, weak hierarchy. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/GHMSS08.pdf, slides at http://www.lirmm.fr/miep08/slides/12_02_huber.pdf.







Magnus Bordewich and
Charles Semple. Computing the minimum number of hybridization events for a consistent evolutionary history. In DAM, Vol. 155:914918, 2007. Keywords: agreement forest, approximation, APX hard, explicit network, from rooted trees, hybridization, inapproximability, NP complete, phylogenetic network, phylogeny, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS06a.pdf.



Magnus Bordewich,
Simone Linz,
Katherine St. John and
Charles Semple. A reduction algorithm for computing the hybridization number of two trees. In EBIO, Vol. 3:8698, 2007. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNumber. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BLSS07.pdf.



Mihaela Baroni,
Charles Semple and
Mike Steel. Hybrids in Real Time. In Systematic Biology, Vol. 55(1):4656, 2006. Keywords: agreement forest, from rooted trees, phylogenetic network, phylogeny, polynomial, reconstruction, time consistent network. Note: http://www.math.canterbury.ac.nz/~m.steel/Non_UC/files/research/hybrids.pdf.
Toggle abstract
"We describe some new and recent results that allow for the analysis and representation of reticulate evolution by nontree networks. In particular, we (1) present a simple result to show that, despite the presence of reticulation, there is always a welldefined underlying tree that corresponds to those parts of life that do not have a history of reticulation; (2) describe and apply new theory for determining the smallest number of hybridization events required to explain conflicting gene trees; and (3) present a new algorithm to determine whether an arbitrary rooted network can be realized by contemporaneous reticulation events. We illustrate these results with examples. Copyright © Society of Systematic Biologists."





Magnus Bordewich and
Charles Semple. On the computational complexity of the rooted subtree prune and regraft distance. In ACOM, Vol. 8:409423, 2005. Keywords: agreement forest, from rooted trees, NP complete, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS04.pdf.
Toggle abstract
"The graphtheoretic operation of rooted subtree prune and regraft is increasingly being used as a tool for understanding and modelling reticulation events in evolutionary biology. In this paper, we show that computing the rooted subtree prune and regraft distance between two rooted binary phylogenetic trees on the same label set is NPhard. This resolves a longstanding open problem. Furthermore, we show that this distance is fixed parameter tractable when parameterised by the distance between the two trees."



Mihaela Baroni,
Stefan Grünewald,
Vincent Moulton and
Charles Semple. Bounding the number of hybridization events for a consistent evolutionary history. In JOMB, Vol. 51(2):171182, 2005. Keywords: agreement forest, bound, explicit network, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BGMS05.pdf.
Toggle abstract
"Evolutionary processes such as hybridisation, lateral gene transfer, and recombination are all key factors in shaping the structure of genes and genomes. However, since such processes are not always best represented by trees, there is now considerable interest in using more general networks instead. For example, in recent studies it has been shown that networks can be used to provide lower bounds on the number of recombination events and also for the number of lateral gene transfers that took place in the evolutionary history of a set of molecular sequences. In this paper we describe the theoretical performance of some related bounds that result when merging pairs of trees into networks. © SpringerVerlag 2005."



Mihaela Baroni,
Charles Semple and
Mike Steel. A framework for representing reticulate evolution. In ACOM, Vol. 8:398401, 2004. Keywords: explicit network, from clusters, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, regular network, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSS04.pdf.
Toggle abstract
"Acyclic directed graphs (ADGs) are increasingly being viewed as more appropriate for representing certain evolutionary relationships, particularly in biology, than rooted trees. In this paper, we develop a framework for the analysis of these graphs which we call hybrid phylogenies. We are particularly interested in the problem whereby one is given a set of phylogenetic trees and wishes to determine a hybrid phylogeny that 'embeds' each of these trees and which requires the smallest number of hybridisation events. We show that this quantity can be greatly reduced if additional species are involved, and investigate other combinatorial aspects of this and related questions."



Katharina Huber,
Vincent Moulton and
Charles Semple. Replacing cliques by stars in quasimedian graphs. In DAM, Vol. 143(13), 2004. Note: http://dx.doi.org/10.1016/j.dam.2004.03.002.
Toggle abstract
"For a multiset Σ of splits (bipartitions) of a finite set X, we introduce the multisplit graph G(Σ). This graph is a natural extension of the Buneman graph, Indeed, it is shown that several results pertaining to the Buneman graph extend to the multisplit graph. In addition, in case Σ is derived from a set ℛ of partitions of X by taking parts together with their complements, we show that the extremal instances where ℛ is either strongly compatible or strongly incompatible are equivalent to G(Σ) being either a tree or a Cartesian product of star trees, respectively. © 2004 Elsevier B.V. All rights reserved."



