
Hans-Jürgen Bandelt and Andreas W. M. Dress. A canonical decomposition theory for metrics on a finite set. In Advances in Mathematics, Vol. 92(1):47-105, 1992. Keywords: abstract network, circular split system, from distances, split, split decomposition, split network, weak hierarchy, weakly compatible.
"We consider specific additive decompositions d = d1 + ... + dn of metrics, defined on a finite set X (where a metric may give distance zero to pairs of distinct points). The simplest building stones are the split metrics, associated to splits (i.e., bipartitions) of the given set X. While an additive decomposition of a Hamming metric into split metrics is in no way unique, we achieve uniqueness by restricting ourselves to coherent decompositions, that is, decompositions d = d1 + ... + dn such that for every map f: X → R with f(x) + f(y) ≥ d(x, y) for all x, y ∈ X there exist maps f1, ..., fn: X → R with f = f1 + ... + fn and fi(x) + fi(y) ≥ di(x, y) for all i = 1, ..., n and all x, y ∈ X. These coherent decompositions are closely related to a geometric decomposition of the injective hull of the given metric. A metric with a coherent decomposition into a (weighted) sum of split metrics will be called totally split-decomposable. Tree metrics (and more generally, the sum of two tree metrics) are particular instances of totally split-decomposable metrics. Our main result confirms that every metric admits a coherent decomposition into a totally split-decomposable metric and a split-prime residue, where all the split summands and hence the decomposition can be determined in polynomial time, and that a family of splits can occur this way if and only if it does not induce on any four-point subset all three splits with block size two. © 1992."
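The four-point condition at the end of this abstract can be illustrated with a small brute-force check. This is only a sketch under stated assumptions: the function names `induced_pairings` and `is_weakly_compatible` are hypothetical, each split is represented by one of its two sides, and the code enumerates every four-point subset rather than following the polynomial-time procedure of the paper.

```python
from itertools import combinations

def induced_pairings(sides, quad):
    """Collect the 2|2 partitions of the four points in `quad` induced by the splits.

    Each split is given by one side A (a frozenset); the other side is X minus A.
    """
    quad = frozenset(quad)
    found = set()
    for side in sides:
        inside = quad & side
        if len(inside) == 2:  # the split cuts the quadruple into two pairs
            found.add(frozenset({inside, quad - inside}))
    return found

def is_weakly_compatible(points, splits):
    """True if no four-point subset carries all three splits with block size two."""
    sides = [frozenset(s) for s in splits]
    return all(len(induced_pairings(sides, quad)) < 3
               for quad in combinations(points, 4))

# Hypothetical example data on the four points a, b, c, d.
bad = [{"a", "b"}, {"a", "c"}, {"a", "d"}]   # induces ab|cd, ac|bd and ad|bc
good = [{"a", "b"}, {"a", "c"}, {"a"}]       # only two of the three pairings occur
```

With this example data, `is_weakly_compatible("abcd", bad)` is false and `is_weakly_compatible("abcd", good)` is true.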

Jotun Hein. Reconstructing evolution of sequences subject to recombination using parsimony. In MBIO, Vol. 98(2):185-200, 1990. Note: http://dx.doi.org/10.1016/0025-5564(90)90123-G.
"The parsimony principle states that a history of a set of sequences that minimizes the amount of evolution is a good approximation to the real evolutionary history of the sequences. This principle is applied to the reconstruction of the evolution of homologous sequences where recombinations or horizontal transfer can occur. First it is demonstrated that the appropriate structure to represent the evolution of sequences with recombinations is a family of trees each describing the evolution of a segment of the sequence. Two trees for neighboring segments will differ by exactly the transfer of a subtree within the whole tree. This leads to a metric between trees based on the smallest number of such operations needed to convert one tree into the other. An algorithm is presented that calculates this metric. This metric is used to formulate a dynamic programming algorithm that finds the most parsimonious history that fits a given set of sequences. The algorithm is potentially very practical, since many groups of sequences defy analysis by methods that ignore recombinations. These methods give ambiguous or contradictory results because the sequence history cannot be described by one phylogeny, but only a family of phylogenies that each describe the history of a segment of the sequences. The generalization of the algorithm to reconstruct gene conversions and the possibility for heuristic versions of the algorithm for larger data sets are discussed. © 1990."

Hans-Jürgen Bandelt and Andreas W. M. Dress. Weak hierarchies associated with similarity measures: an additive clustering technique. In BMB, Vol. 51:113-166, 1989. Keywords: abstract network, clustering, from distances, from trees, phylogenetic network, phylogeny, Program WeakHierarchies, reconstruction, weak hierarchy. Note: http://dx.doi.org/10.1007/BF02458841.
"A new and apparently rather useful and natural concept in cluster analysis is studied: given a similarity measure on a set of objects, a subset is regarded as a cluster if any two objects a, b inside this subset have greater similarity than any third object outside has to at least one of a, b. These clusters then form a closure system which can be described as a hypergraph without triangles. Conversely, given such a system, one may attach some weight to each cluster and then compose a similarity measure additively, by letting the similarity of a pair be the sum of weights of the clusters containing that particular pair. The original clusters can be reconstructed from the obtained similarity measure. This clustering model is thus located between the general additive clustering model of Shepard and Arabie (1979) and the standard hierarchical model. Potential applications include fitting dendrograms with few additional non-nested clusters and simultaneous representation of some families of multiple dendrograms (in particular, two-dendrogram solutions), as well as assisting the search for phylogenetic relationships by proposing a somewhat larger system of possibly relevant "family groups", from which an appropriate choice (based on additional insight or individual preferences) remains to be made. © 1989 Society for Mathematical Biology."
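The cluster condition and the additive similarity described in this abstract can be sketched as follows. This is an illustrative reading of the abstract, not the WeakHierarchies program: the function names `additive_similarity` and `is_cluster` and the example weights are assumptions, and the condition is taken to mean s(a, b) > min(s(a, x), s(b, x)) for all a, b inside the subset and all x outside it.

```python
from itertools import combinations

def additive_similarity(objects, weighted_clusters):
    """Similarity of a pair = sum of the weights of the clusters containing both."""
    sim = {}
    for a, b in combinations(objects, 2):
        sim[frozenset((a, b))] = sum(
            weight for cluster, weight in weighted_clusters
            if a in cluster and b in cluster
        )
    return sim

def is_cluster(subset, objects, sim):
    """Check the cluster condition as read from the abstract (an assumption)."""
    subset = set(subset)
    outside = [x for x in objects if x not in subset]

    def s(u, v):
        return sim[frozenset((u, v))]

    return all(s(a, b) > min(s(a, x), s(b, x))
               for a, b in combinations(subset, 2)
               for x in outside)

# Hypothetical example: two overlapping, non-nested clusters with weights 2 and 1.
objects = "abcd"
sim = additive_similarity(objects, [({"a", "b"}, 2), ({"b", "c"}, 1)])
```

In this example both `{"a", "b"}` and `{"b", "c"}` pass `is_cluster`, so the weighted clusters are recovered from the similarity, while `{"a", "c"}` fails because b is more similar to each of a and c than they are to each other.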

Alain Guénoche. Graphical Representation of a Boolean Array. In Computers and the Humanities, Vol. 20(4):277-281, 1986. Keywords: from splits, median network, reconstruction. Note: http://dx.doi.org/10.1007/BF02400118.
"In this paper, we represent a boolean array of data with a median connected graph. Vertices are the different lines of the array plus virtual monomials, and an edge links two vertices that are different for only one variable. We describe an algorithm to compute this graph, that is an exact representation of the symmetrical difference distance between lines, and we show an application to Bronze age pins. © 1986 Paradigm Press, Inc."

Ingo Althöfer. On optimal realizations of finite metric spaces by graphs. In Discrete and Computational Geometry, Vol. 3(1):103-122, 1988. Keywords: NP complete, optimal realization, realization. Note: http://dx.doi.org/10.1007/BF02187901.
"Graph realizations of finite metric spaces have widespread applications, for example, in biology, economics, and information theory. The main results of this paper are: 1. Finding optimal realizations of integral metrics (which means all distances are integral) is NP-complete. 2. There exist metric spaces with a continuum of optimal realizations. Furthermore, two conditions necessary for a weighted graph to be an optimal realization are given and an extremal problem arising in connection with the realization problem is investigated. © 1988 Springer-Verlag New York Inc."

Richard R. Hudson. Properties of the neutral allele model with intragenic recombination. In TPP, Vol. 23:183-201, 1983. Keywords: coalescent. Note: http://dx.doi.org/10.1016/0040-5809(83)90013-8, see also http://www.brics.dk/~compbio/coalescent/hudson_animator.html.
"An infinite-site neutral allele model with crossing-over possible at any of an infinite number of sites is studied. A formula for the variance of the number of segregating sites in a sample of gametes is obtained. An approximate expression for the expected homozygosity is also derived. Simulation results are presented to indicate the accuracy of the approximations. The results concerning the number of segregating sites and the expected homozygosity indicate that a two-locus model and the infinite-site model behave similarly for 4Nu ≤ 2 and r ≤ 5u, where N is the population size, u is the neutral mutation rate, and r is the recombination rate. Simulations of a two-locus model and a four-locus model were also carried out to determine the effect of intragenic recombination on the homozygosity test of Watterson (Genetics 85, 789-814; 88, 405-417) and on the number of unique alleles in a sample. The results indicate that for 4Nu ≤ 2 and r ≤ 10u, the effect of recombination is quite small. © 1983."
