Cécile Ané: From reconstructing to using phylogenetic networks, 2017-12-13 09:00 PST
I will first highlight why network reconstruction is worth the effort, and then explain some of the challenges of network reconstruction and network interpretation. These challenges include identifiability issues, the difficulty of summarizing network uncertainty, and interpretation issues related to network-thinking. Finally, I will describe new phylogenetic comparative methods that can be applied to phylogenetic networks and are implemented in the PhyloNetworks Julia package.
Celine Scornavacca: Occam's razor in phylogenetic network reconstruction, 2017-11-22 09:00 PST
Several parsimony-based methods for reconstructing explicit phylogenetic networks have been developed in the last two decades. In the first part of this talk I will review several of these methods that share the same underlying approach: first, combinatorial objects such as phylogenetic trees, hierarchical clusters or trinets are constructed from the data of the species under study; second, these combinatorial objects are combined into an explicit phylogenetic network. The way they are combined and the parameters to optimise (e.g. minimising the hybridisation number, i.e. the number of reticulations of the network, or the level, i.e. the maximum number of reticulations in each biconnected component) give rise to a large range of different problems, each of biological interest. In the second part of the talk I will discuss different definitions of maximum parsimony for phylogenetic networks, as well as the pros and cons of each of them. Then I will introduce several algorithmic results to lay the foundations for new parsimony-based methods for phylogenetic network reconstruction.
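As a small illustration of the hybridisation number mentioned above, the following sketch counts the reticulation nodes (nodes with more than one parent) of a rooted network given as a list of directed edges. The edge-list representation, the function names and the toy network are assumptions made for this example only; they are not taken from the talk or from any particular software package.

```python
from collections import Counter


def reticulation_nodes(edges):
    """Return the nodes with in-degree >= 2, given (parent, child) edges."""
    indegree = Counter(child for _, child in edges)
    return sorted(node for node, degree in indegree.items() if degree >= 2)


def hybridisation_number(edges):
    """Number of reticulations, counted here as the number of reticulation nodes."""
    return len(reticulation_nodes(edges))


# Hypothetical toy network: leaf b descends from the hybrid node h,
# which has the two parents u and v.
example_edges = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]

print(reticulation_nodes(example_edges))
print(hybridisation_number(example_edges))
```

Computing the level would additionally require splitting the underlying undirected graph into biconnected components and taking the largest per-component count, which is left out of this sketch.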
Luay Nakhleh: Phylogenetic Networks: From Displayed Trees to a Distribution of Gene Trees, 2017-10-27 09:00 PDT
<p>Phylogenetic networks are leaf-labeled, rooted, directed acyclic graphs that are used to represent and model reticulate, or non-treelike, evolutionary histories. Phylogenetic networks have received significant attention in the last two or three decades, and the computational phylogenetics community has developed a wide array of mathematical results and algorithmic techniques for their inference. A fundamental observation that guided many of these developments was that a network is a summary of a set of trees. This observation gave rise to the parsimonious formulation of inferring a network with the smallest number of non-tree events that displays a given set of trees. </p>
<p>More recently, though, efforts have been dedicated to statistical inference of these networks from data on multiple unlinked loci. This formulation is based on extending the multispecies coalescent to species phylogenies whose topologies are networks. With this extension, inferences account for reticulation events, such as hybridization, and for incomplete lineage sorting simultaneously, rather than attributing all heterogeneity in the data to reticulation alone. </p>
In this seminar, I will introduce the phylogenetic network model, and give a brief survey of the results based on the parsimonious formulation. I will then introduce the multispecies network coalescent and describe recent results on statistical inference of phylogenetic networks from multi-locus data under this model.
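The idea that a network summarizes a set of trees can be made concrete with a short sketch. Assuming a network stored as a list of (parent, child) edges, keeping exactly one incoming edge at every reticulation node yields a "switching"; each switching gives a displayed tree once dangling branches are pruned and single-child nodes are suppressed. The representation and the names below are hypothetical, and the clean-up step is deliberately omitted, so different switchings may correspond to the same displayed tree.

```python
from collections import defaultdict
from itertools import product


def switchings(edges):
    """Enumerate edge sets that keep one parent edge per reticulation node."""
    parents = defaultdict(list)
    for parent, child in edges:
        parents[child].append(parent)

    # Reticulation nodes are the nodes with two or more parents.
    reticulations = sorted(c for c, ps in parents.items() if len(ps) >= 2)
    # Edges entering ordinary tree nodes are kept in every switching.
    tree_edges = [(p, c) for p, c in edges if len(parents[c]) == 1]

    result = []
    for choice in product(*(parents[r] for r in reticulations)):
        kept = list(zip(choice, reticulations))
        result.append(sorted(tree_edges + kept))
    return result


# Hypothetical toy network with a single hybrid node h.
toy_network = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]

for edge_set in switchings(toy_network):
    print(edge_set)
```

The number of switchings is the product of the in-degrees of the reticulation nodes, so it grows exponentially with the number of reticulations; this sketch is only meant for small examples.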
Barbara Holland: Developing a statistically powerful measure for quartet tree inference using phylogenetic and Markov invariants, 2017-06-22 14:00 PDT
<p>Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both approaches give rise to polynomial functions of sequence site patterns that, in expectation, either vanish for particular evolutionary trees (in the case of phylogenetic invariants) or have well-understood transformation properties (in the case of Markov invariants).</p>
<p>While both approaches have been valued for their intrinsic mathematical interest, it is not clear how they relate to each other, and to what extent they can be used as practical tools for inference of phylogenetic trees. By focusing on the special case of binary sequence data and quartets of taxa, we are able to view these two different polynomial-based approaches within a common framework.</p>
We present three desirable statistical properties that we argue any invariant-based phylogenetic method should satisfy: (1) sensible behaviour under reordering of input sequences; (2) stability as the taxa evolve independently according to a Markov process; and (3) explicit dependence on the assumption of a continuous-time process. Motivated by these statistical properties, we develop and explore several new phylogenetic inference methods. In particular, we develop a statistically bias-corrected version of the Markov invariants approach which satisfies all three properties. We also extend previous work by showing that, in comparison to other methods, our newly proposed approach based on bias-corrected Markov invariants is extremely powerful for phylogenetic inference.
Laura Kubatko: Using invariants for coalescent-based phylogenetic inference, 2017-05-31 10:00 PDT
<p>The advent of rapid and inexpensive sequencing technologies has necessitated the development of computationally efficient methods for analyzing sequence data for many genes simultaneously in a phylogenetic framework. The coalescent process is the most commonly used model for linking the underlying genealogies of individual genes with the global species-level phylogeny, but inference under the coalescent model is computationally daunting in the typical inference frameworks (e.g., the likelihood and Bayesian frameworks) due to the dimensionality of the space of both gene trees and species trees. By viewing the data arising under the phylogenetic coalescent model as a collection of site patterns, the algebraic structure associated with the probability distribution on the site patterns can be used to develop computationally efficient methods for inference via phylogenetic invariants.</p>
In this talk, I will discuss three problems that can be addressed using invariants. First, I will describe how identifiability results for four-taxon species trees based on site pattern probabilities can be used to build a quartet-based inference algorithm for trees of arbitrary size. Second, I will discuss methods for rooting phylogenetic species trees inferred under the coalescent model. Finally, I will describe the use of invariants to detect species that arose via hybridization. The methods presented will be demonstrated on several phylogenomic-scale datasets. Because the methods are derived in a fully model-based framework, in which the coalescent process models the relationship between gene trees and the species tree and standard nucleotide substitution models (GTR+I+G and all submodels) describe sequence-level evolution, they are promising approaches for computationally efficient, model-based inference for the large-scale sequence data available today.
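To make the notion of site patterns concrete, here is a minimal sketch, under assumptions of our own, of how site-pattern counts for four aligned sequences can be tabulated and arranged into a 16 x 16 "flattening" matrix for a chosen split of the taxa. Invariant-based quartet methods typically score each of the three possible splits using algebraic properties of such matrices (for example, how close they are to low rank); that scoring step is not shown. The function names and the toy alignment are hypothetical and do not reproduce the speaker's software.

```python
from collections import Counter

STATES = "ACGT"
PAIRS = [(x, y) for x in STATES for y in STATES]
PAIR_INDEX = {pair: n for n, pair in enumerate(PAIRS)}


def site_pattern_counts(seqs):
    """Count alignment columns for four aligned sequences of equal length."""
    if len(seqs) != 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("expected four aligned sequences of equal length")
    counts = Counter()
    for column in zip(*seqs):
        # Columns containing gaps or ambiguity codes are skipped.
        if all(ch in STATES for ch in column):
            counts[column] += 1
    return counts


def flattening(counts, split=((0, 1), (2, 3))):
    """Arrange pattern counts with rows for one taxon pair, columns for the other."""
    (i, j), (k, l) = split
    matrix = [[0] * 16 for _ in range(16)]
    for pattern, n in counts.items():
        row = PAIR_INDEX[(pattern[i], pattern[j])]
        col = PAIR_INDEX[(pattern[k], pattern[l])]
        matrix[row][col] += n
    return matrix


# Hypothetical toy alignment for taxa 0..3.
toy_alignment = ["AACG", "AACG", "ACCT", "ACCT"]
counts = site_pattern_counts(toy_alignment)
matrix_01_23 = flattening(counts)                          # split 01|23
matrix_02_13 = flattening(counts, split=((0, 2), (1, 3)))  # split 02|13
```

Dividing each entry by the number of counted columns turns the matrix into an estimate of the site pattern probabilities for that split.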