Leaping through tree space: continuous phylogenetic inference for rooted
and unrooted trees
- URL: http://arxiv.org/abs/2306.05739v4
- Date: Tue, 23 Jan 2024 09:05:19 GMT
- Title: Leaping through tree space: continuous phylogenetic inference for rooted
and unrooted trees
- Authors: Matthew J Penn, Neil Scheidwasser, Joseph Penn, Christl A Donnelly,
David A Duch\^ene, and Samir Bhatt
- Abstract summary: We perform both tree exploration and inference in a continuous space where the computation of gradients is possible.
This continuous relaxation allows for major leaps across tree space in both rooted and unrooted trees, and is less susceptible to convergence to local minima.
Our approach outperforms the current best methods for inference on unrooted trees and, in simulation, accurately infers the tree and root in ultrametric cases.
- Score: 0.49478969093606673
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Phylogenetics is now fundamental in life sciences, providing insights into
the earliest branches of life and the origins and spread of epidemics. However,
finding suitable phylogenies from the vast space of possible trees remains
challenging. To address this problem, for the first time, we perform both tree
exploration and inference in a continuous space where the computation of
gradients is possible. This continuous relaxation allows for major leaps across
tree space in both rooted and unrooted trees, and is less susceptible to
convergence to local minima. Our approach outperforms the current best methods
for inference on unrooted trees and, in simulation, accurately infers the tree
and root in ultrametric cases. The approach is effective in cases of empirical
data with negligible amounts of data, which we demonstrate on the phylogeny of
jawed vertebrates. Indeed, only a few genes with an ultrametric signal were
generally sufficient for resolving the major lineages of vertebrates.
Optimisation is possible via automatic differentiation and our method presents
an effective way forwards for exploring the most difficult, data-deficient
phylogenetic questions.
Related papers
- Why do Random Forests Work? Understanding Tree Ensembles as
Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z) - Unsupervised Learning of Phylogenetic Trees via Split-Weight Embedding [0.0]
We show that our split-weight embedded clustering is able to recover meaningful evolutionary relationships in simulated and real (Adansonia baobabs) data.
arXiv Detail & Related papers (2023-12-26T14:50:39Z) - ARTree: A Deep Autoregressive Model for Phylogenetic Inference [6.935130578959931]
We propose a deep autoregressive model for phylogenetic inference based on graph neural networks (GNNs)
We demonstrate the effectiveness and efficiency of our method on a benchmark of challenging real data tree topology density estimation and variational phylogenetic inference problems.
arXiv Detail & Related papers (2023-10-14T10:26:03Z) - PhyloGFN: Phylogenetic inference with generative flow networks [57.104166650526416]
We introduce the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and phylogenetic inference.
Because GFlowNets are well-suited for sampling complex structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies.
We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets.
arXiv Detail & Related papers (2023-10-12T23:46:08Z) - GeoPhy: Differentiable Phylogenetic Inference via Geometric Gradients of
Tree Topologies [0.3263412255491401]
We introduce a novel, fully differentiable formulation of phylogenetic inference that leverages a unique representation of topological distributions in continuous geometric spaces.
In experiments using real benchmark datasets, GeoPhy significantly outperformed other approximate Bayesian methods that considered whole topologies.
arXiv Detail & Related papers (2023-07-07T15:45:05Z) - Social Interpretable Tree for Pedestrian Trajectory Prediction [75.81745697967608]
We propose a tree-based method, termed as Social Interpretable Tree (SIT), to address this multi-modal prediction task.
A path in the tree from the root to leaf represents an individual possible future trajectory.
Despite the hand-crafted tree, the experimental results on ETH-UCY and Stanford Drone datasets demonstrate that our method is capable of matching or exceeding the performance of state-of-the-art methods.
arXiv Detail & Related papers (2022-05-26T12:18:44Z) - Visualizing hierarchies in scRNA-seq data using a density tree-biased
autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z) - Growing Deep Forests Efficiently with Soft Routing and Learned
Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z) - Tropical Data Science [0.0]
Phylogenomics is a new field which applies to tools in phylogenetics to genome data.
Due to a new technology and increasing amount of data, we face new challenges to analyze them over a space of phylogenetic trees.
arXiv Detail & Related papers (2020-05-13T21:03:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.