Constructing Phylogenetic Networks via Cherry Picking and Machine
Learning
- URL: http://arxiv.org/abs/2304.02729v1
- Date: Fri, 31 Mar 2023 15:04:42 GMT
- Title: Constructing Phylogenetic Networks via Cherry Picking and Machine
Learning
- Authors: Giulia Bernardini, Leo van Iersel, Esther Julien, and Leen Stougie
- Abstract summary: Existing methods are computationally expensive and can either handle only small numbers of phylogenetic trees or are limited to severely restricted classes of networks.
We apply the recently-introduced theoretical framework of cherry picking to design a class of efficient heuristics that are guaranteed to produce a network containing each of the input trees.
We also propose simple and fast randomised heuristics that prove to be very effective when run multiple times.
- Score: 0.1045050906735615
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Combining a set of phylogenetic trees into a single phylogenetic network that
explains all of them is a fundamental challenge in evolutionary studies.
Existing methods are computationally expensive and can either handle only small
numbers of phylogenetic trees or are limited to severely restricted classes of
networks. In this paper, we apply the recently-introduced theoretical framework
of cherry picking to design a class of efficient heuristics that are guaranteed
to produce a network containing each of the input trees, for datasets
consisting of binary trees. Some of the heuristics in this framework are based
on the design and training of a machine learning model that captures essential
information on the structure of the input trees and guides the algorithms
towards better solutions. We also propose simple and fast randomised heuristics
that prove to be very effective when run multiple times.
Unlike the existing exact methods, our heuristics are applicable to datasets
of practical size, and the experimental study we conducted on both simulated
and real data shows that these solutions are qualitatively good, always within
some small constant factor from the optimum. Moreover, our machine-learned
heuristics are among the first applications of machine learning to
phylogenetics and show its promise.
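The cherry-picking idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nested-tuple tree encoding and the helper names `cherries`, `pick`, `random_sequence`, and `best_of` are assumptions made for this example. A cherry is a pair of leaves with a common parent; picking the ordered pair (x, y) deletes leaf x from every tree in which x and y form a cherry. The sketch picks cherries uniformly at random until every tree is a single leaf, repeats this several times, and keeps the shortest sequence, since in this framework shorter sequences correspond to simpler networks. Turning the sequence into a network is omitted.

```python
import random

# Assumption for this sketch: a binary tree is a leaf label (str)
# or a pair (left, right) of subtrees.

def cherries(tree):
    """Return all ordered pairs (x, y) of leaves that share a parent."""
    if isinstance(tree, str):
        return set()
    left, right = tree
    if isinstance(left, str) and isinstance(right, str):
        return {(left, right), (right, left)}
    return cherries(left) | cherries(right)

def pick(tree, x, y):
    """Delete leaf x if {x, y} is a cherry of tree; else return it unchanged."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    if {left, right} == {x, y}:
        return y
    return (pick(left, x, y), pick(right, x, y))

def random_sequence(trees, rng):
    """Pick random cherries until every tree is reduced to a single leaf."""
    trees = list(trees)
    sequence = []
    while any(not isinstance(t, str) for t in trees):
        # Every tree with at least two leaves has a cherry, so this is non-empty.
        candidates = sorted(set().union(*(cherries(t) for t in trees)))
        x, y = rng.choice(candidates)
        trees = [pick(t, x, y) for t in trees]
        sequence.append((x, y))
    return sequence

def best_of(trees, runs=20, seed=0):
    """Run the randomised procedure several times and keep the shortest sequence."""
    rng = random.Random(seed)
    return min((random_sequence(trees, rng) for _ in range(runs)), key=len)

if __name__ == "__main__":
    example = [(("a", "b"), ("c", "d")), (("a", "c"), ("b", "d"))]
    print(best_of(example))
```

Each step removes at least one leaf from at least one tree, so the loop always terminates. The trees may end up reduced to different leaves, which a complete method would have to handle before building a network.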
Related papers
- Unsupervised Learning of Phylogenetic Trees via Split-Weight Embedding [0.0]
We show that our split-weight embedded clustering is able to recover meaningful evolutionary relationships in simulated and real (Adansonia baobabs) data.
arXiv Detail & Related papers (2023-12-26T14:50:39Z)
- PhyloGFN: Phylogenetic inference with generative flow networks [57.104166650526416]
We introduce the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference.
Because GFlowNets are well-suited for sampling complex structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies.
We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets.
arXiv Detail & Related papers (2023-10-12T23:46:08Z)
- The tree reconstruction game: phylogenetic reconstruction using reinforcement learning [30.114112337828875]
We propose a reinforcement-learning algorithm to tackle the challenge of reconstructing phylogenetic trees.
In this study, we demonstrate that reinforcement learning can be used to learn an optimal search strategy.
Our results show that the likelihood scores of the inferred phylogenies are similar to those obtained from widely-used software.
arXiv Detail & Related papers (2023-03-12T16:19:06Z)
- Learnable Topological Features for Phylogenetic Inference via Graph Neural Networks [7.310488568715925]
We propose a novel structural representation method for phylogenetic inference based on learnable topological features.
By combining the raw node features that minimize the Dirichlet energy with modern graph representation learning techniques, our learnable topological features can provide efficient structural information of phylogenetic trees.
arXiv Detail & Related papers (2023-02-17T12:26:03Z)
- Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain.
In this work, we build models that rely on the message-passing rule of predictive coding.
We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)