Related papers: Constructing Phylogenetic Networks via Cherry Picking and Machine Learning

URL: http://arxiv.org/abs/2304.02729v1
Date: Fri, 31 Mar 2023 15:04:42 GMT
Title: Constructing Phylogenetic Networks via Cherry Picking and Machine Learning
Authors: Giulia Bernardini and Leo van Iersel and Esther Julien and Leen Stougie
Abstract summary: Existing methods are computationally expensive and can either handle only small numbers of phylogenetic trees or are limited to severely restricted classes of networks. We apply the recently-introduced theoretical framework of cherry picking to design a class of efficient heuristics that are guaranteed to produce a network containing each of the input trees. We also propose simple and fast randomised heuristics that prove to be very effective when run multiple times.
Score: 0.1045050906735615
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Combining a set of phylogenetic trees into a single phylogenetic network that explains all of them is a fundamental challenge in evolutionary studies. Existing methods are computationally expensive and can either handle only small numbers of phylogenetic trees or are limited to severely restricted classes of networks. In this paper, we apply the recently-introduced theoretical framework of cherry picking to design a class of efficient heuristics that are guaranteed to produce a network containing each of the input trees, for datasets consisting of binary trees. Some of the heuristics in this framework are based on the design and training of a machine learning model that captures essential information on the structure of the input trees and guides the algorithms towards better solutions. We also propose simple and fast randomised heuristics that prove to be very effective when run multiple times. Unlike the existing exact methods, our heuristics are applicable to datasets of practical size, and the experimental study we conducted on both simulated and real data shows that these solutions are qualitatively good, always within some small constant factor from the optimum. Moreover, our machine-learned heuristics are one of the first applications of machine learning to phylogenetics and show its promise.

Related papers

FUTURE: Flexible Unlearning for Tree Ensemble [23.336396189756574]
Tree ensembles are widely recognized for their effectiveness in classification tasks, achieving state-of-the-art performance across diverse domains. With increasing emphasis on data privacy and the right to be forgotten, several unlearning algorithms have been proposed to enable tree ensembles to forget sensitive information. We propose FUTURE, a novel unlearning algorithm for tree ensembles.
arXiv Detail & Related papers (2025-08-28T19:45:36Z)
PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders [5.505257238864315]
PhyloVAE is an unsupervised learning framework designed for representation learning and generative modeling of tree topologies. We develop a deep latent-variable generative model that facilitates fast, parallelized topology generation. Experiments demonstrate PhyloVAE's robust representation learning capabilities and fast generation of phylogenetic tree topologies.
arXiv Detail & Related papers (2025-02-07T07:58:47Z)
A Closer Look at Deep Learning Methods on Tabular Datasets [78.61845513154502]
We present an extensive study on TALENT, a collection of 300+ datasets spanning broad ranges of size. Our evaluation shows that ensembling benefits both tree-based and neural approaches.
arXiv Detail & Related papers (2024-07-01T04:24:07Z)
Unsupervised Learning of Phylogenetic Trees via Split-Weight Embedding [0.0]
We show that our split-weight embedded clustering is able to recover meaningful evolutionary relationships in simulated and real (Adansonia baobabs) data.
arXiv Detail & Related papers (2023-12-26T14:50:39Z)
PhyloGFN: Phylogenetic inference with generative flow networks [57.104166650526416]
We introduce the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference. Because GFlowNets are well-suited for sampling complex structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies. We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets.
arXiv Detail & Related papers (2023-10-12T23:46:08Z)
The tree reconstruction game: phylogenetic reconstruction using reinforcement learning [30.114112337828875]
We propose a reinforcement-learning algorithm to tackle the challenge of reconstructing phylogenetic trees. In this study, we demonstrate that reinforcement learning can be used to learn an optimal search strategy. Our results show that the likelihood scores of the inferred phylogenies are similar to those obtained from widely-used software.
arXiv Detail & Related papers (2023-03-12T16:19:06Z)
Learnable Topological Features for Phylogenetic Inference via Graph Neural Networks [7.310488568715925]
We propose a novel structural representation method for phylogenetic inference based on learnable topological features. By combining the raw node features that minimize the Dirichlet energy with modern graph representation learning techniques, our learnable topological features can provide efficient structural information of phylogenetic trees.
arXiv Detail & Related papers (2023-02-17T12:26:03Z)
Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain. In this work, we build models that rely on the message-passing rule of predictive coding. We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience. We show that sparse coding can effectively maximize the entropy of the output signals. Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference. Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)