Related papers: PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders

PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders

URL: http://arxiv.org/abs/2502.04730v1
Date: Fri, 07 Feb 2025 07:58:47 GMT
Title: PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders
Authors: Tianyu Xie, Harry Richman, Jiansi Gao, Frederick A. Matsen IV, Cheng Zhang,
Abstract summary: PhyloVAE is an unsupervised learning framework designed for representation learning and generative modeling of tree topologies.<n>We develop a deep latent-variable generative model that facilitates fast, parallelized topology generation.<n>Experiments demonstrate PhyloVAE's robust representation learning capabilities and fast generation of phylogenetic tree topologies.
Score: 5.505257238864315
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning informative representations of phylogenetic tree structures is essential for analyzing evolutionary relationships. Classical distance-based methods have been widely used to project phylogenetic trees into Euclidean space, but they are often sensitive to the choice of distance metric and may lack sufficient resolution. In this paper, we introduce phylogenetic variational autoencoders (PhyloVAEs), an unsupervised learning framework designed for representation learning and generative modeling of tree topologies. Leveraging an efficient encoding mechanism inspired by autoregressive tree topology generation, we develop a deep latent-variable generative model that facilitates fast, parallelized topology generation. PhyloVAE combines this generative model with a collaborative inference model based on learnable topological features, allowing for high-resolution representations of phylogenetic tree samples. Extensive experiments demonstrate PhyloVAE's robust representation learning capabilities and fast generation of phylogenetic tree topologies.

Related papers

Learning Decision Trees as Amortized Structure Inference [59.65621207449269]
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
PhyloGen: Language Model-Enhanced Phylogenetic Inference via Graph Structure Generation [50.80441546742053]
Phylogenetic trees elucidate evolutionary relationships among species.<n>Traditional Markov Chain Monte Carlo methods face slow convergence and computational burdens.<n>We propose PhyloGen, a novel method leveraging a pre-trained genomic language model.
arXiv Detail & Related papers (2024-12-25T08:33:05Z)
Tree-based variational inference for Poisson log-normal models [47.82745603191512]
hierarchical trees are often used to organize entities based on proximity criteria.<n>Current count-data models do not leverage this structured information.<n>We introduce the PLN-Tree model as an extension of the PLN model for modeling hierarchical count data.
arXiv Detail & Related papers (2024-06-25T08:24:35Z)
Learning Discrete Concepts in Latent Hierarchical Models [73.01229236386148]
Learning concepts from natural high-dimensional data holds potential in building human-aligned and interpretable machine learning models.<n>We formalize concepts as discrete latent causal variables that are related via a hierarchical causal model.<n>We substantiate our theoretical claims with synthetic data experiments.
arXiv Detail & Related papers (2024-06-01T18:01:03Z)
Neural Echos: Depthwise Convolutional Filters Replicate Biological Receptive Fields [56.69755544814834]
We present evidence suggesting that depthwise convolutional kernels are effectively replicating the biological receptive fields observed in the mammalian retina. We propose a scheme that draws inspiration from the biological receptive fields.
arXiv Detail & Related papers (2024-01-18T18:06:22Z)
ARTree: A Deep Autoregressive Model for Phylogenetic Inference [6.935130578959931]
We propose a deep autoregressive model for phylogenetic inference based on graph neural networks (GNNs) We demonstrate the effectiveness and efficiency of our method on a benchmark of challenging real data tree topology density estimation and variational phylogenetic inference problems.
arXiv Detail & Related papers (2023-10-14T10:26:03Z)
PhyloGFN: Phylogenetic inference with generative flow networks [57.104166650526416]
We introduce the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and phylogenetic inference. Because GFlowNets are well-suited for sampling complex structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies. We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets.
arXiv Detail & Related papers (2023-10-12T23:46:08Z)
GeoPhy: Differentiable Phylogenetic Inference via Geometric Gradients of Tree Topologies [0.3263412255491401]
We introduce a novel, fully differentiable formulation of phylogenetic inference that leverages a unique representation of topological distributions in continuous geometric spaces. In experiments using real benchmark datasets, GeoPhy significantly outperformed other approximate Bayesian methods that considered whole topologies.
arXiv Detail & Related papers (2023-07-07T15:45:05Z)
Learnable Topological Features for Phylogenetic Inference via Graph Neural Networks [7.310488568715925]
We propose a novel structural representation method for phylogenetic inference based on learnable topological features. By combining the raw node features that minimize the Dirichlet energy with modern graph representation learning techniques, our learnable topological features can provide efficient structural information of phylogenetic trees.
arXiv Detail & Related papers (2023-02-17T12:26:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.