Tropical Data Science
- URL: http://arxiv.org/abs/2005.06586v1
- Date: Wed, 13 May 2020 21:03:41 GMT
- Title: Tropical Data Science
- Authors: Ruriko Yoshida
- Abstract summary: Phylogenomics is a new field that applies tools from phylogenetics to genome data.
Due to new technology and an increasing amount of data, we face new challenges in analyzing these data over a space of phylogenetic trees.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Phylogenomics is a new field that applies tools from phylogenetics to
genome data. Due to new technology and an increasing amount of data, we face new
challenges in analyzing these data over a space of phylogenetic trees. Because a space
of phylogenetic trees with a fixed set of labels on leaves is not Euclidean, we
cannot simply apply tools from data science. In this paper we survey some new
developments in machine learning models that use tropical geometry to analyze a
set of phylogenetic trees over a tree space.
Related papers
- VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling [60.91599380893732]
VQDNA is a general-purpose framework that renovates genome tokenization from the perspective of genome vocabulary learning.
By leveraging vector-quantized codebooks as learnable vocabulary, VQDNA can adaptively tokenize genomes into pattern-aware embeddings.
arXiv Detail & Related papers (2024-05-13T20:15:03Z)
- Information-Theoretic Thresholds for Planted Dense Cycles [52.076657911275525]
We study a random graph model for small-world networks which are ubiquitous in social and biological sciences.
For both detection and recovery of the planted dense cycle, we characterize the information-theoretic thresholds in terms of $n$, $\tau$, and an edge-wise signal-to-noise ratio $\lambda$.
arXiv Detail & Related papers (2024-02-01T03:39:01Z)
- Unsupervised Learning of Phylogenetic Trees via Split-Weight Embedding [0.0]
We show that our split-weight embedded clustering is able to recover meaningful evolutionary relationships in simulated and real (Adansonia baobabs) data.
arXiv Detail & Related papers (2023-12-26T14:50:39Z)
- PhyloGFN: Phylogenetic inference with generative flow networks [57.104166650526416]
We introduce the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference.
Because GFlowNets are well-suited for sampling complex structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies.
We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets.
arXiv Detail & Related papers (2023-10-12T23:46:08Z)
- GeoPhy: Differentiable Phylogenetic Inference via Geometric Gradients of Tree Topologies [0.3263412255491401]
We introduce a novel, fully differentiable formulation of phylogenetic inference that leverages a unique representation of topological distributions in continuous geometric spaces.
In experiments using real benchmark datasets, GeoPhy significantly outperformed other approximate Bayesian methods that considered whole topologies.
arXiv Detail & Related papers (2023-07-07T15:45:05Z)
- Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials [66.14337835284628]
We propose a platform, coined Geom3D, which enables benchmarking the effectiveness of geometric strategies.
Geom3D contains 16 advanced symmetry-informed geometric representation models and 14 geometric pretraining methods over 46 diverse datasets.
arXiv Detail & Related papers (2023-06-15T05:37:25Z)
- Leaping through tree space: continuous phylogenetic inference for rooted and unrooted trees [0.49478969093606673]
We perform both tree exploration and inference in a continuous space where the computation of gradients is possible.
This continuous relaxation allows for major leaps across tree space in both rooted and unrooted trees, and is less susceptible to convergence to local minima.
Our approach outperforms the current best methods for inference on unrooted trees and, in simulation, accurately infers the tree and root in ultrametric cases.
arXiv Detail & Related papers (2023-06-09T08:13:06Z)
- Learnable Topological Features for Phylogenetic Inference via Graph Neural Networks [7.310488568715925]
We propose a novel structural representation method for phylogenetic inference based on learnable topological features.
By combining the raw node features that minimize the Dirichlet energy with modern graph representation learning techniques, our learnable topological features can provide efficient structural information of phylogenetic trees.
arXiv Detail & Related papers (2023-02-17T12:26:03Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
- Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings [71.20283285671461]
We propose a geometric scattering autoencoder (GSAE) network for learning such graph embeddings.
Our embedding network first extracts rich graph features using the recently proposed geometric scattering transform.
We show that GSAE organizes RNA graphs both by structure and energy, accurately reflecting bistable RNA structures.
arXiv Detail & Related papers (2020-06-12T00:17:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.