Phylotrack: C++ and Python libraries for in silico phylogenetic tracking
- URL: http://arxiv.org/abs/2405.09389v2
- Date: Tue, 16 Jul 2024 22:40:03 GMT
- Title: Phylotrack: C++ and Python libraries for in silico phylogenetic tracking
- Authors: Emily Dolson, Santiago Rodriguez-Papa, Matthew Andres Moreno,
- Abstract summary: The Phylotrack project provides libraries for tracking and analyzing phylogenies in in silico evolution.
The project is composed of 1) Phylotracklib: a header-only C++ library, developed under the umbrella of the Empirical project, and 2) Phylotrackpy: a Python wrapper around Phylotracklib, created with Pybind11.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In silico evolution instantiates the processes of heredity, variation, and differential reproductive success (the three "ingredients" for evolution by natural selection) within digital populations of computational agents. Consequently, these populations undergo evolution, and can be used as virtual model systems for studying evolutionary dynamics. This experimental paradigm -- used across biological modeling, artificial life, and evolutionary computation -- complements research done using in vitro and in vivo systems by enabling experiments that would be impossible in the lab or field. One key benefit is complete, exact observability. For example, it is possible to perfectly record all parent-child relationships across simulation history, yielding complete phylogenies (ancestry trees). This information reveals when traits were gained or lost, and also facilitates inference of underlying evolutionary dynamics. The Phylotrack project provides libraries for tracking and analyzing phylogenies in in silico evolution. The project is composed of 1) Phylotracklib: a header-only C++ library, developed under the umbrella of the Empirical project, and 2) Phylotrackpy: a Python wrapper around Phylotracklib, created with Pybind11. Both components supply a public-facing API to attach phylogenetic tracking to digital evolution systems, as well as a stand-alone interface for measuring a variety of popular phylogenetic topology metrics. Underlying design and C++ implementation prioritizes efficiency, allowing for fast generational turnover for agent populations numbering in the tens of thousands. Several explicit features (e.g., phylogeny pruning and abstraction, etc.) are provided for reducing the memory footprint of phylogenetic information.
Related papers
- Nonparametric independence tests in high-dimensional settings, with applications to the genetics of complex disease [55.2480439325792]
We show how defining adequate premetric structures on the support spaces of the genetic data allows for novel approaches to such testing.
For each problem, we provide mathematical results, simulations and the application to real data.
arXiv Detail & Related papers (2024-07-29T01:00:53Z) - A Guide to Tracking Phylogenies in Parallel and Distributed Agent-based Evolution Models [0.0]
In silico work with agent-based models provides an opportunity to collect high-quality records of ancestry relationships among simulated agents.
Existing work generally tracks lineages directly, yielding an exact phylogenetic record of evolutionary history.
Post hoc estimation is akin to how bioinformaticians build phylogenies by assessing genetic similarities between organisms.
arXiv Detail & Related papers (2024-05-16T15:27:51Z) - Trackable Island-model Genetic Algorithms at Wafer Scale [0.0]
We present a tracking-enabled asynchronous island-based genetic algorithm (GA) framework for Cerebras Wafer-Scale Engine (WSE) hardware.
We validate phylogenetic reconstructions and demonstrate their suitability for inference of underlying evolutionary conditions.
These benchmark and validation trials reflect strong potential for highly scalable evolutionary computation.
arXiv Detail & Related papers (2024-05-06T16:17:33Z) - Trackable Agent-based Evolution Models at Wafer Scale [0.0]
We focus on the problem of extracting phylogenetic information from agent-based evolution on the 850,000 processor Cerebras Wafer Scale Engine (WSE)
We present an asynchronous island-based genetic algorithm (GA) framework for WSE hardware.
We validate phylogenetic reconstructions from these trials and demonstrate their suitability for inference of underlying evolutionary conditions.
arXiv Detail & Related papers (2024-04-16T19:24:14Z) - DARLEI: Deep Accelerated Reinforcement Learning with Evolutionary
Intelligence [77.78795329701367]
We present DARLEI, a framework that combines evolutionary algorithms with parallelized reinforcement learning.
We characterize DARLEI's performance under various conditions, revealing factors impacting diversity of evolved morphologies.
We hope to extend DARLEI in future work to include interactions between diverse morphologies in richer environments.
arXiv Detail & Related papers (2023-12-08T16:51:10Z) - PhyloGFN: Phylogenetic inference with generative flow networks [57.104166650526416]
We introduce the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and phylogenetic inference.
Because GFlowNets are well-suited for sampling complex structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies.
We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets.
arXiv Detail & Related papers (2023-10-12T23:46:08Z) - Phylogeny-informed fitness estimation [58.720142291102135]
We propose phylogeny-informed fitness estimation, which exploits a population's phylogeny to estimate fitness evaluations.
Our results indicate that phylogeny-informed fitness estimation can mitigate the drawbacks of down-sampled lexicase.
This work serves as an initial step toward improving evolutionary algorithms by exploiting runtime phylogenetic analysis.
arXiv Detail & Related papers (2023-06-06T19:05:01Z) - Deep metric learning improves lab of origin prediction of genetically
engineered plasmids [63.05016513788047]
Genetic engineering attribution (GEA) is the ability to make sequence-lab associations.
We propose a method, based on metric learning, that ranks the most likely labs-of-origin.
We are able to extract key signatures in plasmid sequences for particular labs, allowing for an interpretable examination of the model's outputs.
arXiv Detail & Related papers (2021-11-24T16:29:03Z) - Modeling Protein Using Large-scale Pretrain Language Model [12.568452480689578]
Interdisciplinary researchers have begun to leverage deep learning methods to model large biological datasets.
Inspired by the similarity between natural language and protein sequences, we use large-scale language models to model evolutionary-scale protein sequences.
Our model can accurately capture evolution information from pretraining on evolutionary-scale individual sequences.
arXiv Detail & Related papers (2021-08-17T04:13:11Z) - Epigenetic evolution of deep convolutional models [81.21462458089142]
We build upon a previously proposed neuroevolution framework to evolve deep convolutional models.
We propose a convolutional layer layout which allows kernels of different shapes and sizes to coexist within the same layer.
The proposed layout enables the size and shape of individual kernels within a convolutional layer to be evolved with a corresponding new mutation operator.
arXiv Detail & Related papers (2021-04-12T12:45:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.