Deep Generative Models of Evolution: SNP-level Population Adaptation by Genomic Linkage Incorporation
- URL: http://arxiv.org/abs/2507.20644v1
- Date: Mon, 28 Jul 2025 09:03:09 GMT
- Title: Deep Generative Models of Evolution: SNP-level Population Adaptation by Genomic Linkage Incorporation
- Authors: Julia Siekiera, Christian Schlötterer, Stefan Kramer,
- Abstract summary: We introduce a deep generative neural network that aims to model a concept of evolution based on empirical observations over time.<n>The proposed model estimates the distribution of allele frequency trajectories by embedding the observations from single nucleotide polymorphisms.<n> Evaluation on simulated E&R experiments demonstrates the model's ability to capture the distribution of allele frequency trajectories.
- Score: 2.6524539020042663
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The investigation of allele frequency trajectories in populations evolving under controlled environmental pressures has become a popular approach to study evolutionary processes on the molecular level. Statistical models based on well-defined evolutionary concepts can be used to validate different hypotheses about empirical observations. Despite their popularity, classic statistical models like the Wright-Fisher model suffer from simplified assumptions such as the independence of selected loci along a chromosome and uncertainty about the parameters. Deep generative neural networks offer a powerful alternative known for the integration of multivariate dependencies and noise reduction. Due to their high data demands and challenging interpretability they have, so far, not been widely considered in the area of population genomics. To address the challenges in the area of Evolve and Resequencing experiments (E&R) based on pooled sequencing (Pool-Seq) data, we introduce a deep generative neural network that aims to model a concept of evolution based on empirical observations over time. The proposed model estimates the distribution of allele frequency trajectories by embedding the observations from single nucleotide polymorphisms (SNPs) with information from neighboring loci. Evaluation on simulated E&R experiments demonstrates the model's ability to capture the distribution of allele frequency trajectories and illustrates the representational power of deep generative models on the example of linkage disequilibrium (LD) estimation. Inspecting the internally learned representations enables estimating pairwise LD, which is typically inaccessible in Pool-Seq data. Our model provides competitive LD estimation in Pool-Seq data high degree of LD when compared to existing methods.
Related papers
- Multi-population GAN Training: Analyzing Co-Evolutionary Algorithms [0.3683202928838613]
Generative adversarial networks (GANs) are powerful generative models but remain challenging to train due to pathologies such as mode collapse and instability.<n>Recent research has explored co-evolutionary approaches, in which populations of generators and discriminators are evolved.<n>This paper presents an empirical analysis of different coevolutionary GAN training strategies, focusing on the impact of selection and replacement mechanisms.
arXiv Detail & Related papers (2025-07-17T14:19:17Z) - Spiking Neural Models for Decision-Making Tasks with Learning [0.29998889086656577]
We propose a biologically plausible Spiking Neural Network (SNN) model for decision-making that incorporates a learning mechanism.<n>This work provides a significant step toward integrating biologically relevant neural mechanisms into cognitive models.
arXiv Detail & Related papers (2025-06-10T09:19:40Z) - Brain-like variational inference [0.0]
Inference in both brains and machines can be formalized by maximizing the evidence lower bound (ELBO) in machine learning, or minimizing variational free energy (F) in neuroscience (ELBO = -F)<n>Here, we show that online natural gradient descent on F, under Poisson assumptions, leads to a recurrent spiking neural network that performs variational inference via membrane potential dynamics.<n>The resulting model -- the iterative Poisson variational autoencoder (iP-VAE) -- replaces the encoder network with local updates derived from natural gradient descent on F.
arXiv Detail & Related papers (2024-10-25T06:00:18Z) - Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold [83.18058549195855]
We argue that multiple processes in natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities.<n>In particular, this is crucial for personalized medicine where the development of diseases and their respective treatment response depend on the microenvironment of cells specific to each patient.<n>We propose Meta Flow Matching (MFM), a practical approach to integrate along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations.
arXiv Detail & Related papers (2024-08-26T20:05:31Z) - Latent Variable Sequence Identification for Cognitive Models with Neural Network Estimators [7.7227297059345466]
We present an approach that extends neural Bayes estimation to learn a direct mapping between experimental data and the targeted latent variable space.<n>Our work underscores that combining recurrent neural networks and simulation-based inference to identify latent variable sequences can enable researchers to access a wider class of cognitive models.
arXiv Detail & Related papers (2024-06-20T21:13:39Z) - Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference ( SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Synthetic location trajectory generation using categorical diffusion
models [50.809683239937584]
Diffusion models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data.
We propose using DPMs for the generation of synthetic individual location trajectories (ILTs) which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z) - The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z) - DeepCOVIDNet: An Interpretable Deep Learning Model for Predictive
Surveillance of COVID-19 Using Heterogeneous Features and their Interactions [2.30238915794052]
We propose a deep learning model to forecast the range of increase in COVID-19 infected cases in future days.
Using data collected from various sources, we estimate the range of increase in infected cases seven days into the future for all U.S. counties.
arXiv Detail & Related papers (2020-07-31T23:37:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.