Reweighted Manifold Learning of Collective Variables from Enhanced Sampling Simulations
- URL: http://arxiv.org/abs/2207.14554v2
- Date: Wed, 3 Apr 2024 10:39:21 GMT
- Title: Reweighted Manifold Learning of Collective Variables from Enhanced Sampling Simulations
- Authors: Jakub Rydzewski, Ming Chen, Tushar K. Ghosh, Omar Valsson,
- Abstract summary: We provide a framework based on anisotropic diffusion maps for manifold learning.
We show that our framework reverts the biasing effect yielding CVs that correctly describe the equilibrium density.
We show that it can be used in many manifold learning techniques on data from both standard and enhanced sampling simulations.
- Score: 2.6009298669020477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Enhanced sampling methods are indispensable in computational physics and chemistry, where atomistic simulations cannot exhaustively sample the high-dimensional configuration space of dynamical systems due to the sampling problem. A class of such enhanced sampling methods works by identifying a few slow degrees of freedom, termed collective variables (CVs), and enhancing the sampling along these CVs. Selecting CVs to analyze and drive the sampling is not trivial and often relies on physical and chemical intuition. Despite routinely circumventing this issue using manifold learning to estimate CVs directly from standard simulations, such methods cannot provide mappings to a low-dimensional manifold from enhanced sampling simulations as the geometry and density of the learned manifold are biased. Here, we address this crucial issue and provide a general reweighting framework based on anisotropic diffusion maps for manifold learning that takes into account that the learning data set is sampled from a biased probability distribution. We consider manifold learning methods based on constructing a Markov chain describing transition probabilities between high-dimensional samples. We show that our framework reverts the biasing effect yielding CVs that correctly describe the equilibrium density. This advancement enables the construction of low-dimensional CVs using manifold learning directly from data generated by enhanced sampling simulations. We call our framework reweighted manifold learning. We show that it can be used in many manifold learning techniques on data from both standard and enhanced sampling simulations.
Related papers
- On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z) - Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference ( SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Learning Collective Variables with Synthetic Data Augmentation through Physics-Inspired Geodesic Interpolation [1.4972659820929493]
In molecular dynamics simulations, rare events, such as protein folding, are typically studied using enhanced sampling techniques.
We propose a simulation-free data augmentation strategy using physics-inspired metrics to generate geodesics resembling protein folding transitions.
arXiv Detail & Related papers (2024-02-02T16:35:02Z) - Unbiasing Enhanced Sampling on a High-dimensional Free Energy Surface
with Deep Generative Model [8.733395226793029]
We propose an unbiasing method based on the score-based diffusion model.
We test the score-based diffusion unbiasing method on TAMD simulations.
arXiv Detail & Related papers (2023-12-14T23:53:34Z) - Score-based generative models learn manifold-like structures with
constrained mixing [2.843124313496295]
How do score-based generative models learn the data distribution supported on a low-dimensional manifold?
We investigate the score model of a trained SBM through its linear approximations and subspaces spanned by local feature vectors.
We find that the learned vector field mixes samples by a non-conservative field within the manifold, although it denoises with normal projections as if there is an energy function in off-manifold directions.
arXiv Detail & Related papers (2023-11-16T15:15:15Z) - Manifold Learning with Sparse Regularised Optimal Transport [0.17205106391379024]
Real-world datasets are subject to noisy observations and sampling, so that distilling information about the underlying manifold is a major challenge.
We propose a method for manifold learning that utilises a symmetric version of optimal transport with a quadratic regularisation.
We prove that the resulting kernel is consistent with a Laplace-type operator in the continuous limit, establish robustness to heteroskedastic noise and exhibit these results in simulations.
arXiv Detail & Related papers (2023-07-19T08:05:46Z) - Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z) - Unsupervised Learning of Sampling Distributions for Particle Filters [80.6716888175925]
We put forward four methods for learning sampling distributions from observed measurements.
Experiments demonstrate that learned sampling distributions exhibit better performance than designed, minimum-degeneracy sampling distributions.
arXiv Detail & Related papers (2023-02-02T15:50:21Z) - Nonlinear Isometric Manifold Learning for Injective Normalizing Flows [58.720142291102135]
We use isometries to separate manifold learning and density estimation.
We also employ autoencoders to design embeddings with explicit inverses that do not distort the probability distribution.
arXiv Detail & Related papers (2022-03-08T08:57:43Z) - Sampling in Combinatorial Spaces with SurVAE Flow Augmented MCMC [83.48593305367523]
Hybrid Monte Carlo is a powerful Markov Chain Monte Carlo method for sampling from complex continuous distributions.
We introduce a new approach based on augmenting Monte Carlo methods with SurVAE Flows to sample from discrete distributions.
We demonstrate the efficacy of our algorithm on a range of examples from statistics, computational physics and machine learning, and observe improvements compared to alternative algorithms.
arXiv Detail & Related papers (2021-02-04T02:21:08Z) - Flows for simultaneous manifold learning and density estimation [12.451050883955071]
manifold-learning flows (M-flows) represent datasets with a manifold structure more faithfully.
M-flows learn the data manifold and allow for better inference than standard flows in the ambient data space.
arXiv Detail & Related papers (2020-03-31T02:07:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.