Learning Collective Variables with Synthetic Data Augmentation through Physics-Inspired Geodesic Interpolation
- URL: http://arxiv.org/abs/2402.01542v4
- Date: Fri, 19 Jul 2024 17:48:10 GMT
- Title: Learning Collective Variables with Synthetic Data Augmentation through Physics-Inspired Geodesic Interpolation
- Authors: Soojung Yang, Juno Nam, Johannes C. B. Dietschreit, Rafael Gómez-Bombarelli
- Abstract summary: In molecular dynamics simulations, rare events, such as protein folding, are typically studied using enhanced sampling techniques.
We propose a simulation-free data augmentation strategy using physics-inspired metrics to generate geodesics resembling protein folding transitions.
- Score: 1.4972659820929493
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In molecular dynamics simulations, rare events, such as protein folding, are typically studied using enhanced sampling techniques, most of which are based on the definition of a collective variable (CV) along which acceleration occurs. Obtaining an expressive CV is crucial, but often hindered by the lack of information about the particular event, e.g., the transition from unfolded to folded conformation. We propose a simulation-free data augmentation strategy using physics-inspired metrics to generate geodesic interpolations resembling protein folding transitions, thereby improving sampling efficiency without true transition state samples. This new data can be used to improve the accuracy of classifier-based methods. Alternatively, a regression-based learning scheme for CV models can be adopted by leveraging the interpolation progress parameter.
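As a rough illustration of the regression idea in the abstract, the sketch below interpolates between two endpoint structures, labels each synthetic point with its interpolation progress parameter t, and fits a CV model to predict t. It is a minimal sketch under stated assumptions, not the authors' method: plain linear interpolation in a pairwise-distance feature space stands in for the physics-inspired geodesic, the CV model is a linear least-squares fit, and the coordinates are random toy data. All function names are hypothetical.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the upper-triangle interatomic distances of an (N, 3) array."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.triu_indices(len(coords), k=1)
    return dist[i, j]


def interpolate_features(start, end, n_points):
    """Linearly interpolate distance features between two conformations.

    ASSUMPTION: straight-line interpolation replaces the geodesic
    described in the abstract. Returns (t, features), where t in [0, 1]
    is the interpolation progress parameter.
    """
    t = np.linspace(0.0, 1.0, n_points)
    f0, f1 = pairwise_distances(start), pairwise_distances(end)
    features = (1.0 - t)[:, None] * f0 + t[:, None] * f1
    return t, features


def fit_linear_cv(features, t):
    """Least-squares fit of a linear CV model: cv(x) = x @ w + b."""
    design = np.hstack([features, np.ones((len(features), 1))])
    # The interpolated features are collinear, so the design matrix is
    # rank deficient; rcond truncates the numerically zero singular values.
    params, *_ = np.linalg.lstsq(design, t, rcond=1e-8)
    return params[:-1], params[-1]


rng = np.random.default_rng(0)
unfolded = rng.normal(scale=3.0, size=(5, 3))  # toy "unfolded" coordinates
folded = rng.normal(scale=1.0, size=(5, 3))    # toy "folded" coordinates

# Synthetic augmentation: 11 points along the path, each labeled with t.
t, feats = interpolate_features(unfolded, folded, n_points=11)
w, b = fit_linear_cv(feats, t)
cv = feats @ w + b  # CV values predicted for the synthetic points
```

Here the regression target is the progress parameter itself, so no transition state samples from simulation are needed to supervise the CV. Interpolated distance vectors of this kind need not correspond to valid 3D geometries, which is one reason a more careful interpolation scheme would be used in practice.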
Related papers
- Data-driven path collective variables [0.0]
We propose a new method for the generation, optimization, and comparison of collective variables.
The resulting collective variable is one-dimensional, interpretable, and differentiable.
We demonstrate the validity of the method on two different applications.
arXiv Detail & Related papers (2023-12-21T14:07:47Z)
- Gradient-Based Feature Learning under Structured Data [57.76552698981579]
In the anisotropic setting, the commonly used spherical gradient dynamics may fail to recover the true direction.
We show that appropriate weight normalization that is reminiscent of batch normalization can alleviate this issue.
In particular, under the spiked model with a suitably large spike, the sample complexity of gradient-based training can be made independent of the information exponent.
arXiv Detail & Related papers (2023-09-07T16:55:50Z)
- Dynamic Kernel-Based Adaptive Spatial Aggregation for Learned Image Compression [63.56922682378755]
We focus on extending spatial aggregation capability and propose a dynamic kernel-based transform coding.
The proposed adaptive aggregation generates kernel offsets to capture valid information in the content-conditioned range to help transform.
Experimental results demonstrate that our method achieves superior rate-distortion performance on three benchmarks compared to the state-of-the-art learning-based methods.
arXiv Detail & Related papers (2023-08-17T01:34:51Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
- Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
- Reweighted Manifold Learning of Collective Variables from Enhanced Sampling Simulations [2.6009298669020477]
We provide a framework based on anisotropic diffusion maps for manifold learning.
We show that our framework reverts the biasing effect, yielding CVs that correctly describe the equilibrium density.
We show that it can be used in many manifold learning techniques on data from both standard and enhanced sampling simulations.
arXiv Detail & Related papers (2022-07-29T08:59:56Z)
- GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
A promising solution is to use generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z)
- Scalable nonparametric Bayesian learning for heterogeneous and dynamic velocity fields [8.744017403796406]
We develop a model for learning heterogeneous and dynamic patterns of velocity field data.
We show the effectiveness of our techniques on the NGSIM dataset of complex multi-vehicle interactions.
arXiv Detail & Related papers (2021-02-15T17:45:46Z)
- Efficient Characterization of Dynamic Response Variation Using Multi-Fidelity Data Fusion through Composite Neural Network [9.446974144044733]
We take advantage of the multi-level response prediction opportunity in structural dynamic analysis.
We formulate a composite neural network fusion approach that can fully utilize the multi-level, heterogeneous datasets obtained.
arXiv Detail & Related papers (2020-05-07T02:44:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.