Data-driven path collective variables
- URL: http://arxiv.org/abs/2312.13868v1
- Date: Thu, 21 Dec 2023 14:07:47 GMT
- Title: Data-driven path collective variables
- Authors: Arthur France-Lanord, Hadrien Vroylandt, Mathieu Salanne, Benjamin
Rotenberg, A. Marco Saitta, Fabio Pietrucci
- Abstract summary: We propose a new method for the generation, optimization, and comparison of collective variables.
The resulting collective variable is one-dimensional, interpretable, and differentiable.
We demonstrate the validity of the method on two different applications.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identifying optimal collective variables to model transformations, using
atomic-scale simulations, is a long-standing challenge. We propose a new method
for the generation, optimization, and comparison of collective variables, which
can be thought of as a data-driven generalization of the path collective
variable concept. It consists in a kernel ridge regression of the committor
probability, which encodes a transformation's progress. The resulting
collective variable is one-dimensional, interpretable, and differentiable,
making it appropriate for enhanced sampling simulations requiring biasing. We
demonstrate the validity of the method on two different applications: a
precipitation model, and the association of Li$^+$ and F$^-$ in water. For the
former, we show that global descriptors such as the permutation invariant
vector allow one to reach an accuracy far beyond that achieved \textit{via}
simpler, more intuitive variables. For the latter, we show that information
correlated with the transformation mechanism is contained in the first
solvation shell only, and that inertial effects prevent the derivation of
optimal collective variables from the atomic positions only.
Related papers
- Unsupervised Representation Learning from Sparse Transformation Analysis [79.94858534887801]
We propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components.
Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model.
arXiv Detail & Related papers (2024-10-07T23:53:25Z)
- Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address these issues.
We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution.
Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
arXiv Detail & Related papers (2024-08-13T08:09:05Z)
- EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention [88.45459681677369]
We propose a novel transformer variant with complex vector attention, named EulerFormer.
It provides a unified theoretical framework to formulate both semantic difference and positional difference.
It is more robust to semantic variations and possesses superior theoretical properties in principle.
arXiv Detail & Related papers (2024-03-26T14:18:43Z)
- Learning Collective Variables with Synthetic Data Augmentation through Physics-Inspired Geodesic Interpolation [1.4972659820929493]
In molecular dynamics simulations, rare events, such as protein folding, are typically studied using enhanced sampling techniques.
We propose a simulation-free data augmentation strategy using physics-inspired metrics to generate geodesics resembling protein folding transitions.
arXiv Detail & Related papers (2024-02-02T16:35:02Z)
- Scalable variable selection for two-view learning tasks with projection operators [0.0]
We propose a novel variable selection method for two-view settings, or for vector-valued supervised learning problems.
Our framework is able to handle extremely large scale selection tasks, where the number of data samples can reach millions.
arXiv Detail & Related papers (2023-07-04T08:22:05Z)
- A Quadrature Rule combining Control Variates and Adaptive Importance Sampling [0.0]
We show that a simple weighted least squares approach can be used to improve the accuracy of Monte Carlo integration estimates.
Our main result is a non-asymptotic bound on the probabilistic error of the procedure.
The good behavior of the method is illustrated empirically on synthetic examples and real-world data for Bayesian linear regression.
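The weighted least squares idea summarized above can be illustrated with the standard control variate construction: regress the integrand samples on variates with known zero mean, and read the improved integral estimate off the fitted intercept. This is a generic sketch of that textbook technique, not the quadrature rule of the cited paper; the integrand and control variates below are made-up examples.

```python
import numpy as np

def ls_control_variates(f_vals, controls):
    """Monte Carlo estimate of E[f] via an ordinary least squares fit.

    controls: (n, m) array of control variates with known expectation 0.
    The intercept of the regression f ~ 1 + controls is the estimator; it
    removes the part of the Monte Carlo variance explained by the controls."""
    X = np.column_stack([np.ones(len(f_vals)), controls])
    beta, *_ = np.linalg.lstsq(X, f_vals, rcond=None)
    return beta[0]

rng = np.random.default_rng(0)
n = 2000
u = rng.random(n)
f_vals = np.exp(u)                                     # true integral: e - 1
controls = np.column_stack([u - 0.5, u**2 - 1.0 / 3.0])  # centered monomials
plain = f_vals.mean()                   # plain Monte Carlo average
improved = ls_control_variates(f_vals, controls)
```

Since the centered monomials capture most of the variation of `exp(u)` on [0, 1], the intercept estimate is typically far more accurate than the plain sample mean at the same sample size.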
arXiv Detail & Related papers (2022-05-24T08:21:45Z)
- Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression [151.3826781154146]
Modeling latent variables with priors and hyperpriors is an essential problem in variational image compression.
We find inter-correlations and intra-correlations exist when observing latent variables in a vectorized perspective.
Our model has better rate-distortion performance and an impressive $3.18\times$ compression speed up.
arXiv Detail & Related papers (2022-03-21T11:44:17Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning [84.70916463298109]
VarCLR is a new approach for learning semantic representations of variable names.
VarCLR is an excellent fit for contrastive learning, which aims to minimize the distance between explicitly similar inputs.
We show that VarCLR enables the effective application of sophisticated, general-purpose language models like BERT.
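The contrastive objective alluded to in the summary, minimizing the distance between explicitly similar inputs while pushing dissimilar ones apart, is commonly instantiated as an InfoNCE loss over a batch of embedding pairs. The sketch below shows that generic objective; it is not VarCLR's actual training code, and the temperature and batch construction are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss on a batch of embedding pairs.

    Row i of `positives` is the positive example for row i of `anchors`;
    all other rows in the batch act as in-batch negatives. Minimizing this
    pulls explicitly similar pairs together and pushes the rest apart."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature              # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_softmax)))

emb = np.eye(4)                  # four well-separated unit embeddings
aligned = info_nce_loss(emb, emb)            # near zero: pairs match
shuffled = info_nce_loss(emb, np.roll(emb, 1, axis=0))  # large: pairs mismatched
```

In practice the embeddings would come from an encoder such as BERT over variable names, and the loss would be backpropagated through it; here plain numpy arrays stand in for those embeddings.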
arXiv Detail & Related papers (2021-12-05T18:40:32Z)
- An Embedded Model Estimator for Non-Stationary Random Functions using Multiple Secondary Variables [0.0]
This paper introduces the method and shows that it has consistency results that are similar in nature to those applying to geostatistical modelling and to Quantile Random Forests.
The algorithm works by estimating a conditional distribution for the target variable at each target location.
arXiv Detail & Related papers (2020-11-09T00:14:24Z)
- Gaussianization Flows [113.79542218282282]
We propose a new type of normalizing flow model that enables both efficient iteration of likelihoods and efficient inversion for sample generation.
Because of this guaranteed expressivity, they can capture multimodal target distributions without compromising the efficiency of sample generation.
arXiv Detail & Related papers (2020-03-04T08:15:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.