Amortized Sampling with Transferable Normalizing Flows
- URL: http://arxiv.org/abs/2508.18175v1
- Date: Mon, 25 Aug 2025 16:28:18 GMT
- Title: Amortized Sampling with Transferable Normalizing Flows
- Authors: Charlie B. Tan, Majdi Hassan, Leon Klein, Saifuddin Syed, Dominique Beaini, Michael M. Bronstein, Alexander Tong, Kirill Neklyudov
- Abstract summary: Prose is a transferable normalizing flow trained on a corpus of peptide molecular dynamics trajectories up to 8 residues in length. We show that Prose serves as an effective proposal for a variety of sampling algorithms, and find that a simple importance sampling-based finetuning procedure achieves superior performance to established methods on unseen tetrapeptides. We open-source the Prose codebase, model weights, and training dataset to further stimulate research into amortized sampling methods and finetuning objectives.
- Score: 65.48838168417564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient equilibrium sampling of molecular conformations remains a core challenge in computational chemistry and statistical inference. Classical approaches such as molecular dynamics or Markov chain Monte Carlo inherently lack amortization; the computational cost of sampling must be paid in full for each system of interest. The widespread success of generative models has inspired interest in overcoming this limitation through learning sampling algorithms. Despite performing on par with conventional methods when trained on a single system, learned samplers have so far demonstrated limited ability to transfer across systems. We prove that deep learning enables the design of scalable and transferable samplers by introducing Prose, a 280 million parameter all-atom transferable normalizing flow trained on a corpus of peptide molecular dynamics trajectories up to 8 residues in length. Prose draws zero-shot uncorrelated proposal samples for arbitrary peptide systems, achieving the previously intractable transferability across sequence length, whilst retaining the efficient likelihood evaluation of normalizing flows. Through extensive empirical evaluation we demonstrate the efficacy of Prose as a proposal for a variety of sampling algorithms, finding a simple importance sampling-based finetuning procedure to achieve superior performance to established methods such as sequential Monte Carlo on unseen tetrapeptides. We open-source the Prose codebase, model weights, and training dataset, to further stimulate research into amortized sampling methods and finetuning objectives.
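The abstract describes using a model with tractable likelihood as a proposal for importance sampling. As a rough illustration of that general idea only, the sketch below runs self-normalized importance sampling on a toy problem. Everything in it is an assumption made for illustration: a fixed Gaussian stands in for the flow, the target is a hypothetical 1D double-well density, and none of the names come from the Prose codebase.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a toy 1D double-well target (hypothetical)."""
    return -(x * x - 1.0) ** 2 / 0.5


def log_proposal(x, sigma=1.5):
    """Exact log-density of a zero-mean Gaussian, standing in for a flow's likelihood."""
    return -0.5 * (x / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def snis(f, n=20000, sigma=1.5, seed=0):
    """Self-normalized importance sampling estimate of E_target[f], plus the ESS."""
    rng = random.Random(seed)
    # Draw independent proposal samples.
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    # Log importance weights: unnormalized target over proposal density.
    log_w = [log_target(x) - log_proposal(x, sigma) for x in xs]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    estimate = sum(wi * f(x) for wi, x in zip(w, xs)) / total
    # Effective sample size: (sum w)^2 / sum w^2.
    ess = total * total / sum(wi * wi for wi in w)
    return estimate, ess


mean_est, ess = snis(lambda x: x)
second_moment, _ = snis(lambda x: x * x)
```

The effective sample size indicates how well the proposal covers the target: the closer it is to the number of samples drawn, the less the weights have degenerated.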
Related papers
- TFTF: Training-Free Targeted Flow for Conditional Sampling [1.4151684142137693]
We propose a training-free conditional sampling method for flow matching models based on importance sampling.
Because a naive application of importance sampling suffers from weight degeneracy in high-dimensional settings, we modify and incorporate a resampling technique from sequential Monte Carlo.
Our framework requires no additional training, while providing theoretical guarantees of accuracy.
arXiv Detail & Related papers (2026-02-13T13:41:35Z)
- Reinforced sequential Monte Carlo for amortised sampling [49.92678178064033]
We state a connection between sequential Monte Carlo (SMC) and neural sequential samplers trained by maximum-entropy reinforcement learning (MaxEnt RL).
We describe techniques for stable joint training of proposals and twist functions and an adaptive weight tempering scheme to reduce training signal variance.
arXiv Detail & Related papers (2025-10-13T17:59:11Z)
- FORT: Forward-Only Regression Training of Normalizing Flows [85.66894616735752]
We revisit classical normalizing flows as one-step generative models with exact likelihoods.
We propose a novel, scalable training objective that does not require computing the expensive change of variable formula used in conventional maximum likelihood training.
arXiv Detail & Related papers (2025-06-01T20:32:27Z)
- Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching [33.9461078261722]
We introduce Adjoint Sampling, a highly scalable and efficient algorithm for learning diffusion processes that sample from unnormalized densities.
We show how to incorporate key symmetries, as well as periodic boundary conditions, for modeling molecules in both Cartesian and torsional coordinates.
We demonstrate the effectiveness of our approach through extensive experiments on classical energy functions, and further scale up to neural network-based energy models.
arXiv Detail & Related papers (2025-04-16T02:20:06Z)
- Neural Flow Samplers with Shortcut Models [19.81513273510523]
Continuous flow-based neural samplers offer a promising approach to generate samples from unnormalized densities.
We introduce an improved estimator for these challenging quantities, employing a velocity-driven Sequential Monte Carlo method.
Our proposed Neural Flow Shortcut Sampler empirically outperforms existing flow-based neural samplers on both synthetic datasets and complex n-body system targets.
arXiv Detail & Related papers (2025-02-11T07:55:41Z)
- Adaptive teachers for amortized samplers [76.88721198565861]
We propose an adaptive training distribution (the teacher) to guide the training of the primary amortized sampler (the student).
We validate the effectiveness of this approach in a synthetic environment designed to present an exploration challenge.
arXiv Detail & Related papers (2024-10-02T11:33:13Z)
- Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
- Sampling in Combinatorial Spaces with SurVAE Flow Augmented MCMC [83.48593305367523]
Hybrid Monte Carlo is a powerful Markov Chain Monte Carlo method for sampling from complex continuous distributions.
We introduce a new approach based on augmenting Monte Carlo methods with SurVAE Flows to sample from discrete distributions.
We demonstrate the efficacy of our algorithm on a range of examples from statistics, computational physics and machine learning, and observe improvements compared to alternative algorithms.
arXiv Detail & Related papers (2021-02-04T02:21:08Z)
- Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z)
- Stochastic Normalizing Flows [2.323220706791067]
We show that normalizing flows can be used to learn the transformation of a simple prior distribution.
We derive an efficient training procedure by which both the sampler's and the flow's parameters can be optimized end-to-end.
We illustrate the representational power, sampling efficiency and correctness of SNFs on several benchmarks including applications to molecular sampling systems in equilibrium.
arXiv Detail & Related papers (2020-02-16T23:29:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.