Embed and Emulate: Contrastive representations for simulation-based inference
- URL: http://arxiv.org/abs/2409.18402v1
- Date: Fri, 27 Sep 2024 02:37:01 GMT
- Title: Embed and Emulate: Contrastive representations for simulation-based inference
- Authors: Ruoxi Jiang, Peter Y. Lu, Rebecca Willett
- Abstract summary: This paper introduces Embed and Emulate (E&E), a new simulation-based inference (SBI) method based on contrastive learning.
E&E learns a low-dimensional latent embedding of the data and a corresponding fast emulator in the latent space.
We demonstrate superior performance over existing methods in a realistic, non-identifiable parameter estimation task.
- Score: 11.543221890134399
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scientific modeling and engineering applications rely heavily on parameter estimation methods to fit physical models and calibrate numerical simulations using real-world measurements. In the absence of analytic statistical models with tractable likelihoods, modern simulation-based inference (SBI) methods first use a numerical simulator to generate a dataset of parameters and simulated outputs. This dataset is then used to approximate the likelihood and estimate the system parameters given observation data. Several SBI methods employ machine learning emulators to accelerate data generation and parameter estimation. However, applying these approaches to high-dimensional physical systems remains challenging due to the cost and complexity of training high-dimensional emulators. This paper introduces Embed and Emulate (E&E): a new SBI method based on contrastive learning that efficiently handles high-dimensional data and complex, multimodal parameter posteriors. E&E learns a low-dimensional latent embedding of the data (i.e., a summary statistic) and a corresponding fast emulator in the latent space, eliminating the need to run expensive simulations or a high-dimensional emulator during inference. We illustrate the theoretical properties of the learned latent space through a synthetic experiment and demonstrate superior performance over existing methods in a realistic, non-identifiable parameter estimation task using the high-dimensional, chaotic Lorenz 96 system.
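The abstract describes pairing a data embedding with a latent-space emulator through contrastive learning. The sketch below illustrates one common way such a pairing can be trained, a symmetric InfoNCE objective; it is an assumption for illustration, not the loss used in the paper. The toy simulator, the linear embedders `W_f` and `W_g`, and the temperature value are all hypothetical stand-ins.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean norm."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(data_emb, param_emb, temperature=0.1):
    """Generic symmetric InfoNCE loss (assumed here, not taken from the paper).

    Row i of data_emb and row i of param_emb form a positive pair;
    every other row in the batch serves as a negative.
    """
    z_x = l2_normalize(data_emb)
    z_t = l2_normalize(param_emb)
    logits = z_x @ z_t.T / temperature  # scaled cosine similarities
    idx = np.arange(logits.shape[0])
    loss_x_to_t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t_to_x = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_x_to_t + loss_t_to_x)


# Toy setup: all shapes and maps below are hypothetical.
rng = np.random.default_rng(0)
params = rng.normal(size=(8, 3))                        # 8 parameter draws, 3-dimensional
data = np.tanh(params @ rng.normal(size=(3, 50)))       # stand-in for simulator outputs
W_f = rng.normal(size=(50, 4))                          # untrained data embedder
W_g = rng.normal(size=(3, 4))                           # untrained latent emulator
loss = symmetric_info_nce(data @ W_f, params @ W_g)
print(float(loss))
```

In a trained system the two maps would be neural networks optimized to reduce this loss, so that inference can compare an observation's embedding against emulator outputs for candidate parameters without running the simulator.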
Related papers
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- Addressing computational challenges in physical system simulations with machine learning [0.0]
We present a machine learning-based data generator framework tailored to aid researchers who utilize simulations to examine various physical systems or processes.
Our approach involves a two-step process: first, we train a supervised predictive model using a limited simulated dataset to predict simulation outcomes.
Subsequently, a reinforcement learning agent is trained to generate accurate, simulation-like data by leveraging the supervised model.
arXiv Detail & Related papers (2023-05-16T17:31:50Z)
- Embed and Emulate: Learning to estimate parameters of dynamical systems with uncertainty quantification [11.353411236854582]
This paper explores learning emulators for parameter estimation with uncertainty estimation of high-dimensional dynamical systems.
Our task is to accurately estimate a range of likely values of the underlying parameters.
On a coupled 396-dimensional multiscale Lorenz 96 system, our method significantly outperforms a typical parameter estimation method.
arXiv Detail & Related papers (2022-11-03T01:59:20Z)
- Neural Posterior Estimation with Differentiable Simulators [58.720142291102135]
We present a new method to perform Neural Posterior Estimation (NPE) with a differentiable simulator.
We demonstrate how gradient information helps constrain the shape of the posterior and improves sample-efficiency.
arXiv Detail & Related papers (2022-07-12T16:08:04Z)
- Synthetic Data-Based Simulators for Recommender Systems: A Survey [55.60116686945561]
This survey aims at providing a comprehensive overview of the recent trends in the field of modeling and simulation.
We start with the motivation behind the development of frameworks implementing the simulations -- simulators.
We provide a new consistent classification of existing simulators based on their functionality, approbation, and industrial effectiveness.
arXiv Detail & Related papers (2022-06-22T19:33:21Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- Deep Bayesian Active Learning for Accelerating Stochastic Simulation [74.58219903138301]
Interactive Neural Process (INP) is a deep active learning framework for accelerating stochastic simulations.
For active learning, we propose a novel acquisition function, Latent Information Gain (LIG), calculated in the latent space of NP based models.
The results demonstrate that STNP outperforms the baselines in the learning setting and that LIG achieves state-of-the-art performance for active learning.
arXiv Detail & Related papers (2021-06-05T01:31:51Z)
- Simulation-based inference methods for particle physics [12.451050883955071]
We explain why the likelihood function of high-dimensional LHC data cannot be explicitly evaluated, why this matters for data analysis, and reframe what the field has traditionally done to circumvent this problem.
We then review new simulation-based inference methods that let us directly analyze high-dimensional data by combining machine learning techniques and information from the simulator.
arXiv Detail & Related papers (2020-10-13T14:55:28Z)
- Using Machine Learning to Emulate Agent-Based Simulations [0.0]
We evaluate the performance of multiple machine-learning methods as statistical emulators for use in the analysis of agent-based models (ABMs).
We propose that agent-based modelling would benefit from using machine-learning methods for emulation, as this can facilitate more robust sensitivity analyses for the models.
arXiv Detail & Related papers (2020-05-05T11:48:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.