The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning
- URL: http://arxiv.org/abs/2412.00568v1
- Date: Sat, 30 Nov 2024 19:42:14 GMT
- Title: The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning
- Authors: Ruben Ohana, Michael McCabe, Lucas Meyer, Rudy Morel, Fruzsina J. Agocs, Miguel Beneitez, Marsha Berger, Blakesley Burkhart, Stuart B. Dalziel, Drummond B. Fielding, Daniel Fortunato, Jared A. Goldberg, Keiya Hirashima, Yan-Fei Jiang, Rich R. Kerswell, Suryanarayana Maddu, Jonah Miller, Payel Mukhopadhyay, Stefan S. Nixon, Jeff Shen, Romain Watteaux, Bruno Régaldo-Saint Blancard, François Rozet, Liam H. Parker, Miles Cranmer, Shirley Ho,
- Abstract summary: Well is a large-scale collection of numerical simulations of a wide variety of physical systems.
These datasets can be used individually or as part of a broader benchmark suite.
We provide a unified PyTorch interface for training and evaluating models.
- Score: 4.861642399720316
- License:
- Abstract: Machine learning based surrogate models offer researchers powerful tools for accelerating simulation-based workflows. However, as standard datasets in this space often cover small classes of physical behavior, it can be difficult to evaluate the efficacy of new approaches. To address this gap, we introduce the Well: a large-scale collection of datasets containing numerical simulations of a wide variety of spatiotemporal physical systems. The Well draws from domain experts and numerical software developers to provide 15TB of data across 16 datasets covering diverse domains such as biological systems, fluid dynamics, acoustic scattering, as well as magneto-hydrodynamic simulations of extra-galactic fluids or supernova explosions. These datasets can be used individually or as part of a broader benchmark suite. To facilitate usage of the Well, we provide a unified PyTorch interface for training and evaluating models. We demonstrate the function of this library by introducing example baselines that highlight the new challenges posed by the complex dynamics of the Well. The code and data is available at https://github.com/PolymathicAI/the_well.
Related papers
- GauSim: Registering Elastic Objects into Digital World by Gaussian Simulator [55.02281855589641]
GauSim is a novel neural network-based simulator designed to capture the dynamic behaviors of real-world elastic objects represented through Gaussian kernels.
We leverage continuum mechanics, modeling each kernel as a continuous piece of matter to account for realistic deformations without idealized assumptions.
GauSim incorporates explicit physics constraints, such as mass and momentum conservation, ensuring interpretable results and robust, physically plausible simulations.
arXiv Detail & Related papers (2024-12-23T18:58:17Z) - CoDBench: A Critical Evaluation of Data-driven Models for Continuous
Dynamical Systems [8.410938527671341]
We introduce CodBench, an exhaustive benchmarking suite comprising 11 state-of-the-art data-driven models for solving differential equations.
Specifically, we evaluate 4 distinct categories of models, viz., feed forward neural networks, deep operator regression models, frequency-based neural operators, and transformer architectures.
We conduct extensive experiments, assessing the operators' capabilities in learning, zero-shot super-resolution, data efficiency, robustness to noise, and computational efficiency.
arXiv Detail & Related papers (2023-10-02T21:27:54Z) - Eagle: Large-Scale Learning of Turbulent Fluid Dynamics with Mesh
Transformers [23.589419066824306]
Estimating fluid dynamics is a notoriously hard problem to solve.
We introduce a new model, method and benchmark for the problem.
We show that our transformer outperforms state-of-the-art performance on, both, existing synthetic and real datasets.
arXiv Detail & Related papers (2023-02-16T12:59:08Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models.
The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z) - TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z) - SUPA: A Lightweight Diagnostic Simulator for Machine Learning in
Particle Physics [0.0]
SUPA is an algorithm and software package for generating data by simulating simplified particle propagation, scattering and shower development in matter.
The proposed simulator generates thousands of particle showers per second on a desktop machine, a speed up of up to 6 orders of magnitudes over Geant4.
arXiv Detail & Related papers (2022-02-10T13:14:12Z) - Deeptime: a Python library for machine learning dynamical models from
time series data [3.346668383314945]
Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data.
In this paper we introduce the main features and structure of the deeptime software.
arXiv Detail & Related papers (2021-10-28T10:53:03Z) - PREPRINT: Comparison of deep learning and hand crafted features for
mining simulation data [7.214140640112874]
This paper addresses the task of extracting meaningful results in an automated manner from high dimensional data sets.
We propose deep learning methods which are capable of processing such data and which can be trained to solve relevant tasks on simulation data.
We compile a large dataset of 2D simulations of the flow field around airfoils which contains 16000 flow fields with which we tested and compared approaches.
arXiv Detail & Related papers (2021-03-11T09:28:00Z) - Data-Efficient Learning for Complex and Real-Time Physical Problem
Solving using Augmented Simulation [49.631034790080406]
We present a task for navigating a marble to the center of a circular maze.
We present a model that learns to move a marble in the complex environment within minutes of interacting with the real system.
arXiv Detail & Related papers (2020-11-14T02:03:08Z) - Learning to Simulate Complex Physics with Graph Networks [68.43901833812448]
We present a machine learning framework and model implementation that can learn to simulate a wide variety of challenging physical domains.
Our framework---which we term "Graph Network-based Simulators" (GNS)--represents the state of a physical system with particles, expressed as nodes in a graph, and computes dynamics via learned message-passing.
Our results show that our model can generalize from single-timestep predictions with thousands of particles during training, to different initial conditions, thousands of timesteps, and at least an order of magnitude more particles at test time.
arXiv Detail & Related papers (2020-02-21T16:44:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.