Physics-Learning AI Datamodel (PLAID) datasets: a collection of physics simulations for machine learning
- URL: http://arxiv.org/abs/2505.02974v2
- Date: Thu, 08 May 2025 12:58:22 GMT
- Title: Physics-Learning AI Datamodel (PLAID) datasets: a collection of physics simulations for machine learning
- Authors: Fabien Casenave, Xavier Roynard, Brian Staber, William Piat, Michele Alessandro Bucci, Nissrine Akkari, Abbas Kabalan, Xuan Minh Vuong Nguyen, Luca Saverio, Raphaël Carpintero Perez, Anthony Kalaydjian, Samy Fouché, Thierry Gonon, Ghassan Najjar, Emmanuel Menier, Matthieu Nastorg, Giovanni Catalani, Christian Rey,
- Abstract summary: PLAID is a framework for representing and sharing datasets of physics simulations. PLAID defines a unified standard for describing simulation data. We release six datasets under the PLAID standard, covering structural mechanics and computational fluid dynamics.
- Score: 0.15469999759898032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning-based surrogate models have emerged as a powerful tool to accelerate simulation-driven scientific workflows. However, their widespread adoption is hindered by the lack of large-scale, diverse, and standardized datasets tailored to physics-based simulations. While existing initiatives provide valuable contributions, many are limited in scope, focusing on specific physics domains, relying on fragmented tooling, or adhering to overly simplistic datamodels that restrict generalization. To address these limitations, we introduce PLAID (Physics-Learning AI Datamodel), a flexible and extensible framework for representing and sharing datasets of physics simulations. PLAID defines a unified standard for describing simulation data and is accompanied by a library for creating, reading, and manipulating complex datasets across a wide range of physical use cases (gitlab.com/drti/plaid). We release six carefully crafted datasets under the PLAID standard, covering structural mechanics and computational fluid dynamics, and provide baseline benchmarks using representative learning methods. Benchmarking tools are made available on Hugging Face, enabling direct participation by the community and contribution to ongoing evaluation efforts (huggingface.co/PLAIDcompetitions).
Related papers
- GausSim: Foreseeing Reality by Gaussian Simulator for Elastic Objects [55.02281855589641]
GausSim is a novel neural network-based simulator designed to capture the dynamic behaviors of real-world elastic objects represented through Gaussian kernels. We leverage continuum mechanics and treat each kernel as a Center of Mass System (CMS) that represents a continuous piece of matter. In addition, GausSim incorporates explicit physics constraints, such as mass and momentum conservation, ensuring interpretable results and robust, physically plausible simulations.
arXiv Detail & Related papers (2024-12-23T18:58:17Z)
- The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning [4.812580392361432]
The Well is a large-scale collection of numerical simulations of a wide variety of physical systems. These datasets can be used individually or as part of a broader benchmark suite. We provide a unified PyTorch interface for training and evaluating models.
arXiv Detail & Related papers (2024-11-30T19:42:14Z)
- MBDS: A Multi-Body Dynamics Simulation Dataset for Graph Networks Simulators [4.5353840616537555]
Graph Network Simulators (GNS) have emerged as the leading method for modeling physical phenomena.
We have constructed a high-quality physical simulation dataset encompassing 1D, 2D, and 3D scenes.
A key feature of our dataset is the inclusion of precise multi-body dynamics, facilitating a more realistic simulation of the physical world.
arXiv Detail & Related papers (2024-10-04T03:03:06Z)
- A Benchmark Time Series Dataset for Semiconductor Fabrication Manufacturing Constructed using Component-based Discrete-Event Simulation Models [0.0]
This research is based on a benchmark model of an Intel semiconductor fabrication factory.
The time series dataset is constructed using discrete-event time trajectories.
The dataset can also be utilized in the machine learning community for behavioral analysis.
arXiv Detail & Related papers (2024-08-17T23:05:47Z)
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- Modular machine learning-based elastoplasticity: generalization in the context of limited data [0.0]
We discuss a hybrid framework that can work on a variable amount of data by relying on the modularity of the elastoplasticity formulation.
The discovered material models are found to not only interpolate well but also allow for accurate extrapolation in a thermodynamically consistent manner far outside the domain of the training data.
arXiv Detail & Related papers (2022-10-15T17:35:23Z)
- Bridging the Gap to Real-World Object-Centric Learning [66.55867830853803]
We show that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way.
Our approach, DINOSAUR, significantly outperforms existing object-centric learning models on simulated data.
arXiv Detail & Related papers (2022-09-29T15:24:47Z)
- Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models.
The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Using Deep Learning to Explore Local Physical Similarity for Global-scale Bridging in Thermal-hydraulic Simulation [4.350727579753697]
Current system thermal-hydraulic codes have limited credibility in simulating real plant conditions.
This paper proposes a data-driven approach, Feature Similarity Measurement (FSM), to overcome these difficulties.
Deep learning is applied to construct and explore the relationship between the local physical features and simulation errors.
arXiv Detail & Related papers (2020-01-06T20:14:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.