Evaluating Universal Machine Learning Force Fields Against Experimental Measurements
- URL: http://arxiv.org/abs/2508.05762v1
- Date: Thu, 07 Aug 2025 18:21:39 GMT
- Title: Evaluating Universal Machine Learning Force Fields Against Experimental Measurements
- Authors: Sajid Mannan, Vaibhav Bihani, Carmelo Gonzales, Kin Long Kelvin Lee, Nitya Nand Gosvami, Sayan Ranu, Santiago Miret, N M Anoop Krishnan
- Abstract summary: Universal machine learning force fields (UMLFFs) promise to revolutionize materials science by enabling rapid atomistic simulations across the periodic table. Here, we present UniFFBench, a comprehensive framework for evaluating UMLFFs against experimental measurements of ~1,500 carefully curated mineral structures. Our systematic evaluation of six state-of-the-art UMLFFs reveals a substantial reality gap: models achieving impressive performance on computational benchmarks often fail when confronted with experimental complexity.
- Score: 15.863801293927635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Universal machine learning force fields (UMLFFs) promise to revolutionize materials science by enabling rapid atomistic simulations across the periodic table. However, their evaluation has been limited to computational benchmarks that may not reflect real-world performance. Here, we present UniFFBench, a comprehensive framework for evaluating UMLFFs against experimental measurements of ~1,500 carefully curated mineral structures spanning diverse chemical environments, bonding types, structural complexity, and elastic properties. Our systematic evaluation of six state-of-the-art UMLFFs reveals a substantial reality gap: models achieving impressive performance on computational benchmarks often fail when confronted with experimental complexity. Even the best-performing models exhibit higher density prediction error than the threshold required for practical applications. Most strikingly, we observe disconnects between simulation stability and mechanical property accuracy, with prediction errors correlating with training data representation rather than the modeling method. These findings demonstrate that while current computational benchmarks provide valuable controlled comparisons, they may overestimate model reliability when extrapolated to experimentally complex chemical spaces. Altogether, UniFFBench establishes essential experimental validation standards and reveals systematic limitations that must be addressed to achieve truly universal force field capabilities.
Related papers
- Grounding LLMs in Scientific Discovery via Embodied Actions [84.11877211907647]
Large Language Models (LLMs) have shown significant potential in scientific discovery but struggle to bridge the gap between theoretical reasoning and physical simulation. We propose EmbodiedAct, a framework that transforms established scientific software into active embodied agents by grounding LLMs in embodied actions with a tight perception-execution loop.
arXiv Detail & Related papers (2026-02-24T07:37:18Z)
- Equivariant Evidential Deep Learning for Interatomic Potentials [55.6997213490859]
Uncertainty quantification (UQ) is critical for assessing the reliability of machine learning interatomic potentials (MLIPs) in molecular dynamics simulations. Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance. We propose Equivariant Evidential Deep Learning for Interatomic Potentials ($\text{e}^2$IP), a backbone-agnostic framework that models atomic forces and their uncertainty jointly.
arXiv Detail & Related papers (2026-02-11T02:00:25Z)
- From Physics to Machine Learning and Back: Part II - Learning and Observational Bias in PHM [52.64097278841485]
This review examines how incorporating learning and observational biases through physics-informed modeling and data strategies can guide models toward physically consistent and reliable predictions. Fast adaptation methods, including meta-learning and few-shot learning, are reviewed alongside domain generalization techniques.
arXiv Detail & Related papers (2025-09-25T14:15:43Z)
- Surface Stability Modeling with Universal Machine Learning Interatomic Potentials: A Comprehensive Cleavage Energy Benchmarking Study [0.0]
Machine learning interatomic potentials (MLIPs) have revolutionized computational materials science. No systematic evaluation has assessed how well these universal MLIPs can predict cleavage energies. We present a benchmark of 19 state-of-the-art uMLIPs for cleavage energy prediction.
arXiv Detail & Related papers (2025-08-29T14:24:47Z)
- Physics-Informed Multimodal Bearing Fault Classification under Variable Operating Conditions using Transfer Learning [0.46085106405479537]
This study proposes a physics-informed multimodal convolutional neural network (CNN) with a late fusion architecture. The model incorporates a novel physics-informed loss function that penalizes physically implausible predictions. Experiments on the Paderborn University dataset demonstrate that the proposed physics-informed approach consistently outperforms a non-physics-informed baseline.
arXiv Detail & Related papers (2025-08-11T01:32:09Z)
- Towards Robust Surrogate Models: Benchmarking Machine Learning Approaches to Expediting Phase Field Simulations of Brittle Fracture [0.0]
We introduce a dataset based on PFM simulations designed to benchmark and advance ML methods for fracture modeling. This dataset includes three energy decomposition methods, two boundary conditions, and 1,000 random initial crack configurations for a total of 6,000 simulations. Our results highlight both the promise and limitations of popular current models, and demonstrate the utility of this dataset as a testbed for advancing machine learning in fracture mechanics research.
arXiv Detail & Related papers (2025-07-09T19:14:56Z)
- MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback [136.27567671480156]
We introduce experiment-guided ranking, which prioritizes hypotheses based on feedback from prior tests. We frame experiment-guided ranking as a sequential decision-making problem. Our approach significantly outperforms pre-experiment baselines and strong ablations.
arXiv Detail & Related papers (2025-05-23T13:24:50Z)
- To Use or Not to Use a Universal Force Field [1.25431689228423]
Machine learning force fields (MLFFs) have emerged as powerful tools for molecular dynamics (MD) simulations. This Perspective evaluates the viability of universal MLFFs for simulating complex materials systems.
arXiv Detail & Related papers (2025-03-11T09:23:01Z)
- GausSim: Foreseeing Reality by Gaussian Simulator for Elastic Objects [55.02281855589641]
GausSim is a novel neural network-based simulator designed to capture the dynamic behaviors of real-world elastic objects represented through Gaussian kernels. We leverage continuum mechanics and treat each kernel as a Center of Mass System (CMS) that represents a continuous piece of matter. In addition, GausSim incorporates explicit physics constraints, such as mass and momentum conservation, ensuring interpretable results and robust, physically plausible simulations.
arXiv Detail & Related papers (2024-12-23T18:58:17Z)
- Generalizability of Graph Neural Network Force Fields for Predicting Solid-State Properties [8.405078403907241]
Machine-learned force fields (MLFFs) promise to offer a computationally efficient alternative to ab initio simulations for complex molecular systems. This work investigates the ability of a graph neural network (GNN)-based MLFF to describe solid-state phenomena not explicitly included during training.
arXiv Detail & Related papers (2024-09-16T02:14:26Z)
- Fast and Reliable Probabilistic Reflectometry Inversion with Prior-Amortized Neural Posterior Estimation [73.81105275628751]
Finding all structures compatible with reflectometry data is computationally prohibitive for standard algorithms.
We address this lack of reliability with a probabilistic deep learning method that identifies all realistic structures in seconds.
Our method, Prior-Amortized Neural Posterior Estimation (PANPE), combines simulation-based inference with novel adaptive priors.
arXiv Detail & Related papers (2024-07-26T10:29:16Z)
- Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study [61.64685376882383]
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models.
This paper investigates the robustness of existing CLTR models in complex and diverse situations.
We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation.
arXiv Detail & Related papers (2024-04-04T10:54:38Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- Accurate machine learning force fields via experimental and simulation data fusion [0.0]
Machine Learning (ML)-based force fields are attracting ever-increasing interest due to their capacity to span scales of classical interatomic potentials at quantum-level accuracy.
Here we leverage both Density Functional Theory (DFT) calculations and experimentally measured mechanical properties and lattice parameters to train an ML potential of titanium.
We demonstrate that the fused data learning strategy can concurrently satisfy all target objectives, thus resulting in a molecular model of higher accuracy compared to the models trained with a single data source.
arXiv Detail & Related papers (2023-08-17T18:22:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.