Iterative Pretraining Framework for Interatomic Potentials
- URL: http://arxiv.org/abs/2507.20118v1
- Date: Sun, 27 Jul 2025 03:59:41 GMT
- Title: Iterative Pretraining Framework for Interatomic Potentials
- Authors: Taoyong Cui, Zhongyao Wang, Dongzhan Zhou, Yuqiang Li, Lei Bai, Wanli Ouyang, Mao Su, Shufei Zhang
- Abstract summary: We propose Iterative Pretraining for Interatomic Potentials (IPIP) to improve the predictive performance of MLIP models. IPIP incorporates a forgetting mechanism to prevent iterative training from converging to suboptimal local minima. Compared to general-purpose force fields, this approach achieves over an 80% reduction in prediction error and up to a 4x speedup in the challenging Mo-S-O system.
- Score: 46.53683458224917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning interatomic potentials (MLIPs) enable efficient molecular dynamics (MD) simulations with ab initio accuracy and have been applied across various domains in physical science. However, their performance often relies on large-scale labeled training data. While existing pretraining strategies can improve model performance, they often suffer from a mismatch between the objectives of pretraining and downstream tasks or rely on extensive labeled datasets and increasingly complex architectures to achieve broad generalization. To address these challenges, we propose Iterative Pretraining for Interatomic Potentials (IPIP), a framework designed to iteratively improve the predictive performance of MLIP models. IPIP incorporates a forgetting mechanism to prevent iterative training from converging to suboptimal local minima. Unlike general-purpose foundation models, which frequently underperform on specialized tasks due to a trade-off between generality and system-specific accuracy, IPIP achieves higher accuracy and efficiency using lightweight architectures. Compared to general-purpose force fields, this approach achieves over 80% reduction in prediction error and up to 4x speedup in the challenging Mo-S-O system, enabling fast and accurate simulations.
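The listing does not include code, so the following is only a minimal sketch of one plausible reading of the iterative loop described in the abstract: the current model samples configurations, a subset is relabeled ab initio, part of the weights is reinitialized (the "forgetting" step), and the model is retrained. All helper names (run_md_sampling, label_with_ab_initio, forget) and the partial-reinitialization rule are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of an iterative pretraining loop with a "forgetting" step.
# run_md_sampling and label_with_ab_initio are hypothetical stand-ins for an
# MD engine and an ab initio labeler; here they return random tensors so the
# loop is runnable end to end.
import torch
import torch.nn as nn

def run_md_sampling(model, n_frames=64, n_atoms=8):
    # Placeholder: would run MD with the current MLIP and return new configurations.
    return torch.randn(n_frames, n_atoms, 3)

def label_with_ab_initio(configs):
    # Placeholder: would call a DFT code; returns fake energies here.
    return torch.randn(configs.shape[0])

class TinyMLIP(nn.Module):
    def __init__(self, n_atoms=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_atoms * 3, hidden), nn.SiLU(), nn.Linear(hidden, 1)
        )

    def forward(self, configs):
        return self.net(configs.flatten(1)).squeeze(-1)

def forget(model, fraction=0.3):
    # One possible "forgetting" mechanism: reinitialize a random subset of
    # parameters so later iterations can escape poor local minima.
    for p in model.parameters():
        mask = torch.rand_like(p) < fraction
        p.data[mask] = torch.randn_like(p)[mask] * 0.02

model = TinyMLIP()
for iteration in range(3):
    configs = run_md_sampling(model)           # explore with the current model
    energies = label_with_ab_initio(configs)   # relabel sampled data ab initio
    forget(model)                              # partial reinitialization
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(200):                       # retrain on the refreshed data
        loss = nn.functional.mse_loss(model(configs), energies)
        opt.zero_grad(); loss.backward(); opt.step()
    print(f"iter {iteration}: loss {loss.item():.4f}")
```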
Related papers
- U-PINet: End-to-End Hierarchical Physics-Informed Learning With Sparse Graph Coupling for 3D EM Scattering Modeling [28.64166932076228]
Electromagnetic (EM) scattering modeling is critical for radar remote sensing. Traditional numerical solvers offer high accuracy, but suffer from scalability issues and substantial computational costs. We propose a U-shaped Physics-Informed Network (U-PINet) to overcome these limitations.
arXiv Detail & Related papers (2025-08-05T12:20:42Z) - PMNO: A novel physics guided multi-step neural operator predictor for partial differential equations [23.04840527974364]
We propose a novel physics guided multi-step neural operator (PMNO) architecture to address challenges in long-horizon prediction of complex physical systems. The PMNO framework replaces the single-step input with multi-step historical data in the forward pass and introduces an implicit time-stepping scheme during backpropagation. We demonstrate the superior predictive performance of the PMNO predictor across a diverse range of physical systems.
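As a rough illustration of the multi-step-input idea only (the implicit time-stepping scheme in the backward pass is not reproduced), a sketch with assumed dimensions:

```python
# Minimal sketch of a multi-step predictor: the network maps a window of k
# past states to the next state instead of conditioning on a single step.
import torch
import torch.nn as nn

class MultiStepPredictor(nn.Module):
    def __init__(self, state_dim=32, k_steps=4, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim * k_steps, hidden), nn.GELU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, history):            # history: (batch, k_steps, state_dim)
        return self.net(history.flatten(1))

model = MultiStepPredictor()
history = torch.randn(16, 4, 32)           # last 4 states of a discretized field
next_state = model(history)                # a rollout would append and repeat
print(next_state.shape)                    # torch.Size([16, 32])
```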
arXiv Detail & Related papers (2025-06-02T12:33:50Z) - DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
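A minimal sketch of sigmoid gating over FFN blocks with a straight-through estimator, assuming illustrative block sizes and a 0.5 gating threshold (not the paper's exact configuration):

```python
# Hedged sketch of sigmoid-gated expert blocks with a straight-through
# estimator (STE): hard 0/1 gates in the forward pass, sigmoid gradients in
# the backward pass.
import torch
import torch.nn as nn

class SigmoidSTEGate(nn.Module):
    def forward(self, logits, threshold=0.5):
        soft = torch.sigmoid(logits)
        hard = (soft > threshold).float()
        # Straight-through: forward uses hard gates, backward sees the sigmoid.
        return hard + soft - soft.detach()

class BlockSparseFFN(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff // n_blocks), nn.GELU(),
                          nn.Linear(d_ff // n_blocks, d_model))
            for _ in range(n_blocks)
        )
        self.router = nn.Linear(d_model, n_blocks)
        self.gate = SigmoidSTEGate()

    def forward(self, x):                        # x: (batch, seq, d_model)
        gates = self.gate(self.router(x))        # (batch, seq, n_blocks)
        out = torch.zeros_like(x)
        for i, block in enumerate(self.blocks):
            out = out + gates[..., i:i + 1] * block(x)
        return out

x = torch.randn(2, 8, 64)
print(BlockSparseFFN()(x).shape)                 # torch.Size([2, 8, 64])
```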
arXiv Detail & Related papers (2025-02-18T02:37:26Z) - A Multi-Fidelity Graph U-Net Model for Accelerated Physics Simulations [1.2430809884830318]
We propose a novel GNN architecture, Multi-Fidelity U-Net, that utilizes the advantages of multi-fidelity methods to enhance the performance of the GNN model. We show that the proposed approach performs significantly better in accuracy and data requirements. We also present Multi-Fidelity U-Net Lite, a faster version of the proposed architecture, with 35% faster training and a 2 to 5% reduction in accuracy.
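As a generic multi-fidelity training pattern only (the paper's Graph U-Net is replaced by a plain MLP stand-in, and the dataset sizes are invented):

```python
# Generic multi-fidelity sketch: fit on plentiful low-fidelity labels first,
# then fine-tune on a small high-fidelity set with a lower learning rate.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))

def fit(model, x, y, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

x_lo, y_lo = torch.randn(5000, 10), torch.randn(5000, 1)   # cheap solver
x_hi, y_hi = torch.randn(100, 10), torch.randn(100, 1)     # expensive solver
fit(model, x_lo, y_lo, epochs=200, lr=1e-3)                # pretrain
fit(model, x_hi, y_hi, epochs=100, lr=1e-4)                # fine-tune
```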
arXiv Detail & Related papers (2024-12-19T20:09:38Z) - The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains [4.340917737559795]
We study scaling in Neural Network Interatomic Potentials (NNIPs).
NNIPs act as surrogate models for ab initio quantum mechanical calculations.
We develop an NNIP architecture designed for scaling: the Efficiently Scaled Attention Interatomic Potential (EScAIP).
arXiv Detail & Related papers (2024-10-31T17:35:57Z) - Physics-Informed Weakly Supervised Learning for Interatomic Potentials [17.165117198519248]
We introduce a physics-informed, weakly supervised approach for training machine-learned interatomic potentials (MLIPs). We demonstrate reduced energy and force errors, often lower by a factor of two, for various baseline models and benchmark data sets. Our approach improves the fine-tuning of foundation models on sparse, highly accurate ab initio data.
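The paper's specific weakly supervised losses are not reproduced here; the sketch below only shows the standard physics relation such objectives build on, namely forces obtained as the negative gradient of a predicted energy, plus one example of a label-free consistency penalty:

```python
# Forces from a learned energy, plus an illustrative label-free penalty.
import torch
import torch.nn as nn

energy_model = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 1))

positions = torch.randn(10, 3, requires_grad=True)     # 10 atoms (toy system)
energy = energy_model(positions).sum()                 # total energy (toy model)
forces = -torch.autograd.grad(energy, positions, create_graph=True)[0]

# A consistency-style penalty can be imposed even without force labels,
# e.g. encouraging the net force on an isolated system to vanish.
net_force_penalty = forces.sum(dim=0).pow(2).sum()
print(float(net_force_penalty))
```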
arXiv Detail & Related papers (2024-07-23T12:49:04Z) - Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives.
We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis.
We find performance on MAD (mechanistic architecture design) synthetic tasks to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z) - Deep learning enhanced mixed integer optimization: Learning to reduce model dimensionality [0.0]
This work introduces a framework to address the computational complexity inherent in Mixed-Integer Programming.
By employing deep learning, we construct problem-specific models that identify and exploit common structures across MIP instances.
We present an algorithm for generating synthetic data that enhances the robustness and generalizability of our models.
arXiv Detail & Related papers (2024-01-17T19:15:13Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both inference accuracy and mean squared error without requiring additional training data.
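A minimal sketch of the shared-backbone, multi-head pattern, with invented dimensions and a simple averaging ensemble (the paper's exact head design and combination rule may differ):

```python
# Shared backbone feeding several prediction heads whose outputs are ensembled.
import torch
import torch.nn as nn

class MultiHeadEnsemble(nn.Module):
    def __init__(self, in_dim=16, hidden=64, out_dim=4, n_heads=3):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, out_dim) for _ in range(n_heads))

    def forward(self, x):
        h = self.backbone(x)
        preds = torch.stack([head(h) for head in self.heads], dim=0)
        return preds.mean(dim=0)             # simple average over heads

x = torch.randn(8, 16)                       # e.g. task/channel state features
print(MultiHeadEnsemble()(x).shape)          # torch.Size([8, 4])
```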
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Adversarial Self-Attention for Language Understanding [89.265747130584]
This paper proposes an Adversarial Self-Attention mechanism (ASA).
ASA adversarially reconstructs the Transformer attentions and facilitates model training from contaminated model structures.
For fine-tuning, ASA-empowered models consistently outperform naive models by a large margin in both generalization and robustness.
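A generic illustration of adversarially perturbing attention scores, not the paper's exact ASA procedure; the additive bias, step size, and toy task are all assumptions:

```python
# Toy adversarial step on an additive attention bias, then a model update
# under that worst-case bias.
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
head = nn.Linear(32, 2)
x = torch.randn(8, 10, 32)
labels = torch.randint(0, 2, (8,))

delta = torch.zeros(10, 10, requires_grad=True)      # additive attention bias

def loss_fn(bias):
    out, _ = attn(x, x, x, attn_mask=bias)            # float mask is added to scores
    return nn.functional.cross_entropy(head(out.mean(dim=1)), labels)

# Inner step: move the bias in the direction that hurts the model most.
adv_loss = loss_fn(delta)
grad, = torch.autograd.grad(adv_loss, delta)
delta_adv = (delta + 0.1 * grad.sign()).detach()

# Outer step: update model parameters under the adversarial bias.
opt = torch.optim.Adam(list(attn.parameters()) + list(head.parameters()), lr=1e-3)
loss = loss_fn(delta_adv)
opt.zero_grad(); loss.backward(); opt.step()
```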
arXiv Detail & Related papers (2022-06-25T09:18:10Z) - Predictive Coding Approximates Backprop along Arbitrary Computation Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
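A minimal predictive coding sketch for a two-weight linear model, assuming the standard formulation (relax hidden activities to minimize prediction errors, then apply local weight updates); it is not taken from the paper's code:

```python
# Predictive coding: inference relaxes hidden activity, learning is local.
import torch

torch.manual_seed(0)
W1 = torch.randn(20, 10) * 0.1       # input -> hidden prediction weights
W2 = torch.randn(5, 20) * 0.1        # hidden -> output prediction weights

x0 = torch.randn(10)                 # clamped input
x2 = torch.randn(5)                  # clamped target (supervised setting)
x1 = W1 @ x0                         # initialize hidden activity at its prediction

for _ in range(50):                  # inference: relax the hidden activity
    e1 = x1 - W1 @ x0                # prediction error at the hidden layer
    e2 = x2 - W2 @ x1                # prediction error at the output layer
    x1 = x1 - 0.1 * (e1 - W2.T @ e2)

# Local learning: each weight matrix is updated from its own layer's error.
lr = 0.01
e1 = x1 - W1 @ x0
e2 = x2 - W2 @ x1
W1 = W1 + lr * torch.outer(e1, x0)
W2 = W2 + lr * torch.outer(e2, x1)
print(float(e2.pow(2).sum()))
```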
arXiv Detail & Related papers (2020-06-07T15:35:47Z)