On The Finetuning of MLIPs Through the Lens of Iterated Maps With BPTT
- URL: http://arxiv.org/abs/2512.01067v1
- Date: Sun, 30 Nov 2025 20:34:37 GMT
- Title: On The Finetuning of MLIPs Through the Lens of Iterated Maps With BPTT
- Authors: Evan Dramko, Yizhi Zhu, Aleksandar Krivokapic, Geoffroy Hautier, Thomas Reps, Christopher Jermaine, Anastasios Kyrillidis
- Abstract summary: Traditional approaches to training MLIPs for structural relaxations involve training models to reproduce first-principles computed forces. We propose a fine-tuning method to be used on a pretrained MLIP in which we create a fully-differentiable end-to-end simulation loop. We show that this method achieves substantial performance gains when applied to pretrained models.
- Score: 40.134801761022324
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vital to the creation of advanced materials is performing structural relaxations. Traditional approaches built on physics-derived first-principles calculations are computationally expensive, motivating the creation of machine-learning interatomic potentials (MLIPs). Traditional approaches to training MLIPs for structural relaxations involve training models to faithfully reproduce first-principles computed forces. We propose a fine-tuning method to be used on a pretrained MLIP in which we create a fully-differentiable end-to-end simulation loop that optimizes the predicted final structures directly. Trajectories are unrolled and gradients are tracked through the entire relaxation. We show that this method achieves substantial performance gains when applied to pretrained models, leading to a nearly $50\%$ reduction in test error across the sample datasets. Interestingly, we show the process is robust to substantial variation in the relaxation setup, achieving negligibly different results across varied hyperparameter and procedural modifications. Experimental results indicate this is due to a ``preference'' of BPTT to modify the MLIP rather than the other trainable parameters. Of particular interest to practitioners is that this approach lowers the data requirements for producing an effective domain-specific MLIP, addressing a common bottleneck in practical deployment.
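The abstract describes unrolling a relaxation trajectory driven by MLIP-predicted forces and backpropagating a loss on the final structure through every step (BPTT). The sketch below illustrates that idea only; the tiny MLP standing in for the MLIP, the gradient-descent relaxer, the toy two-atom system, and the quadratic structure loss are all illustrative assumptions, not the authors' models, optimizer, or data.

```python
import torch

torch.manual_seed(0)

# Stand-in "MLIP": a tiny MLP mapping flattened atomic positions to a scalar energy.
mlip = torch.nn.Sequential(
    torch.nn.Linear(6, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def forces(x):
    # Forces are the negative gradient of the predicted energy with respect
    # to positions; create_graph=True keeps this step itself differentiable,
    # which is what lets gradients flow back through the whole trajectory.
    energy = mlip(x.reshape(-1)).sum()
    return -torch.autograd.grad(energy, x, create_graph=True)[0]

def relax(x0, steps=20, lr=0.05):
    # Unrolled relaxation loop: the computation graph spans all steps.
    x = x0.clone().requires_grad_(True)
    for _ in range(steps):
        x = x + lr * forces(x)  # simple gradient-descent relaxation step
    return x

x0 = torch.randn(2, 3)        # toy system: 2 atoms in 3D
x_target = torch.zeros(2, 3)  # hypothetical reference relaxed structure

opt = torch.optim.Adam(mlip.parameters(), lr=1e-3)
for _ in range(5):
    opt.zero_grad()
    # Loss is placed on the *final* structure, not on per-step forces.
    loss = ((relax(x0) - x_target) ** 2).mean()
    loss.backward()           # BPTT through the entire unrolled relaxation
    opt.step()
```

Note that only the MLIP parameters are optimized here, consistent with the paper's observation that BPTT "prefers" to modify the MLIP over other trainable parameters of the loop.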
Related papers
- LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization [7.8639568562295965]
We consider small-data, large-scale decision problems in which a firm must make many operational decisions simultaneously. We propose a pretrain-then-finetune approach built on a designed Transformer model to address this challenge.
arXiv Detail & Related papers (2026-02-03T16:08:33Z) - A Machine Learning Approach to Generate Residual Stress Distributions using Sparse Characterization Data in Friction-Stir Processed Parts [0.0]
Residual stresses, which remain within a component after processing, can deteriorate performance. This work proposes a machine learning (ML) based Residual Stress Generator (RSG) to infer full-field stresses from limited measurements.
arXiv Detail & Related papers (2025-06-09T20:26:57Z) - Self-Refining Training for Amortized Density Functional Theory [5.5541132320126945]
We propose a novel method that reduces the dependency of amortized DFT solvers on large pre-collected datasets by introducing a self-refining training strategy. We derive our method as a minimization of the variational upper bound on the KL-divergence measuring the discrepancy between the generated samples and the target Boltzmann distribution defined by the ground state energy.
arXiv Detail & Related papers (2025-06-02T00:32:32Z) - RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [53.571195477043496]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE). RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers. Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration.
arXiv Detail & Related papers (2025-02-13T06:44:33Z) - Provable Meta-Learning with Low-Rank Adaptations [37.120226706944926]
We introduce a framework for generic PEFT-based meta-learning to learn a model that can easily adapt to unseen tasks. For linear models using LoRA, we show that standard retraining is provably suboptimal for finding an adaptable set of parameters. We verify these theoretical insights through experiments on synthetic data as well as real-data vision and language tasks.
arXiv Detail & Related papers (2024-10-29T17:24:18Z) - Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models [68.23649978697027]
Forecast-PEFT is a fine-tuning strategy that freezes the majority of the model's parameters, focusing adjustments on newly introduced prompts and adapters.
Our experiments show that Forecast-PEFT outperforms traditional full fine-tuning methods in motion prediction tasks.
Forecast-FT further improves prediction performance, evidencing up to a 9.6% enhancement over conventional baseline methods.
arXiv Detail & Related papers (2024-07-28T19:18:59Z) - An Emulator for Fine-Tuning Large Language Models using Small Language Models [91.02498576056057]
We introduce emulated fine-tuning (EFT), a principled and practical method for sampling from a distribution that approximates the result of pre-training and fine-tuning at different scales.
We show that EFT enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training.
Finally, a special case of emulated fine-tuning, which we call LM up-scaling, avoids resource-intensive fine-tuning of large pre-trained models by ensembling them with small fine-tuned models.
arXiv Detail & Related papers (2023-10-19T17:57:16Z) - Real-time simulation of viscoelastic tissue behavior with physics-guided deep learning [0.8250374560598492]
We propose a deep learning method for predicting displacement fields of soft tissues with viscoelastic properties.
The proposed method achieves better accuracy than conventional CNN models.
It is hoped that the present investigation will help fill the gap in applying deep learning to virtual reality.
arXiv Detail & Related papers (2023-01-11T18:17:10Z) - Transfer learning driven design optimization for inertial confinement fusion [0.0]
Transfer learning is a promising approach to creating predictive models that incorporate simulation and experimental data into a common framework.
We demonstrate that this method is more efficient at optimizing designs than traditional model calibration techniques.
arXiv Detail & Related papers (2022-05-26T17:38:57Z) - Learning to predict metal deformations in hot-rolling processes [59.00006390882099]
Hot-rolling is a metal forming process that produces a desired cross-section from an input stock through a sequence of deformations.
In current practice, the rolling sequence and the geometry of the rolls must be chosen to achieve a given cross-section.
We propose a supervised learning approach to predict the cross-section produced by a set of rolls with given geometry.
arXiv Detail & Related papers (2020-07-22T13:33:44Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.