Lifelong Machine Learning Potentials
- URL: http://arxiv.org/abs/2303.05911v2
- Date: Sun, 4 Jun 2023 15:57:19 GMT
- Title: Lifelong Machine Learning Potentials
- Authors: Marco Eckhoff and Markus Reiher
- Abstract summary: We introduce element-embracing atom-centered symmetry functions (eeACSFs) which combine structural properties and element information from the periodic table.
We apply continual learning strategies to enable autonomous and on-the-fly training on a continuous stream of new data.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning potentials (MLPs) trained on accurate quantum chemical data
can retain high accuracy while incurring only small computational demands. On
the downside, they need to be trained for each individual system. In recent
years, a vast number of MLPs have been trained from scratch because learning
additional data typically requires retraining on all data to avoid forgetting
previously acquired knowledge. Additionally, the most common structural
descriptors of MLPs cannot efficiently represent a large number of different
chemical elements. In this work, we tackle these problems by introducing
element-embracing atom-centered symmetry functions (eeACSFs), which combine
structural properties and element information from the periodic table. These
eeACSFs are key to our development of a lifelong machine learning potential
(lMLP). Uncertainty quantification can be exploited to move beyond a fixed,
pre-trained MLP and arrive at a continuously adapting lMLP, because a predefined
level of accuracy can be ensured. To extend the applicability of an lMLP to new
systems, we apply continual learning strategies that enable autonomous and
on-the-fly training on a continuous stream of new data. For the training of
deep neural networks, we propose the continual resilient (CoRe) optimizer and
incremental learning strategies relying on rehearsal of data, regularization of
parameters, and the architecture of the model.
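The uncertainty-gated, rehearsal-based training loop described in the abstract can be illustrated with a minimal sketch. Everything below is a toy stand-in, not the paper's method: linear models replace the deep neural networks, plain gradient descent replaces the CoRe optimizer, random vectors replace eeACSF descriptors, and the threshold value is arbitrary. The point is only the control flow: a committee's disagreement serves as the uncertainty estimate, and structures predicted with high uncertainty are labeled on the fly and added to a rehearsal buffer on which all committee members retrain.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_ensemble(n_models, n_features):
    # A committee of toy linear "potentials"; members differ by initialization.
    return [rng.normal(size=n_features) for _ in range(n_models)]

def predict(ensemble, x):
    # Mean prediction and committee disagreement (the uncertainty proxy).
    preds = np.array([w @ x for w in ensemble])
    return preds.mean(), preds.std()

def fit_member(w, X, y, lr=0.05, epochs=200):
    # Plain gradient descent on the squared error over the rehearsal buffer.
    for _ in range(epochs):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

true_w = np.array([1.5, -2.0, 0.5])     # stand-in for the reference QC method

ensemble = make_ensemble(n_models=4, n_features=3)
buffer_X, buffer_y = [], []             # rehearsal buffer of labeled structures
threshold = 0.5                         # illustrative uncertainty threshold

for step in range(300):                 # continuous stream of new structures
    x = rng.normal(size=3)              # descriptor vector of a new structure
    mean, std = predict(ensemble, x)
    if std > threshold:                 # too uncertain: label it on the fly
        buffer_X.append(x)
        buffer_y.append(true_w @ x)     # "run the QC calculation"
        X, y = np.array(buffer_X), np.array(buffer_y)
        ensemble = [fit_member(w, X, y) for w in ensemble]
```

Once the committee agrees to within the threshold, no further reference calculations are requested, so the lMLP keeps a predefined accuracy level while adapting only where the data stream demands it.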
Related papers
- Reparameterized LLM Training via Orthogonal Equivalence Transformation [54.80172809738605]
We present POET, a novel training algorithm that uses Orthogonal Equivalence Transformation to optimize neurons. POET can stably optimize the objective function with improved generalization. We develop efficient approximations that make POET flexible and scalable for training large-scale neural networks.
arXiv Detail & Related papers (2025-06-09T17:59:34Z)
- Ensemble Knowledge Distillation for Machine Learning Interatomic Potentials [34.82692226532414]
Machine learning interatomic potentials (MLIPs) are a promising tool to accelerate atomistic simulations and molecular property prediction.
The quality of MLIPs depends on the quantity of available training data as well as the quantum chemistry (QC) level of theory used to generate that data.
We present an ensemble knowledge distillation (EKD) method to improve MLIP accuracy when trained to energy-only datasets.
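A minimal reading of ensemble knowledge distillation for potentials can be sketched as follows (a toy 1-D example; the models, polynomial degrees, and data are illustrative, not taken from the paper): an ensemble of teachers fits the energy-only data, and a smaller student is then trained on the teachers' averaged predictions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Energy-only training data for a 1-D toy coordinate (true curve: E = x^2)
x_train = np.linspace(-1.0, 1.0, 8)
e_train = x_train**2 + 0.05 * rng.normal(size=x_train.size)

# Teachers: an ensemble of cubic fits, each trained on a random data subset
teachers = []
for _ in range(5):
    idx = rng.choice(x_train.size, size=6, replace=False)
    teachers.append(np.polyfit(x_train[idx], e_train[idx], deg=3))

# Distillation targets: ensemble-averaged energies on a denser grid
x_dense = np.linspace(-1.0, 1.0, 50)
e_soft = np.mean([np.polyval(c, x_dense) for c in teachers], axis=0)

# Student: a smaller model trained on the teachers' soft targets
student = np.polyfit(x_dense, e_soft, deg=2)
```

Averaging over the ensemble smooths out noise from the individual fits, so the student sees cleaner (and denser) targets than the original energy-only dataset provides.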
arXiv Detail & Related papers (2025-03-18T14:32:51Z)
- PIMRL: Physics-Informed Multi-Scale Recurrent Learning for Spatiotemporal Prediction [9.294766192549249]
The PIMRL framework embeds physical knowledge into neural networks via pretraining and adopts a data-driven approach to learn.
PIMRL consistently achieves state-of-the-art performance across five benchmark datasets ranging from one to three dimensions.
arXiv Detail & Related papers (2025-03-13T11:01:03Z)
- Enhancing Machine Learning Potentials through Transfer Learning across Chemical Elements [0.0]
Machine Learning Potentials (MLPs) can enable simulations of ab initio accuracy at orders of magnitude lower computational cost.
Here, we introduce transfer learning of potential energy surfaces between chemically similar elements.
We demonstrate that transfer learning surpasses traditional training from scratch in force prediction, leading to more stable simulations and improved temperature transferability.
arXiv Detail & Related papers (2025-02-19T08:20:54Z)
- What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z)
- Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems.
Current value-based RL methods have proven particularly challenging to scale to the setting of large language models.
We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
arXiv Detail & Related papers (2024-11-07T21:36:52Z)
- Physics-Informed Weakly Supervised Learning for Interatomic Potentials [17.165117198519248]
We introduce a physics-informed, weakly supervised approach for training machine-learned interatomic potentials.
We demonstrate reduced energy and force errors -- often lower by a factor of two -- for various baseline models and benchmark data sets.
arXiv Detail & Related papers (2024-07-23T12:49:04Z)
- Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction [53.88231294380083]
We introduce a novel Multi-Epoch learning with Data Augmentation (MEDA) framework, suitable for both non-continual and continual learning scenarios.
MEDA minimizes overfitting by reducing the dependency of the embedding layer on subsequent training data.
Our findings confirm that pre-trained layers can adapt to new embedding spaces, enhancing performance without overfitting.
arXiv Detail & Related papers (2024-06-27T04:00:15Z)
- On the Interplay of Subset Selection and Informed Graph Neural Networks [3.091456764812509]
This work focuses on predicting the molecules' atomization energies in the QM9 dataset.
We show how maximizing molecular diversity in the training set selection process increases the robustness of linear and nonlinear regression techniques.
We also check the reliability of the predictions made by the graph neural network with a model-agnostic explainer.
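One common way to maximize diversity in training-set selection, shown here purely as an illustration (the paper may use a different criterion), is greedy farthest-point sampling over molecular descriptors:

```python
import numpy as np

rng = np.random.default_rng(2)

def farthest_point_sampling(X, k):
    # Greedy diversity maximization: repeatedly select the structure whose
    # descriptor lies farthest from everything already selected.
    selected = [0]                               # arbitrary starting point
    d = np.linalg.norm(X - X[0], axis=1)         # distance to selected set
    while len(selected) < k:
        i = int(np.argmax(d))
        selected.append(i)
        d = np.minimum(d, np.linalg.norm(X - X[i], axis=1))
    return selected

# Toy molecular descriptors: two dense clusters plus a small outlier group
X = np.vstack([
    rng.normal([0.0, 0.0, 0.0], 0.1, (50, 3)),
    rng.normal([5.0, 5.0, 5.0], 0.1, (50, 3)),
    rng.normal([0.0, 5.0, 0.0], 0.1, (10, 3)),
])
subset = farthest_point_sampling(X, k=6)
```

Because the criterion always reaches for the most distant remaining point, even the small outlier group is represented in the selected subset, which is what makes the downstream regression more robust.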
arXiv Detail & Related papers (2023-06-15T09:09:27Z)
- Decouple knowledge from parameters for plug-and-play language modeling [77.5601135412186]
We introduce PlugLM, a pre-training model with differentiable plug-in memory (DPM).
The key intuition is to decouple the knowledge storage from model parameters with an editable and scalable key-value memory.
PlugLM obtains 3.95 F1 improvements across four domains on average without any in-domain pre-training.
arXiv Detail & Related papers (2023-05-19T10:01:55Z)
- Self-learning locally-optimal hypertuning using maximum entropy, and comparison of machine learning approaches for estimating fatigue life in composite materials [0.0]
We develop an ML nearest-neighbors-alike algorithm based on the principle of maximum entropy to predict fatigue damage.
The predictions achieve a good level of accuracy, similar to other ML algorithms.
arXiv Detail & Related papers (2022-10-19T12:20:07Z)
- Knowledge Inheritance for Pre-trained Language Models [57.51305807391381]
We introduce a novel pre-training framework named "knowledge inheritance" (KI).
KI combines both self-learning and teacher-guided learning to efficiently train larger PLMs.
We show that KI can well support lifelong learning and knowledge transfer.
arXiv Detail & Related papers (2021-05-28T14:43:26Z)
- Automated discovery of a robust interatomic potential for aluminum [4.6028828826414925]
Machine learning (ML) based potentials aim for faithful emulation of quantum mechanics (QM) calculations at drastically reduced computational cost.
We present a highly automated approach to dataset construction using the principles of active learning (AL).
We demonstrate this approach by building an ML potential for aluminum (ANI-Al).
To demonstrate transferability, we perform a 1.3M atom shock simulation, and show that ANI-Al predictions agree very well with DFT calculations on local atomic environments sampled from the nonequilibrium dynamics.
arXiv Detail & Related papers (2020-03-10T19:06:32Z)
- Multilinear Compressive Learning with Prior Knowledge [106.12874293597754]
The Multilinear Compressive Learning (MCL) framework combines Multilinear Compressive Sensing and Machine Learning into an end-to-end system.
The key idea behind MCL is the assumption that there exists a tensor subspace which can capture the essential features from the signal for the downstream learning task.
In this paper, we propose a novel solution to address both of the aforementioned requirements, i.e., how to find tensor subspaces in which the signals of interest are highly separable.
arXiv Detail & Related papers (2020-02-17T19:06:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.