All-in-one foundational models learning across quantum chemical levels
- URL: http://arxiv.org/abs/2409.12015v1
- Date: Wed, 18 Sep 2024 14:29:14 GMT
- Title: All-in-one foundational models learning across quantum chemical levels
- Authors: Yuxinxin Chen, Pavlo O. Dral
- Abstract summary: We introduce the all-in-one (AIO) ANI model architecture based on multimodal learning.
Our all-in-one learning approach offers a more general and easier-to-use alternative to transfer learning.
We show that the AIO-ANI model can learn across different QC levels ranging from semi-empirical to density functional theory to coupled cluster.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Machine learning (ML) potentials typically target a single quantum chemical (QC) level, while ML models developed for multi-fidelity learning have not been shown to provide scalable solutions for foundational models. Here we introduce the all-in-one (AIO) ANI model architecture, based on multimodal learning, which can learn an arbitrary number of QC levels. Our all-in-one learning approach offers a more general and easier-to-use alternative to transfer learning. We use it to train the AIO-ANI-UIP foundational model with generalization capability comparable to semi-empirical GFN2-xTB and to DFT with a double-zeta basis set for organic molecules. We show that the AIO-ANI model can learn across different QC levels ranging from semi-empirical to density functional theory to coupled cluster. We also use AIO models to design the foundational model Δ-AIO-ANI, based on Δ-learning, with increased accuracy and robustness compared to AIO-ANI-UIP. The code and the foundational models are available at https://github.com/dralgroup/aio-ani; they will be integrated into the universal and updatable AI-enhanced QM (UAIQM) library and made available in the MLatom package so that they can be used online on the XACS cloud computing platform (see https://github.com/dralgroup/mlatom for updates).
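To make the two ideas in the abstract concrete, below is a minimal, hypothetical PyTorch sketch: a single network with a shared trunk and one output head per QC level (the all-in-one idea), plus a small correction network illustrating Δ-learning on top of a cheap baseline. All names, layer sizes, and level labels are illustrative assumptions; this is not the authors' AIO-ANI implementation, which is available at the repository linked above.

```python
# Minimal sketch (not the authors' code): one shared descriptor network with
# a separate output head per quantum chemical (QC) level, so a single model
# can be trained on data labelled at several levels at once.
import torch
import torch.nn as nn

class AIOPotential(nn.Module):
    def __init__(self, n_features: int, levels: list):
        super().__init__()
        # Shared trunk: maps atomic-environment descriptors to a latent vector.
        self.trunk = nn.Sequential(
            nn.Linear(n_features, 128), nn.SiLU(),
            nn.Linear(128, 128), nn.SiLU(),
        )
        # One small head per QC level (e.g. GFN2-xTB, DFT, CCSD(T)).
        self.heads = nn.ModuleDict({lvl: nn.Linear(128, 1) for lvl in levels})

    def forward(self, x: torch.Tensor, level: str) -> torch.Tensor:
        # Energy prediction at the requested QC level.
        return self.heads[level](self.trunk(x)).squeeze(-1)

model = AIOPotential(n_features=64, levels=["xtb", "dft", "cc"])
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Training step on a batch labelled at one level: only that level's head
# (plus the shared trunk) receives gradients, so all levels share features.
x, y = torch.randn(8, 64), torch.randn(8)   # fake descriptors and energies
opt.zero_grad()
loss = nn.functional.mse_loss(model(x, "dft"), y)
loss.backward()
opt.step()

# Delta-learning sketch: a second network learns only the difference between
# a cheap baseline level and an expensive target level (cf. Δ-AIO-ANI).
correction = nn.Sequential(nn.Linear(64, 64), nn.SiLU(), nn.Linear(64, 1))

def delta_predict(x: torch.Tensor, e_baseline: torch.Tensor) -> torch.Tensor:
    # High-level estimate = baseline energy + learned ML correction.
    return e_baseline + correction(x).squeeze(-1)

e_cc_estimate = delta_predict(x, e_baseline=model(x, "xtb"))
```

The appeal of the multi-head layout is that scarce high-level data (e.g. coupled cluster) benefits from features learned on abundant low-level data, which is the sense in which all-in-one learning replaces transfer learning.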
Related papers
- xLAM: A Family of Large Action Models to Empower AI Agent Systems
We release xLAM, a series of large action models designed for AI agent tasks.
xLAM consistently delivers exceptional performance across multiple agent ability benchmarks.
arXiv Detail & Related papers (2024-09-05T03:22:22Z)
- FeNNol: an Efficient and Flexible Library for Building Force-field-enhanced Neural Network Potentials
We present FeNNol, a new library for building, training and running force-field-enhanced neural network potentials.
It provides a flexible and modular system for building hybrid models (a generic sketch of such a force-field-plus-ML energy follows this list).
It is demonstrated with the popular ANI-2x model, reaching simulation speeds nearly on par with the AMOEBA polarizable force field.
arXiv Detail & Related papers (2024-05-02T17:25:32Z)
- Enhancing One-Shot Federated Learning Through Data and Ensemble Co-Boosting
One-shot Federated Learning (OFL) has become a promising learning paradigm, enabling the training of a global server model via a single communication round.
We introduce a novel framework, Co-Boosting, in which synthesized data and the ensemble model progressively enhance each other.
arXiv Detail & Related papers (2024-02-23T03:15:10Z)
- ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
ESPnet-SPK is a toolkit for training speaker embedding extractors.
We provide several models, ranging from x-vector to the recent SKA-TDNN.
We also aspire to bridge developed models with other domains.
arXiv Detail & Related papers (2024-01-30T18:18:27Z)
- Agglomerative Federated Learning: Empowering Larger Model Training via End-Edge-Cloud Collaboration
Agglomerative Federated Learning (FedAgg) is a novel EECC-empowered FL framework that allows the trained models to grow larger in size and stronger in generalization ability from end to edge to cloud.
FedAgg outperforms state-of-the-art methods by an average accuracy gain of 4.53% and shows remarkable improvements in convergence rate.
arXiv Detail & Related papers (2023-12-01T06:18:45Z)
- Model Share AI: An Integrated Toolkit for Collaborative Machine Learning Model Development, Provenance Tracking, and Deployment in Python
We introduce Model Share AI (AIMS), an easy-to-use MLOps platform designed to streamline collaborative model development, model provenance tracking, and model deployment.
AIMS features collaborative project spaces and a standardized model evaluation process that ranks model submissions based on their performance on unseen evaluation data.
AIMS allows users to deploy ML models built in Scikit-Learn, Keras, PyTorch, and ONNX into live REST APIs and automatically generated web apps.
arXiv Detail & Related papers (2023-09-27T15:24:39Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter
A growing number of applications based on visual foundation models are emerging.
In situations involving system upgrades, it becomes essential to re-train all downstream modules to adapt to the new foundation model.
We introduce a parameter-efficient and task-agnostic adapter, dubbed TaCA, that facilitates compatibility across distinct foundation models.
arXiv Detail & Related papers (2023-06-22T03:00:24Z)
- Neural Attentive Circuits
We introduce a general-purpose, yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
- Smaller World Models for Reinforcement Learning
We propose a new neural network architecture for world models based on a vector quantized-variational autoencoder (VQ-VAE); a sketch of the quantization step appears after this list.
A model-free PPO agent is trained purely on simulated experience from the world model.
We show that we reach performance comparable to the SimPLe algorithm, while our model is significantly smaller.
arXiv Detail & Related papers (2020-10-12T15:02:41Z)
- Ensemble Distillation for Robust Model Fusion in Federated Learning
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most current training schemes, the central model is refined by averaging the server model's parameters with the updated parameters from the clients.
We propose ensemble distillation for model fusion, i.e., training the central classifier on unlabeled data against the outputs of the client models (a minimal sketch follows this list).
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
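For the FeNNol entry above, here is a generic sketch of what a "force-field-enhanced" neural network potential means: the total energy combines a physics-based term with a learned neural-network correction. This is illustrative only; it is not FeNNol's API, and the class and method names are invented for the example.

```python
# Generic sketch of a force-field-enhanced neural network potential
# (illustrative, not FeNNol's actual API): E_total = E_physical + E_NN.
import torch
import torch.nn as nn

class HybridPotential(nn.Module):
    def __init__(self, n_features: int):
        super().__init__()
        # Learned correction on top of the classical term.
        self.nn_term = nn.Sequential(
            nn.Linear(n_features, 64), nn.SiLU(), nn.Linear(64, 1)
        )

    def physical_term(self, r: torch.Tensor) -> torch.Tensor:
        # Toy pairwise repulsion as a stand-in for a classical force field.
        return (1.0 / r.clamp(min=0.5) ** 12).sum(dim=-1)

    def forward(self, descriptors: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
        # Total energy: physics-based baseline plus neural correction.
        return self.physical_term(r) + self.nn_term(descriptors).squeeze(-1)
```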
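For the world-models entry above, a minimal sketch of the vector-quantization step at the heart of a VQ-VAE; the paper's full architecture (encoder, decoder, codebook and commitment losses, and the PPO agent) is omitted here, and all names are illustrative.

```python
# Sketch of the vector-quantization step in a VQ-VAE world model
# (illustrative): each latent vector is snapped to its nearest codebook
# entry, with a straight-through estimator so gradients reach the encoder.
import torch

def quantize(z: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    # z: (batch, dim); codebook: (n_codes, dim)
    dists = torch.cdist(z, codebook)   # pairwise L2 distances
    idx = dists.argmin(dim=-1)         # nearest code per latent vector
    z_q = codebook[idx]
    # Straight-through: forward pass uses z_q, backward flows through z.
    return z + (z_q - z).detach()

codebook = torch.randn(512, 32)   # 512 codes of dimension 32 (assumed sizes)
z = torch.randn(8, 32)            # encoder outputs for a batch
z_q = quantize(z, codebook)       # discrete latents for the world model
```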
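For the ensemble-distillation entry above, a hedged sketch of the mechanism its summary describes: the frozen client models' averaged soft predictions on unlabeled data serve as targets, and the central (server) model is trained to match them with a distillation loss. Function and variable names are illustrative assumptions, not the paper's code.

```python
# Sketch of ensemble distillation for model fusion (illustrative): the server
# model is trained on unlabeled data to match the averaged soft predictions
# of the client models.
import torch
import torch.nn.functional as F

def distill_step(server, clients, x_unlabeled, optimizer, T: float = 2.0):
    # Average the clients' softened output distributions (the "ensemble").
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(c(x_unlabeled) / T, dim=-1) for c in clients]
        )
        target = probs.mean(dim=0)
    # KL divergence between the server's prediction and the ensemble target.
    log_p = F.log_softmax(server(x_unlabeled) / T, dim=-1)
    loss = F.kl_div(log_p, target, reduction="batchmean") * T * T
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Distilling on outputs rather than averaging parameters is what makes the fusion robust when client models differ in architecture or were trained on heterogeneous data.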
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.