UMA: A Family of Universal Models for Atoms
- URL: http://arxiv.org/abs/2506.23971v1
- Date: Mon, 30 Jun 2025 15:38:13 GMT
- Title: UMA: A Family of Universal Models for Atoms
- Authors: Brandon M. Wood, Misko Dzamba, Xiang Fu, Meng Gao, Muhammed Shuaibi, Luis Barroso-Luque, Kareem Abdelmaqsoud, Vahe Gharakhanyan, John R. Kitchin, Daniel S. Levine, Kyle Michel, Anuroop Sriram, Taco Cohen, Abhishek Das, Ammar Rizvi, Sushree Jagriti Sahoo, Zachary W. Ulissi, C. Lawrence Zitnick
- Abstract summary: We present a family of Universal Models for Atoms (UMA), designed to push the frontier of speed, accuracy, and generalization. UMA models are trained on half a billion unique 3D atomic structures by compiling data across multiple chemical domains. We evaluate UMA models on a diverse set of applications across multiple domains and find that, remarkably, a single model without any fine-tuning can perform similarly to or better than specialized models.
- Score: 16.3404265902621
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to quickly and accurately compute properties from atomic simulations is critical for advancing a large number of applications in chemistry and materials science, including drug discovery, energy storage, and semiconductor manufacturing. To address this need, Meta FAIR presents a family of Universal Models for Atoms (UMA), designed to push the frontier of speed, accuracy, and generalization. UMA models are trained on half a billion unique 3D atomic structures (the largest training runs to date) by compiling data across multiple chemical domains, e.g., molecules, materials, and catalysts. We develop empirical scaling laws to help understand how to increase model capacity alongside dataset size to achieve the best accuracy. The UMA small and medium models utilize a novel architectural design we refer to as a mixture of linear experts, which enables increasing model capacity without sacrificing speed. For example, UMA-medium has 1.4B parameters but only ~50M active parameters per atomic structure. We evaluate UMA models on a diverse set of applications across multiple domains and find that, remarkably, a single model without any fine-tuning can perform similarly to or better than specialized models. We are releasing the UMA code, weights, and associated data to accelerate computational workflows and enable the community to continue to build increasingly capable AI models.
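The mixture of linear experts is only named in the abstract, so the following is a minimal sketch of the general idea under our own assumptions: a router turns a per-structure context embedding into mixing weights over experts that are plain linear maps, so the weighted experts collapse into a single effective linear layer before it touches the per-atom features. Names such as `MoLELinear`, `n_experts`, and `d_ctx` are illustrative, not the UMA implementation.

```python
import torch
import torch.nn as nn

class MoLELinear(nn.Module):
    """Minimal mixture-of-linear-experts layer (illustrative sketch, not UMA code).

    Because every expert is a linear map, the router's convex combination of
    expert matrices can be merged into ONE effective matrix per structure:
    total parameters grow with n_experts, but per-structure compute stays that
    of a single linear layer, which is how capacity can grow without slowing
    inference.
    """

    def __init__(self, d_in: int, d_out: int, n_experts: int, d_ctx: int):
        super().__init__()
        # Expert weights (n_experts, d_in, d_out) and biases (n_experts, d_out).
        self.w = nn.Parameter(torch.randn(n_experts, d_in, d_out) / d_in**0.5)
        self.b = nn.Parameter(torch.zeros(n_experts, d_out))
        # Router: maps a global context vector (e.g. a system/task embedding)
        # to mixing coefficients over the experts.
        self.router = nn.Linear(d_ctx, n_experts)

    def forward(self, x: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        # x:   (n_atoms, d_in) per-atom features of one structure
        # ctx: (d_ctx,) one context embedding for the whole structure
        gate = torch.softmax(self.router(ctx), dim=-1)     # (n_experts,)
        w_eff = torch.einsum("e,eio->io", gate, self.w)    # (d_in, d_out)
        b_eff = gate @ self.b                              # (d_out,)
        return x @ w_eff + b_eff                           # one linear map per structure


layer = MoLELinear(d_in=128, d_out=128, n_experts=32, d_ctx=64)
x = torch.randn(50, 128)   # 50 atoms
ctx = torch.randn(64)      # hypothetical per-structure context embedding
out = layer(x, ctx)        # (50, 128)
```

Collapsing the mixture before the per-atom matmul is what keeps the active parameter count per structure far below the total parameter count, consistent with the ~50M active out of 1.4B parameters quoted for UMA-medium.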
Related papers
- MOFSimBench: Evaluating Universal Machine Learning Interatomic Potentials In Metal-Organic Framework Molecular Modeling [0.19506923346234722]
Universal machine learning interatomic potentials (uMLIPs) have emerged as powerful tools for accelerating atomistic simulations. We introduce MOFSimBench, a benchmark to evaluate uMLIPs on key materials modeling tasks for nanoporous materials. We find that top-performing uMLIPs consistently outperform classical force fields and fine-tuned machine learning potentials across all tasks.
arXiv Detail & Related papers (2025-07-16T00:00:55Z)
- MatterTune: An Integrated, User-Friendly Platform for Fine-Tuning Atomistic Foundation Models to Accelerate Materials Simulation and Discovery [7.1240120153291535]
We introduce MatterTune, a framework that provides advanced fine-tuning capabilities and seamless integration of atomistic foundation models into downstream materials informatics and simulation workflows. MatterTune supports a number of state-of-the-art foundation models such as ORB, MatterSim, JMP, and EquiformerV2.
arXiv Detail & Related papers (2025-04-14T19:12:43Z)
- Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models? [68.72260770171212]
We propose a paradigm of Self-structured Chain of Thought (SCoT), which is composed of minimal semantic atomic steps. Our method can not only generate cognitive CoT structures for various complex tasks but also mitigate the phenomenon of overthinking. We conduct extensive experiments to show that the proposed AtomThink significantly improves the performance of baseline MLLMs.
arXiv Detail & Related papers (2025-03-08T15:23:47Z)
- xLAM: A Family of Large Action Models to Empower AI Agent Systems [111.5719694445345]
We release xLAM, a series of large action models designed for AI agent tasks.
xLAM consistently delivers exceptional performance across multiple agent ability benchmarks.
arXiv Detail & Related papers (2024-09-05T03:22:22Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, requiring no data availability or any additional training while still showing impressive performance (an illustrative sign-election merging sketch appears after this list).
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Quantum-informed simulations for mechanics of materials: DFTB+MBD framework [40.83978401377059]
We study how quantum effects can modify the mechanical properties of systems relevant to materials engineering.
We provide an open-source repository containing all codes, datasets, and examples presented in this work.
arXiv Detail & Related papers (2024-04-05T16:59:01Z)
- Specializing Smaller Language Models towards Multi-Step Reasoning [56.78474185485288]
We show that abilities can be distilled down from GPT-3.5 ($\ge$ 175B) to T5 variants ($\le$ 11B).
We propose model specialization, to specialize the model's ability towards a target task.
arXiv Detail & Related papers (2023-01-30T08:51:19Z)
- What Language Model to Train if You Have One Million GPU Hours? [54.32062236748831]
We study different modeling practices and their impact on zero-shot generalization.
We also study the performance of a multilingual model and how it compares to the English-only one.
All our models and code are open-sourced at https://huggingface.co/bigscience.
arXiv Detail & Related papers (2022-10-27T13:43:27Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- DPA-1: Pretraining of Attention-based Deep Potential Model for Molecular Simulation [13.631315487331195]
We propose DPA-1, a Deep Potential model with a novel attention mechanism.
When pretrained on large-scale datasets containing 56 elements, DPA-1 can be successfully applied to various downstream tasks.
For different elements, the learned type embedding parameters form a spiral in the latent space and have a natural correspondence with their positions on the periodic table (an illustrative type-embedding sketch appears after this list).
arXiv Detail & Related papers (2022-08-17T11:33:46Z)
- BIGDML: Towards Exact Machine Learning Force Fields for Materials [55.944221055171276]
Machine-learning force fields (MLFF) should be accurate, computationally and data efficient, and applicable to molecules, materials, and interfaces thereof.
Here, we introduce the Bravais-Inspired Gradient-Domain Machine Learning approach and demonstrate its ability to construct reliable force fields using a training set with just 10-200 atoms (a minimal gradient-domain regression sketch appears after this list).
arXiv Detail & Related papers (2021-06-08T10:14:57Z)
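For the EMR-Merging entry above, the abstract only names the elect, mask, and rescale steps; the sketch below illustrates generic sign-election task-vector merging under our own simplifying assumptions (elect a unified per-parameter direction, mask entries whose sign disagrees for each task, and rescale per task). The function name `elect_mask_rescale` and the exact magnitude and rescaling choices are illustrative and may differ from the paper's procedure.

```python
import numpy as np

def elect_mask_rescale(task_vectors):
    """Illustrative sign-election merging of task vectors; NOT the exact
    EMR-Merging algorithm, only a sketch of the elect/mask/rescale idea.

    task_vectors: list of 1-D arrays, each = finetuned_params - pretrained_params.
    Returns a unified vector plus, per task, a binary mask and a scalar rescaler
    that approximately recover that task's update at inference time.
    """
    tv = np.stack(task_vectors)                       # (n_tasks, n_params)
    # Elect: per-parameter sign from the summed updates; unified magnitude is
    # the largest magnitude among updates agreeing with that sign.
    elected_sign = np.sign(tv.sum(axis=0))
    agree = np.sign(tv) == elected_sign               # (n_tasks, n_params)
    unified = elected_sign * np.max(np.abs(tv) * agree, axis=0)
    masks, scales = [], []
    for t in range(tv.shape[0]):
        mask = agree[t].astype(tv.dtype)              # Mask: keep sign-consistent entries
        masked = unified * mask
        # Rescale: match the task vector's mean magnitude (tuning-free, no data needed).
        scales.append(np.abs(tv[t]).mean() / (np.abs(masked).mean() + 1e-12))
        masks.append(mask)
    return unified, masks, scales

# Usage: approximate task t's weights as pretrained + scales[t] * masks[t] * unified.
```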
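For the DPA-1 entry, the summary only names attention and learned type embeddings; below is a minimal sketch of that general pattern, i.e. element-type embeddings combined with self-attention over an atom's local neighborhood. Layer sizes, the distance-only neighbor features, and the class name `TypeAwareNeighborAttention` are our own illustrative choices, not DPA-1's actual descriptor.

```python
import torch
import torch.nn as nn

class TypeAwareNeighborAttention(nn.Module):
    """Illustrative sketch: element-type embeddings + attention over neighbors.

    Only mirrors the ingredients named in the abstract; plotting the learned
    rows of `type_embed` is the kind of analysis that can reveal periodic-table
    structure such as the reported spiral.
    """

    def __init__(self, n_elements: int = 118, d_type: int = 8, d_model: int = 64):
        super().__init__()
        self.type_embed = nn.Embedding(n_elements + 1, d_type)  # one vector per element
        self.encode = nn.Linear(d_type + 1, d_model)             # type embedding + distance
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.readout = nn.Linear(d_model, d_model)

    def forward(self, neighbor_z: torch.Tensor, neighbor_dist: torch.Tensor) -> torch.Tensor:
        # neighbor_z:    (n_atoms, n_neighbors) atomic numbers of each atom's neighbors
        # neighbor_dist: (n_atoms, n_neighbors) corresponding distances
        feats = torch.cat([self.type_embed(neighbor_z),
                           neighbor_dist.unsqueeze(-1)], dim=-1)
        h = self.encode(feats)                  # (n_atoms, n_neighbors, d_model)
        h, _ = self.attn(h, h, h)               # attention within each local environment
        return self.readout(h.mean(dim=1))      # per-atom descriptor


model = TypeAwareNeighborAttention()
z = torch.randint(1, 57, (10, 16))   # 10 atoms, 16 neighbors each
d = torch.rand(10, 16) * 5.0         # neighbor distances
print(model(z, d).shape)             # torch.Size([10, 64])
```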
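For the BIGDML entry, only the gradient-domain idea is sketched here: kernel regression fit directly to forces through the derivative (Hessian) kernel of an RBF energy kernel, shown on a toy one-dimensional Lennard-Jones dimer. The Bravais-symmetry adaptation that defines BIGDML, and the hyperparameters below, are not taken from the paper.

```python
import numpy as np

def rbf_force_kernel(x1, x2, sigma):
    """Force-force covariance block: the Hessian d^2 k / (dx1 dx2^T) of the RBF
    energy kernel k(x1, x2) = exp(-|x1 - x2|^2 / (2 sigma^2))."""
    d = x1 - x2
    k = np.exp(-np.dot(d, d) / (2.0 * sigma**2))
    return (np.eye(len(d)) / sigma**2 - np.outer(d, d) / sigma**4) * k

def fit_forces(X, F, sigma, lam=1e-8):
    """Solve (K + lam*I) alpha = F_flat so the model reproduces training forces."""
    n, dim = X.shape
    K = np.zeros((n * dim, n * dim))
    for i in range(n):
        for j in range(n):
            K[i*dim:(i+1)*dim, j*dim:(j+1)*dim] = rbf_force_kernel(X[i], X[j], sigma)
    return np.linalg.solve(K + lam * np.eye(n * dim), F.reshape(-1))

def predict_force(x, X, alpha, sigma):
    blocks = [rbf_force_kernel(x, X[j], sigma) for j in range(X.shape[0])]
    return np.hstack(blocks) @ alpha

# Toy data: 1-D Lennard-Jones dimer, F(r) = -dV/dr with epsilon = sigma_LJ = 1.
r = np.linspace(0.95, 2.0, 30).reshape(-1, 1)
force = 24.0 * (2.0 / r**13 - 1.0 / r**7)
alpha = fit_forces(r, force, sigma=0.3)
print(predict_force(np.array([1.12]), r, alpha, sigma=0.3))  # small force near the LJ minimum
```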