Comparing the latent features of universal machine-learning interatomic potentials
- URL: http://arxiv.org/abs/2512.05717v1
- Date: Fri, 05 Dec 2025 13:45:01 GMT
- Title: Comparing the latent features of universal machine-learning interatomic potentials
- Authors: Sofiia Chorna, Davide Tisi, Cesare Malosso, Wei Bin How, Michele Ceriotti, Sanggyu Chong,
- Abstract summary: We show that universal machine-learning interatomic potentials (uMLIPs) encode chemical space in significantly distinct ways. We discuss how atom-level features, which are directly output by MLIPs, can be compressed into global structure-level features.
- Score: 1.2314765641075438
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The past few years have seen the development of "universal" machine-learning interatomic potentials (uMLIPs) capable of approximating the ground-state potential energy surface across a wide range of chemical structures and compositions with reasonable accuracy. While these models differ in architecture and training data, they share the ability to compress a staggering amount of chemical information into descriptive latent features. Herein, we systematically analyze what the different uMLIPs have learned by quantitatively assessing the relative information content of their latent features, using feature reconstruction errors as metrics, and observing how the trends are affected by the choice of training set and training protocol. We find that the uMLIPs encode chemical space in significantly distinct ways, with substantial cross-model feature reconstruction errors. When variants of the same model architecture are considered, trends become dependent on the dataset, target, and training protocol of choice. We also observe that fine-tuning of a uMLIP retains a strong pre-training bias in the latent features. Finally, we discuss how atom-level features, which are directly output by MLIPs, can be compressed into global structure-level features via concatenation of progressive cumulants, each adding significantly new information about the variability across the atomic environments within a given system.
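The two techniques named in the abstract, cumulant-based pooling of atom-level features into structure-level features, and a cross-model feature reconstruction error, can be sketched together. This is a minimal, hypothetical illustration, not the paper's code: the toy features, projections, and the ridge-regularized linear map are illustrative assumptions.

```python
# Hypothetical sketch: pool atom-level latent features into structure-level
# features by concatenating progressive cumulants, then score how well the
# features of one "model" linearly reconstruct those of another.
import numpy as np

def cumulant_pool(atom_feats):
    """Concatenate per-dimension mean, variance, and third central moment
    of the atomic-environment features within one structure."""
    mu = atom_feats.mean(axis=0)
    c2 = ((atom_feats - mu) ** 2).mean(axis=0)
    c3 = ((atom_feats - mu) ** 3).mean(axis=0)
    return np.concatenate([mu, c2, c3])

def reconstruction_error(X, Y, reg=1e-8):
    """Relative error of the best (ridge-regularized) linear map X -> Y,
    a stand-in for a cross-model feature reconstruction metric."""
    W = np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ Y)
    return np.linalg.norm(X @ W - Y) / np.linalg.norm(Y)

rng = np.random.default_rng(0)
# Toy "latent features" from two models: fixed random projections of the
# same atomic environments, mimicking architecture-dependent encodings.
proj_a = rng.normal(size=(8, 16))
proj_b = rng.normal(size=(8, 16))
feats_a, feats_b = [], []
for _ in range(200):                      # 200 structures, variable atom counts
    n_atoms = int(rng.integers(5, 20))
    env = rng.normal(size=(n_atoms, 8))   # shared atomic environments
    feats_a.append(cumulant_pool(np.tanh(env @ proj_a)))
    feats_b.append(cumulant_pool(np.tanh(env @ proj_b)))
X, Y = np.array(feats_a), np.array(feats_b)
err = reconstruction_error(X, Y)
print(f"cross-model reconstruction error: {err:.3f}")
```

A small error would indicate that model B's structure-level features are nearly a linear function of model A's; the paper's finding of substantial cross-model errors corresponds to this quantity staying large.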
Related papers
- Equivariant Evidential Deep Learning for Interatomic Potentials [55.6997213490859]
Uncertainty quantification is critical for assessing the reliability of machine learning interatomic potentials in molecular dynamics simulations. Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance. We propose Equivariant Evidential Deep Learning for Interatomic Potentials (e2IP), a backbone-agnostic framework that models atomic forces and their uncertainty jointly.
arXiv Detail & Related papers (2026-02-11T02:00:25Z) - Foundation Models for Discovery and Exploration in Chemical Space [57.97784111110166]
MIST is a family of molecular foundation models trained on large unlabeled datasets. We demonstrate the ability of these models to solve real-world problems across chemical space.
arXiv Detail & Related papers (2025-10-20T17:56:01Z) - Beyond Benchmarks: Understanding Mixture-of-Experts Models through Internal Mechanisms [55.1784306456972]
Mixture-of-Experts (MoE) architectures have emerged as a promising direction, offering efficiency and scalability by activating only a subset of parameters during inference. We use an internal metric to investigate the mechanisms of MoE architecture by explicitly incorporating routing mechanisms and analyzing expert-level behaviors. We uncover several findings: (1) neuron utilization decreases as models evolve, reflecting stronger generalization; (2) training exhibits a dynamic trajectory, where benchmark performance alone provides limited signal; (3) task completion emerges from collaborative contributions of multiple experts, with shared experts driving concentration; and (4) activation patterns at the neuron level provide a fine-grained proxy for data diversity.
arXiv Detail & Related papers (2025-09-28T15:13:38Z) - MOFSimBench: Evaluating Universal Machine Learning Interatomic Potentials In Metal--Organic Framework Molecular Modeling [0.19506923346234722]
Universal machine learning interatomic potentials (uMLIPs) have emerged as powerful tools for accelerating atomistic simulations. We introduce MOFSimBench, a benchmark to evaluate uMLIPs on key materials modeling tasks for nanoporous materials. We find that top-performing uMLIPs consistently outperform classical force fields and fine-tuned machine learning potentials across all tasks.
arXiv Detail & Related papers (2025-07-16T00:00:55Z) - Equivariant Machine Learning Interatomic Potentials with Global Charge Redistribution [1.6112718683989882]
Machine learning interatomic potentials (MLIPs) provide a computationally efficient alternative to quantum mechanical simulations for predicting material properties. We develop a new equivariant MLIP incorporating long-range Coulomb interactions through explicit treatment of electronic degrees of freedom. We systematically evaluate our model across a range of benchmark periodic and non-periodic datasets, demonstrating that it outperforms both short-range equivariant and long-range invariant MLIPs in energy and force predictions.
arXiv Detail & Related papers (2025-03-23T05:26:55Z) - Pretraining Graph Transformers with Atom-in-a-Molecule Quantum Properties for Improved ADMET Modeling [38.53065398127086]
We evaluate the impact of pretraining Graph Transformer architectures on atom-level quantum-mechanical features.
We find that models pretrained on atomic quantum-mechanical properties capture more low-frequency Laplacian eigenmodes.
arXiv Detail & Related papers (2024-10-10T15:20:30Z) - Benchmark on Drug Target Interaction Modeling from a Structure Perspective [48.60648369785105]
Drug-target interaction prediction is crucial to drug discovery and design.
Recent methods, such as those based on graph neural networks (GNNs) and Transformers, demonstrate exceptional performance across various datasets.
We conduct a comprehensive survey and benchmark for drug-target interaction modeling from a structure perspective, by integrating tens of explicit (i.e., GNN-based) and implicit (i.e., Transformer-based) structure learning algorithms.
arXiv Detail & Related papers (2024-07-04T16:56:59Z) - Interpolation and differentiation of alchemical degrees of freedom in machine learning interatomic potentials [0.980222898148295]
We report the use of continuous and differentiable alchemical degrees of freedom in atomistic materials simulations. The proposed method introduces alchemical atoms with corresponding weights into the input graph, alongside modifications to the message-passing and readout mechanisms of MLIPs. The end-to-end differentiability of MLIPs enables efficient calculation of the gradient of energy with respect to the compositional weights.
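The idea of differentiating an energy with respect to compositional weights can be illustrated with a toy, hypothetical pair potential; the species-mixing scheme and the analytic gradient below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: an "alchemical" atom whose identity is a weighted
# mixture of two species, with the energy differentiable with respect to
# the per-atom compositional weights.
import numpy as np

eps = np.array([[1.0, 0.6], [0.6, 0.3]])  # symmetric toy species-pair strengths

def pair_energy(positions, weights):
    """E = sum_{i<j} (w_i^T eps w_j) * exp(-r_ij); weights[i] mixes species."""
    E = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(positions[i] - positions[j])
            E += (weights[i] @ eps @ weights[j]) * np.exp(-r)
    return E

def grad_weights(positions, weights):
    """Analytic dE/dw_i = sum_{j != i} (eps @ w_j) * exp(-r_ij)."""
    g = np.zeros_like(weights)
    n = len(positions)
    for i in range(n):
        for j in range(n):
            if i != j:
                r = np.linalg.norm(positions[i] - positions[j])
                g[i] += eps @ weights[j] * np.exp(-r)
    return g

rng = np.random.default_rng(1)
pos = rng.normal(size=(4, 3))
w = rng.random((4, 2))
w /= w.sum(axis=1, keepdims=True)         # normalized species mixtures
g = grad_weights(pos, w)
# Finite-difference check of one gradient component.
h = 1e-6
w_pert = w.copy()
w_pert[0, 0] += h
fd = (pair_energy(pos, w_pert) - pair_energy(pos, w)) / h
print(f"analytic: {g[0, 0]:.6f}, finite difference: {fd:.6f}")
```

In an actual MLIP the analytic gradient would come from automatic differentiation through the message-passing network rather than a hand-derived formula, but the finite-difference check plays the same validating role.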
arXiv Detail & Related papers (2024-04-16T17:24:22Z) - Role of Structural and Conformational Diversity for Machine Learning Potentials [4.608732256350959]
We investigate the relationship between data biases and model generalization in Quantum Mechanics.
Our results reveal nuanced patterns in generalization metrics.
These findings provide valuable insights and guidelines for QM data generation efforts.
arXiv Detail & Related papers (2023-10-30T19:33:12Z) - Learning Multiscale Consistency for Self-supervised Electron Microscopy Instance Segmentation [48.267001230607306]
We propose a pretraining framework that enhances multiscale consistency in EM volumes.
Our approach leverages a Siamese network architecture, integrating strong and weak data augmentations.
It effectively captures voxel and feature consistency, showing promise for learning transferable representations for EM analysis.
arXiv Detail & Related papers (2023-08-19T05:49:13Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - Graph Neural Network for Hamiltonian-Based Material Property Prediction [56.94118357003096]
We present and compare several different graph convolution networks that are able to predict the band gap for inorganic materials.
The models are developed to incorporate two different features: the information of each orbital itself and the interactions between orbitals.
The results show that our model achieves promising prediction accuracy under cross-validation.
arXiv Detail & Related papers (2020-05-27T13:32:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.