Benchmarking GNNs for OOD Materials Property Prediction with Uncertainty Quantification
- URL: http://arxiv.org/abs/2511.11697v1
- Date: Wed, 12 Nov 2025 11:15:24 GMT
- Title: Benchmarking GNNs for OOD Materials Property Prediction with Uncertainty Quantification
- Authors: Liqin Tan, Pin Chen, Menghan Liu, Xiean Wang, Jianhuan Cen, Qingsong Zou,
- Abstract summary: MatUQ is a benchmark framework for evaluating graph neural networks (GNNs) on out-of-distribution (OOD) materials property prediction.<n>MatUQ comprises 1,375 OOD prediction tasks constructed from six materials datasets using five OFM-based and a newly proposed structure-aware splitting strategy.
- Score: 4.2576347479313394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present MatUQ, a benchmark framework for evaluating graph neural networks (GNNs) on out-of-distribution (OOD) materials property prediction with uncertainty quantification (UQ). MatUQ comprises 1,375 OOD prediction tasks constructed from six materials datasets using five OFM-based and a newly proposed structure-aware splitting strategy, SOAP-LOCO, which captures local atomic environments more effectively. We evaluate 12 representative GNN models under a unified uncertainty-aware training protocol that combines Monte Carlo Dropout and Deep Evidential Regression (DER), and introduce a novel uncertainty metric, D-EviU, which shows the strongest correlation with prediction errors in most tasks. Our experiments yield two key findings. First, the uncertainty-aware training approach significantly improves model prediction accuracy, reducing errors by an average of 70.6\% across challenging OOD scenarios. Second, the benchmark reveals that no single model dominates universally: earlier models such as SchNet and ALIGNN remain competitive, while newer models like CrystalFramer and SODNet demonstrate superior performance on specific material properties. These results provide practical insights for selecting reliable models under distribution shifts in materials discovery.
Related papers
- MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts [25.643876327918544]
Leveraging the models' outputs, specifically the logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution samples.
Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence issues, leading to prediction bias.
We propose MaNo which applies a data-dependent normalization on the logits to reduce prediction bias and takes the $L_p$ norm of the matrix of normalized logits as the estimation score.
arXiv Detail & Related papers (2024-05-29T10:45:06Z) - Structure-based out-of-distribution (OOD) materials property prediction:
a benchmark study [1.3711992220025948]
We present a benchmark study of structure-based graph neural networks (GNNs) for extrapolative OOD materials property prediction.
Our experiments show that current state-of-the-art GNN algorithms significantly underperform for the OOD property prediction tasks.
We identify the sources of CGCNN, ALIGNN, and DeeperGATGNN's significantly more robust OOD performance than those of the current best models.
arXiv Detail & Related papers (2024-01-16T01:03:39Z) - Uncertainty Quantification in Multivariable Regression for Material Property Prediction with Bayesian Neural Networks [37.69303106863453]
We introduce an approach for uncertainty quantification (UQ) within physics-informed BNNs.
We present case studies for predicting the creep rupture life of steel alloys.
The most promising framework for creep life prediction is BNNs based on Markov Chain Monte Carlo approximation of the posterior distribution of network parameters.
arXiv Detail & Related papers (2023-11-04T19:40:16Z) - Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision language models.
We show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms of ID data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z) - Uncertainty Quantification for Molecular Property Predictions with Graph Neural Architecture Search [2.711812013460678]
We introduce AutoGNNUQ, an automated uncertainty quantification (UQ) approach for molecular property prediction.
Our approach employs variance decomposition to separate data (aleatoric) and model (epistemic) uncertainties, providing valuable insights for reducing them.
AutoGNNUQ has broad applicability in domains such as drug discovery and materials science, where accurate uncertainty quantification is crucial for decision-making.
arXiv Detail & Related papers (2023-07-19T20:03:42Z) - Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis,
and LLMs Evaluations [111.88727295707454]
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of NLP.
We propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts.
We conduct experiments on pre-trained language models for analysis and evaluation of OOD robustness.
arXiv Detail & Related papers (2023-06-07T17:47:03Z) - Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to $17.0%$ AUROC improvement over state-of-the-arts and it could serve as simple yet strong baselines in such an under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z) - Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution
Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.