Related papers: Determining Domain of Machine Learning Models using Kernel Density Estimates: Applications in Materials Property Prediction

URL: http://arxiv.org/abs/2406.05143v1
Date: Tue, 28 May 2024 15:41:16 GMT
Title: Determining Domain of Machine Learning Models using Kernel Density Estimates: Applications in Materials Property Prediction
Authors: Lane E. Schultz, Yiqi Wang, Ryan Jacobs, Dane Morgan
Abstract summary: We develop a new approach of assessing model domain using kernel density estimation. We show that chemical groups considered unrelated based on established chemical knowledge exhibit significant dissimilarities by our measure. High measures of dissimilarity are associated with poor model performance and poor estimates of model uncertainty.
Score: 1.8551396341435895
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Knowledge of the domain of applicability of a machine learning model is essential to ensuring accurate and reliable model predictions. In this work, we develop a new approach of assessing model domain and demonstrate that our approach provides accurate and meaningful designation of in-domain versus out-of-domain when applied across multiple model types and material property data sets. Our approach assesses the distance between a test and training data point in feature space by using kernel density estimation and shows that this distance provides an effective tool for domain determination. We show that chemical groups considered unrelated based on established chemical knowledge exhibit significant dissimilarities by our measure. We also show that high measures of dissimilarity are associated with poor model performance (i.e., high residual magnitudes) and poor estimates of model uncertainty (i.e., unreliable uncertainty estimation). Automated tools are provided to enable researchers to establish acceptable dissimilarity thresholds to identify whether new predictions of their own machine learning models are in-domain versus out-of-domain.
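
The abstract describes scoring a test point by its kernel-density-based distance from the training data and comparing that score to a threshold. A minimal sketch of that idea follows. It is not the authors' implementation: the function names, the isotropic Gaussian kernel, the fixed bandwidth, the normalization by the maximum training density, and the 0.9 threshold are all assumptions made for illustration.

```python
import math

def gaussian_kde_density(point, train, bandwidth):
    """Kernel density estimate at `point`: the average of isotropic
    Gaussian kernels centered on each training point."""
    dim = len(point)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (-dim / 2.0)
    total = 0.0
    for t in train:
        sq_dist = sum((p - q) ** 2 for p, q in zip(point, t))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return norm * total / len(train)

def dissimilarity(point, train, bandwidth=0.5):
    """Hypothetical dissimilarity in [0, 1]: 0 means as dense as the densest
    training point, 1 means essentially no training data nearby.
    Assumed normalization: density relative to the maximum training density."""
    max_train = max(gaussian_kde_density(t, train, bandwidth) for t in train)
    score = 1.0 - gaussian_kde_density(point, train, bandwidth) / max_train
    return min(1.0, max(0.0, score))  # clip to [0, 1]

def is_in_domain(point, train, threshold=0.9, bandwidth=0.5):
    """Label a point in-domain when its dissimilarity is below an
    assumed, user-chosen threshold."""
    return dissimilarity(point, train, bandwidth) < threshold

# Toy feature vectors (made up for illustration only).
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
near = (0.05, 0.05)   # sits inside the training cluster
far = (5.0, 5.0)      # far from every training point
print(is_in_domain(near, train), is_in_domain(far, train))
```

In practice the bandwidth and the threshold would need to be tuned per model and data set; the abstract states that automated tools for establishing acceptable thresholds are provided with the paper.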

Related papers

QuEst: Enhancing Estimates of Quantile-Based Distributional Measures Using Model Predictions [12.851704083461616]
We present QuEst, a principled framework to merge observed and imputed data to deliver point estimates. QuEst covers a range of measures, from tail risk (CVaR) to population segments such as quartiles, that are central to fields such as economics, sociology, education, medicine, and more. We extend QuEst to multidimensional metrics, and introduce an additional optimization technique to further reduce variance in this and other hybrid estimators.
arXiv Detail & Related papers (2025-07-07T17:33:18Z)
Comparative Evaluation of Applicability Domain Definition Methods for Regression Models [0.0]
Applicability domain refers to the range of data for which the prediction of a predictive model is expected to be reliable and accurate. We propose a novel approach based on non-deterministic Bayesian neural networks to define the applicability domain of the model.
arXiv Detail & Related papers (2024-11-01T14:12:57Z)
Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance. Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification [71.08024880298613]
We study the multi-source Domain Generalization of text classification. We propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.
arXiv Detail & Related papers (2024-09-20T07:46:21Z)
Quantifying Distribution Shifts and Uncertainties for Enhanced Model Robustness in Machine Learning Applications [0.0]
This study explores model adaptation and generalization by utilizing synthetic data. We employ quantitative measures such as Kullback-Leibler divergence, Jensen-Shannon distance, and Mahalanobis distance to assess data similarity. Our findings suggest that utilizing statistical measures, such as the Mahalanobis distance, to determine whether model predictions fall within the low-error "interpolation regime" or the high-error "extrapolation regime" provides a complementary method for assessing distribution shift and model uncertainty.
arXiv Detail & Related papers (2024-05-03T10:05:31Z)
QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement. QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights. We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
Out of Distribution Detection via Domain-Informed Gaussian Process State Space Models [22.24457254575906]
In order for robots to safely navigate in unseen scenarios, it is important to accurately detect out-of-training-distribution (OoD) situations online. We propose (i) a novel approach to embed existing domain knowledge in the kernel and (ii) an OoD online runtime monitor, based on receding-horizon predictions.
arXiv Detail & Related papers (2023-09-13T01:02:42Z)
Distributionally Robust Learning for Multi-source Unsupervised Domain Adaptation [9.359714425373616]
Empirical risk minimization often performs poorly when the distribution of the target domain differs from those of source domains. We develop an unsupervised domain adaptation approach that leverages labeled data from multiple source domains and unlabeled data from the target domain.
arXiv Detail & Related papers (2023-09-05T13:19:40Z)
Evaluating Explainability in Machine Learning Predictions through Explainer-Agnostic Metrics [0.0]
We develop six distinct model-agnostic metrics designed to quantify the extent to which model predictions can be explained. These metrics measure different aspects of model explainability, including local importance, global importance, and surrogate predictions. We demonstrate the practical utility of these metrics on classification and regression tasks, and integrate these metrics into an existing Python package for public use.
arXiv Detail & Related papers (2023-02-23T15:28:36Z)
Outlier-Based Domain of Applicability Identification for Materials Property Prediction Models [0.38073142980733]
We propose a method to find domains of applicability using a large feature space and also introduce analysis techniques to gain more insight into the detected domains.
arXiv Detail & Related papers (2023-01-17T07:51:12Z)
MAUVE Scores for Generative Models: Theory and Practice [95.86006777961182]
We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. We find that MAUVE can quantify the gaps between the distributions of human-written text and those of modern neural language models. We demonstrate in the vision domain that MAUVE can identify known properties of generated images on par with or better than existing metrics.
arXiv Detail & Related papers (2022-12-30T07:37:40Z)
Exploring validation metrics for offline model-based optimisation with diffusion models [50.404829846182764]
In model-based optimisation (MBO) we are interested in using machine learning to design candidates that maximise some measure of reward with respect to a black box function called the (ground truth) oracle. While an approximation to the ground oracle can be trained and used in place of it during model validation to measure the mean reward over generated candidates, the evaluation is approximate and vulnerable to adversarial examples. This is encapsulated under our proposed evaluation framework which is also designed to measure extrapolation.
arXiv Detail & Related papers (2022-11-19T16:57:37Z)
Firenze: Model Evaluation Using Weak Signals [5.723905680436377]
We introduce Firenze, a novel framework for comparative evaluation of machine learning models' performance. We show that markers computed and combined over select subsets of samples called regions of interest can provide a robust estimate of their real-world performances.
arXiv Detail & Related papers (2022-07-02T13:20:38Z)
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE). In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions. In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data. We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
Spatial machine-learning model diagnostics: a model-agnostic distance-based approach [91.62936410696409]
This contribution proposes spatial prediction error profiles (SPEPs) and spatial variable importance profiles (SVIPs) as novel model-agnostic assessment and interpretation tools. The SPEPs and SVIPs of geostatistical methods, linear models, random forest, and hybrid algorithms show striking differences and also relevant similarities. The novel diagnostic tools enrich the toolkit of spatial data science, and may improve ML model interpretation, selection, and design.
arXiv Detail & Related papers (2021-11-13T01:50:36Z)
ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data. The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.