Still More Shades of Null: An Evaluation Suite for Responsible Missing Value Imputation
- URL: http://arxiv.org/abs/2409.07510v4
- Date: Wed, 05 Feb 2025 00:42:46 GMT
- Title: Still More Shades of Null: An Evaluation Suite for Responsible Missing Value Imputation
- Authors: Falaah Arif Khan, Denys Herasymuk, Nazar Protsiv, Julia Stoyanovich,
- Abstract summary: We present Shades-of-NULL, an evaluation suite for responsible missing value imputation.
We use Shades-of-NULL to conduct a large-scale empirical study involving 29,736 experimental pipelines.
- Score: 7.620967781722717
- License:
- Abstract: Data missingness is a practical challenge of sustained interest to the scientific community. In this paper, we present Shades-of-NULL, an evaluation suite for responsible missing value imputation. Our work is novel in two ways (i) we model realistic and socially-salient missingness scenarios that go beyond Rubin's classic Missing Completely at Random (MCAR), Missing At Random (MAR) and Missing Not At Random (MNAR) settings, to include multi-mechanism missingness (when different missingness patterns co-exist in the data) and missingness shift (when the missingness mechanism changes between training and test) (ii) we evaluate imputers holistically, based on imputation quality and imputation fairness, as well as on the predictive performance, fairness and stability of the models that are trained and tested on the data post-imputation. We use Shades-of-NULL to conduct a large-scale empirical study involving 29,736 experimental pipelines, and find that while there is no single best-performing imputation approach for all missingness types, interesting trade-offs arise between predictive performance, fairness and stability, based on the combination of missingness scenario, imputer choice, and the architecture of the predictive model. We make Shades-of-NULL publicly available, to enable researchers to rigorously evaluate missing value imputation methods on a wide range of metrics in plausible and socially meaningful scenarios.
Related papers
- A Probabilistic Perspective on Unlearning and Alignment for Large Language Models [48.96686419141881]
We introduce the first formal probabilistic evaluation framework for Large Language Models (LLMs)
Namely, we propose novel metrics with high probability guarantees concerning the output distribution of a model.
Our metrics are application-independent and allow practitioners to make more reliable estimates about model capabilities before deployment.
arXiv Detail & Related papers (2024-10-04T15:44:23Z) - Error-Driven Uncertainty Aware Training [7.702016079410588]
Error-Driven Uncertainty Aware Training aims to enhance the ability of neural classifiers to estimate their uncertainty correctly.
The EUAT approach operates during the model's training phase by selectively employing two loss functions depending on whether the training examples are correctly or incorrectly predicted.
We evaluate EUAT using diverse neural models and datasets in the image recognition domains considering both non-adversarial and adversarial settings.
arXiv Detail & Related papers (2024-05-02T11:48:14Z) - Evaluating AI systems under uncertain ground truth: a case study in
dermatology [44.80772162289557]
We propose a metric for measuring annotation uncertainty and provide uncertainty-adjusted metrics for performance evaluation.
We present a case study applying our framework to skin condition classification from images where annotations are provided in the form of differential diagnoses.
arXiv Detail & Related papers (2023-07-05T10:33:45Z) - Toward Reliable Human Pose Forecasting with Uncertainty [51.628234388046195]
We develop an open-source library for human pose forecasting, including multiple models, supporting several datasets.
We devise two types of uncertainty in the problem to increase performance and convey better trust.
arXiv Detail & Related papers (2023-04-13T17:56:08Z) - How Reliable is Your Regression Model's Uncertainty Under Real-World
Distribution Shifts? [46.05502630457458]
We propose a benchmark of 8 image-based regression datasets with different types of challenging distribution shifts.
We find that while methods are well calibrated when there is no distribution shift, they all become highly overconfident on many of the benchmark datasets.
arXiv Detail & Related papers (2023-02-07T18:54:39Z) - Reliability-Aware Prediction via Uncertainty Learning for Person Image
Retrieval [51.83967175585896]
UAL aims at providing reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z) - Sharing pattern submodels for prediction with missing values [12.981974894538668]
Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time.
We propose an alternative approach, called sharing pattern submodels, which i) makes predictions robust to missing values at test time, ii) maintains or improves the predictive power of pattern submodels andiii) has a short description, enabling improved interpretability.
arXiv Detail & Related papers (2022-06-22T15:09:40Z) - Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z) - Accurate and Robust Feature Importance Estimation under Distribution
Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z) - An Uncertainty-based Human-in-the-loop System for Industrial Tool Wear
Analysis [68.8204255655161]
We show that uncertainty measures based on Monte-Carlo dropout in the context of a human-in-the-loop system increase the system's transparency and performance.
A simulation study demonstrates that the uncertainty-based human-in-the-loop system increases performance for different levels of human involvement.
arXiv Detail & Related papers (2020-07-14T15:47:37Z) - Uncertainty-Gated Stochastic Sequential Model for EHR Mortality
Prediction [6.170898159041278]
We present a novel variational recurrent network that estimates the distribution of missing variables, updates hidden states, and predicts the possibility of in-hospital mortality.
It is noteworthy that our model can conduct these procedures in a single stream and learn all network parameters jointly in an end-to-end manner.
arXiv Detail & Related papers (2020-03-02T04:41:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.