Interpretable Generalized Additive Models for Datasets with Missing Values
- URL: http://arxiv.org/abs/2412.02646v2
- Date: Thu, 12 Dec 2024 22:15:24 GMT
- Title: Interpretable Generalized Additive Models for Datasets with Missing Values
- Authors: Hayden McTavish, Jon Donnelly, Margo Seltzer, Cynthia Rudin
- Abstract summary: M-GAM is a sparse, generalized, additive modeling approach that incorporates missingness indicators and their interaction terms.
We show that M-GAM provides similar or superior accuracy to prior methods while significantly improving sparsity relative to either imputation or naive inclusion of indicator variables.
- Abstract: Many important datasets contain samples that are missing one or more feature values. Maintaining the interpretability of machine learning models in the presence of such missing data is challenging. Singly or multiply imputing missing values complicates the model's mapping from features to labels. On the other hand, reasoning on indicator variables that represent missingness introduces a potentially large number of additional terms, sacrificing sparsity. We solve these problems with M-GAM, a sparse, generalized, additive modeling approach that incorporates missingness indicators and their interaction terms while maintaining sparsity through ℓ0 regularization. We show that M-GAM provides similar or superior accuracy to prior methods while significantly improving sparsity relative to either imputation or naive inclusion of indicator variables.
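The abstract's idea of "missingness indicators and their interaction terms" can be illustrated with a small feature-construction sketch. This is not the authors' implementation: the function name `augment_with_missingness`, the zero-fill of missing entries, and the product form `x_j * miss_k` for interactions are all assumptions made for illustration, and the sparsity-inducing ℓ0-regularized fit described in the abstract is not shown.

```python
import numpy as np

def augment_with_missingness(X):
    """Build missingness indicators and simple interaction terms (illustrative only).

    X is an (n, d) float array in which missing entries are NaN.
    Returns zero-filled features, 0/1 indicators, interaction columns, and their names.
    """
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)                    # True where a value is missing
    X_filled = np.where(miss, 0.0, X)     # assumption: missing values contribute 0
    indicators = miss.astype(float)       # one indicator column per feature

    n, d = X.shape
    cols, names = [], []
    for j in range(d):
        for k in range(d):
            if j != k:
                # Hypothetical interaction: feature j's value, active only when feature k is missing.
                cols.append(X_filled[:, j] * indicators[:, k])
                names.append(f"x{j}*miss{k}")
    interactions = np.column_stack(cols) if cols else np.empty((n, 0))
    return X_filled, indicators, interactions, names

# Example: two samples, two features, one missing value.
X_demo = np.array([[1.0, np.nan],
                   [2.0, 3.0]])
X_filled, indicators, interactions, names = augment_with_missingness(X_demo)
```

With d features this construction adds d indicator columns and d(d-1) interaction columns, which shows why the abstract stresses that naive inclusion of such terms sacrifices sparsity unless the model selects only a few of them.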
Related papers
- Joint Models for Handling Non-Ignorable Missing Data using Bayesian Additive Regression Trees: Application to Leaf Photosynthetic Traits Data [0.0]
Dealing with missing data poses significant challenges in predictive analysis.
In cases where the data are missing not at random, jointly modeling the data and missing data indicators is essential.
We propose two methods under a selection model framework for handling data with missingness.
arXiv Detail & Related papers (2024-12-19T15:26:55Z)
- Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles [95.49699178874683]
We propose DiffDiv, an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs).
We show that DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features.
We show that DPM-guided diversification is sufficient to remove dependence on shortcut cues, without a need for additional supervised signals.
arXiv Detail & Related papers (2023-11-23T15:47:33Z)
- Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks [92.32670915472099]
We propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs).
We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.
arXiv Detail & Related papers (2023-10-03T17:37:52Z)
- Towards Better Modeling with Missing Data: A Contrastive Learning-based Visual Analytics Perspective [7.577040836988683]
Missing data can pose a challenge for machine learning (ML) modeling.
Current approaches are categorized into feature imputation and label prediction.
This study proposes a Contrastive Learning framework to model observed data with missing values.
arXiv Detail & Related papers (2023-09-18T13:16:24Z)
- Curve Your Enthusiasm: Concurvity Regularization in Differentiable Generalized Additive Models [5.519653885553456]
Generalized Additive Models (GAMs) have recently experienced a resurgence in popularity due to their interpretability.
We show how concurvity can severely impair the interpretability of GAMs.
We propose a remedy: a conceptually simple, yet effective regularizer which penalizes pairwise correlations of the non-linearly transformed feature variables.
arXiv Detail & Related papers (2023-05-19T06:55:49Z)
- The Missing Indicator Method: From Low to High Dimensions [16.899237833310064]
Missing data is common in applied data science, particularly in healthcare, social sciences, and natural sciences.
For data sets with informative missing patterns, the Missing Indicator Method (MIM) can be used in conjunction with imputation to improve model performance.
We show experimentally that MIM improves performance for informative missing values, and we prove that MIM does not hurt linear models for uninformative missing values.
We introduce Selective MIM, a method that adds missing indicators only for features that have informative missing patterns.
arXiv Detail & Related papers (2022-11-16T23:10:45Z)
- Correlation Information Bottleneck: Towards Adapting Pretrained Multimodal Models for Robust Visual Question Answering [63.87200781247364]
Correlation Information Bottleneck (CIB) seeks a tradeoff between compression and redundancy in representations.
We derive a tight theoretical upper bound for the mutual information between multimodal inputs and representations.
arXiv Detail & Related papers (2022-09-14T22:04:10Z)
- Generative Modeling Helps Weak Supervision (and Vice Versa) [87.62271390571837]
We propose a model fusing weak supervision and generative adversarial networks.
It captures discrete variables in the data alongside the weak supervision derived label estimate.
It is the first approach to enable data augmentation through weakly supervised synthetic images and pseudolabels.
arXiv Detail & Related papers (2022-03-22T20:24:21Z)
- Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
arXiv Detail & Related papers (2021-10-27T12:19:56Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- NeuMiss networks: differentiable programming for supervised learning with missing values [0.0]
We derive the analytical form of the optimal predictor under a linearity assumption.
We propose a new principled architecture, named NeuMiss networks.
They have good predictive accuracy with both a number of parameters and a computational complexity independent of the number of missing data patterns.
arXiv Detail & Related papers (2020-07-03T11:42:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.