Predicting is not Understanding: Recognizing and Addressing
Underspecification in Machine Learning
- URL: http://arxiv.org/abs/2207.02598v1
- Date: Wed, 6 Jul 2022 11:20:40 GMT
- Title: Predicting is not Understanding: Recognizing and Addressing
Underspecification in Machine Learning
- Authors: Damien Teney, Maxime Peyrard, Ehsan Abbasnejad
- Abstract summary: Underspecification refers to the existence of multiple models that are indistinguishable in their in-domain accuracy.
We formalize the concept of underspecification and propose a method to identify and partially address it.
- Score: 47.651130958272155
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) models are typically optimized for their accuracy on a
given dataset. However, this predictive criterion rarely captures all desirable
properties of a model, in particular how well it matches a domain expert's
understanding of a task. Underspecification refers to the existence of multiple
models that are indistinguishable in their in-domain accuracy, even though they
differ in other desirable properties such as out-of-distribution (OOD)
performance. Identifying these situations is critical for assessing the
reliability of ML models.
We formalize the concept of underspecification and propose a method to
identify and partially address it. We train multiple models with an
independence constraint that forces them to implement different functions. They
discover predictive features that are otherwise ignored by standard empirical
risk minimization (ERM), which we then distill into a global model with
superior OOD performance. Importantly, we constrain the models to align with
the data manifold to ensure that they discover meaningful features. We
demonstrate the method on multiple datasets in computer vision (collages,
WILDS-Camelyon17, GQA) and discuss general implications of underspecification.
Most notably, in-domain performance cannot serve for OOD model selection
without additional assumptions.
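As a rough illustration of the training scheme described in the abstract, the sketch below trains two models under a penalty on their agreement and then distills their averaged predictions into a single global model. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the agreement penalty is one simple stand-in for the paper's independence constraint, and the data-manifold alignment step is omitted.

```python
import torch
import torch.nn.functional as F

def agreement_penalty(logits_a, logits_b):
    """Inner product of the two predicted distributions: large when the
    models agree, so minimizing it pushes them toward different functions."""
    p_a = F.softmax(logits_a, dim=-1)
    p_b = F.softmax(logits_b, dim=-1)
    return (p_a * p_b).sum(dim=-1).mean()

def train_diverse_pair(model_a, model_b, loader, optimizer, lam=1.0):
    """ERM on both models plus the independence-style penalty.
    `optimizer` is assumed to cover the parameters of both models."""
    for x, y in loader:
        logits_a, logits_b = model_a(x), model_b(x)
        loss = (F.cross_entropy(logits_a, y)
                + F.cross_entropy(logits_b, y)
                + lam * agreement_penalty(logits_a, logits_b))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def distill(student, teachers, loader, optimizer, T=2.0):
    """Distill the ensemble's averaged softened predictions into one
    global model with a KL objective."""
    for x, _ in loader:
        with torch.no_grad():
            target = torch.stack(
                [F.softmax(t(x) / T, dim=-1) for t in teachers]).mean(0)
        loss = F.kl_div(F.log_softmax(student(x) / T, dim=-1),
                        target, reduction="batchmean")
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```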
Related papers
- Increasing Performance And Sample Efficiency With Model-agnostic
Interactive Feature Attributions [3.0655581300025996]
We provide model-agnostic implementations of two popular explanation methods (Occlusion and Shapley values) that let a user enforce entirely different attributions in a complex model; a generic occlusion sketch appears after this list.
We show how our proposed approach can significantly improve the model's performance simply by augmenting its training dataset based on corrected explanations.
arXiv Detail & Related papers (2023-06-28T15:23:28Z)
- A prediction and behavioural analysis of machine learning methods for
modelling travel mode choice [0.26249027950824505]
We conduct a systematic comparison of different modelling approaches, across multiple modelling problems, in terms of the key factors likely to affect model choice.
Results indicate that the models with the highest disaggregate predictive performance provide poorer estimates of behavioural indicators and aggregate mode shares.
It is also observed that the multinomial logit (MNL) model performs robustly in a variety of situations, though ML techniques can improve the estimates of behavioural indices such as Willingness to Pay.
arXiv Detail & Related papers (2023-01-11T11:10:32Z)
- Measuring the Driving Forces of Predictive Performance: Application to
Credit Scoring [0.0]
In credit scoring, machine learning models are known to outperform standard parametric models.
We introduce the XPER methodology to decompose a performance metric into contributions associated with a model's features.
We show that a small number of features can explain a surprisingly large part of the model performance.
arXiv Detail & Related papers (2022-12-12T13:09:46Z)
- Assessing Out-of-Domain Language Model Performance from Few Examples [38.245449474937914]
We address the task of predicting out-of-domain (OOD) performance in a few-shot fashion.
We benchmark performance on this task when predicting from model accuracy on the few-shot examples.
We show that attribution-based factors can help rank relative model OOD performance.
arXiv Detail & Related papers (2022-10-13T04:45:26Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised
Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data, given access to a set of expert models and their predictions alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- VisFIS: Visual Feature Importance Supervision with
Right-for-the-Right-Reason Objectives [84.48039784446166]
We show that model feature importance (FI) supervision can meaningfully improve VQA model accuracy as well as performance on several Right-for-the-Right-Reason metrics.
Our best performing method, Visual Feature Importance Supervision (VisFIS), outperforms strong baselines on benchmark VQA datasets.
Predictions are more accurate when explanations are both plausible and faithful, but not when they are plausible yet unfaithful.
arXiv Detail & Related papers (2022-06-22T17:02:01Z)
- Sharing pattern submodels for prediction with missing values [12.981974894538668]
Missing values are unavoidable in many applications of machine learning and present challenges both during training and at test time.
We propose an alternative approach, called sharing pattern submodels, which i) makes predictions robust to missing values at test time, ii) maintains or improves the predictive power of pattern submodels, and iii) has a short description, enabling improved interpretability.
arXiv Detail & Related papers (2022-06-22T15:09:40Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces; a generic sketch of the underlying iterative-imputation loop appears after this list.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence of each query sample, assigning optimal weights to unlabeled queries; a simplified sketch of the prototype update appears after this list.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
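The Occlusion method named in the Interactive Feature Attributions entry above can be illustrated with a minimal model-agnostic sketch: each feature is masked in turn and the drop in the model's score is recorded as its attribution. This is a generic version, not that paper's implementation; the `predict` callable and the zero baseline are assumptions.

```python
import numpy as np

def occlusion_attribution(predict, x, baseline=0.0):
    """Attribute to each feature the change in the model's output when
    that feature is replaced by a baseline value.

    predict: callable mapping a (n_features,) array to a scalar score.
    x:       the input to explain, shape (n_features,).
    """
    base_score = predict(x)
    attributions = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        x_occluded = x.copy()
        x_occluded[i] = baseline  # mask a single feature
        attributions[i] = base_score - predict(x_occluded)
    return attributions
```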
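The column-wise iterative imputation behind the HyperImpute entry follows a MICE-style loop: initialize missing cells, then repeatedly regress each incomplete column on the others and refresh its imputed values. The sketch below shows only this generic loop with a fixed linear regressor; HyperImpute's contribution is to select and configure the per-column model automatically, which is not shown here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def iterative_impute(X, n_iters=10):
    """Generic MICE-style imputation loop (assumes every column has at
    least one observed value)."""
    X = X.astype(float).copy()
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):          # initialize with column means
        X[missing[:, j], j] = col_means[j]
    for _ in range(n_iters):
        for j in range(X.shape[1]):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Regress the incomplete column on all other columns.
            others = np.delete(X, j, axis=1)
            model = LinearRegression().fit(others[~miss_j], X[~miss_j, j])
            X[miss_j, j] = model.predict(others[miss_j])
    return X
```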
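The transductive update described in the Meta-Learned Confidence entry amounts to a confidence-weighted refinement of class prototypes. The sketch below substitutes a fixed softmax confidence for the meta-learned one, so it illustrates the update rule rather than the paper's actual method.

```python
import torch
import torch.nn.functional as F

def refine_prototypes(prototypes, query_embeds, temperature=1.0):
    """Fold unlabeled query embeddings into class prototypes, weighting
    each query by a (here: softmax-based, not meta-learned) confidence.

    prototypes:   (n_classes, dim) per-class means of the support set.
    query_embeds: (n_query, dim) embeddings of unlabeled query examples.
    """
    # Negative distances act as class logits for the soft assignment.
    dists = torch.cdist(query_embeds, prototypes)      # (n_query, n_classes)
    conf = F.softmax(-dists / temperature, dim=1)
    # Confidence-weighted sum of queries per class, mixed with the
    # original prototypes (each prototype counts with weight 1).
    weighted_sum = conf.t() @ query_embeds             # (n_classes, dim)
    weights = conf.sum(dim=0).unsqueeze(1)             # (n_classes, 1)
    return (prototypes + weighted_sum) / (1.0 + weights)
```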
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the generated content (including all of the information above) and is not responsible for any consequences of its use.