MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution
Shifts and Training Conflicts
- URL: http://arxiv.org/abs/2202.06523v1
- Date: Mon, 14 Feb 2022 07:40:03 GMT
- Title: MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution
Shifts and Training Conflicts
- Authors: Weixin Liang and James Zou
- Abstract summary: We present MetaShift, a collection of 12,868 sets of natural images across 410 classes.
It provides explicit explanations of what is unique about each of its data sets and a distance score that measures the amount of distribution shift between any two of its data sets.
We show how MetaShift can help to visualize conflicts between data subsets during model training.
- Score: 20.09404891618634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the performance of machine learning models across diverse data
distributions is critically important for reliable applications. Motivated by
this, there is a growing focus on curating benchmark datasets that capture
distribution shifts. While valuable, the existing benchmarks are limited in
that many of them only contain a small number of shifts and they lack
systematic annotation about what is different across different shifts. We
present MetaShift--a collection of 12,868 sets of natural images across 410
classes--to address this challenge. We leverage the natural heterogeneity of
Visual Genome and its annotations to construct MetaShift. The key construction
idea is to cluster images using their metadata, which provides a context for each
image (e.g., "cats with cars" or "cats in bathroom"); these contexts represent distinct
data distributions. MetaShift has two important benefits: first, it contains
orders of magnitude more natural data shifts than previously available. Second,
it provides explicit explanations of what is unique about each of its data sets
and a distance score that measures the amount of distribution shift between any
two of its data sets. We demonstrate the utility of MetaShift in benchmarking
several recent proposals for training models to be robust to data shifts. We
find that simple empirical risk minimization performs best when shifts are
moderate, and that no method has a systematic advantage for large shifts. We also
show how MetaShift can help to visualize conflicts between data subsets during
model training.
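To make the idea of a pairwise subset-shift score concrete, here is a minimal sketch that compares two subsets (e.g. "cats with sofas" vs. "cats with cars") via an MMD-style distance over pre-extracted image features. This is an illustrative stand-in under assumed feature embeddings, not MetaShift's actual distance construction.

```python
# Hypothetical sketch of scoring distribution shift between two image subsets
# via a kernel maximum mean discrepancy on pre-extracted feature vectors.
# This is NOT MetaShift's exact distance score; it only illustrates the idea
# of a pairwise subset-shift score.
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    # Pairwise RBF kernel values between rows of X and rows of Y.
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

def subset_distance(feats_a, feats_b, gamma=0.1):
    # Squared MMD between the two feature sets (larger = bigger shift).
    k_aa = rbf_kernel(feats_a, feats_a, gamma).mean()
    k_bb = rbf_kernel(feats_b, feats_b, gamma).mean()
    k_ab = rbf_kernel(feats_a, feats_b, gamma).mean()
    return k_aa + k_bb - 2.0 * k_ab

# Toy usage with random stand-ins for image embeddings of two subsets.
rng = np.random.default_rng(0)
cats_with_sofa = rng.normal(0.0, 1.0, size=(200, 64))
cats_with_car = rng.normal(0.5, 1.0, size=(150, 64))
print(subset_distance(cats_with_sofa, cats_with_car))
```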
Related papers
- Benchmarking Distribution Shift in Tabular Data with TableShift [32.071534049494076]
TableShift is a distribution shift benchmark for tabular data.
It covers domains including finance, education, public policy, healthcare, and civic participation.
We conduct a large-scale study comparing several state-of-the-art tabular data models alongside robust learning and domain generalization methods.
arXiv Detail & Related papers (2023-12-10T18:19:07Z)
- Diversity-Aware Meta Visual Prompting [111.75306320834629]
We present Diversity-Aware Meta Visual Prompting (DAM-VP), an efficient prompting method for transferring pre-trained models to downstream tasks with a frozen backbone.
We cluster the downstream dataset into small subsets in a diversity-adaptive way, with each subset having its own prompt optimized separately.
All the prompts are optimized with a meta-prompt, which is learned across several datasets.
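A rough sketch of this diversity-adaptive idea, assuming pre-extracted backbone features and additive pixel-space prompts (the function names and shapes here are illustrative, not the authors' implementation):

```python
# Hypothetical sketch: per-cluster visual prompts initialized from a shared
# meta-prompt, in the spirit of DAM-VP. The frozen backbone is omitted.
import torch
from sklearn.cluster import KMeans

def build_cluster_prompts(features, meta_prompt, num_clusters=8):
    # features: numpy array (num_images, feat_dim) of backbone embeddings.
    # Each cluster gets its own prompt, copied from the shared meta-prompt.
    kmeans = KMeans(n_clusters=num_clusters, n_init=10).fit(features)
    prompts = torch.nn.ParameterList(
        torch.nn.Parameter(meta_prompt.clone()) for _ in range(num_clusters)
    )
    return kmeans, prompts

def apply_prompts(images, features, kmeans, prompts):
    # Route each image to its cluster's prompt and add it to the pixels.
    # images: tensor (n, 3, H, W); meta_prompt is assumed to share that shape.
    cluster_ids = kmeans.predict(features)
    return torch.stack([images[i] + prompts[c] for i, c in enumerate(cluster_ids)])
```

Training would then alternate between tuning each cluster's prompt on its own subset and updating the shared meta-prompt across datasets, as the summary above describes.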
arXiv Detail & Related papers (2023-03-14T17:59:59Z)
- Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation [85.13934713535527]
Distribution shift is a major source of failure for machine learning models.
We introduce the notion of a dataset interface: a framework that, given an input dataset and a user-specified shift, returns instances that exhibit the desired shift.
We demonstrate how applying this dataset interface to the ImageNet dataset enables studying model behavior across a diverse array of distribution shifts.
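Abstractly, a dataset interface maps (class, user-specified shift) to examples exhibiting that shift. The Python protocol below is purely hypothetical, meant only to make that notion concrete; it does not reflect the paper's counterfactual generation machinery.

```python
# Hypothetical interface only: names and signatures are illustrative.
from typing import Any, List, Protocol, Tuple

class DatasetInterface(Protocol):
    def generate(self, class_name: str, shift: str, n: int) -> List[Tuple[Any, str]]:
        """Return n (image, label) pairs of `class_name` under `shift`,
        e.g. generate("dog", "in the snow", 100)."""
        ...

def evaluate_under_shift(model, interface: DatasetInterface,
                         class_name: str, shift: str, n: int = 100) -> float:
    # Accuracy of `model` (assumed to expose .predict) on counterfactual
    # examples exhibiting the requested shift.
    samples = interface.generate(class_name, shift, n)
    correct = sum(model.predict(x) == y for x, y in samples)
    return correct / len(samples)
```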
arXiv Detail & Related papers (2023-02-15T18:56:26Z)
- Estimating and Explaining Model Performance When Both Covariates and Labels Shift [36.94826820536239]
We propose a new distribution shift model, Sparse Joint Shift (SJS), which considers the joint shift of both labels and a few features.
We also propose SEES, an algorithmic framework to characterize the distribution shift under SJS and to estimate a model's performance on new data without any labels.
arXiv Detail & Related papers (2022-09-18T01:16:16Z)
- AnoShift: A Distribution Shift Benchmark for Unsupervised Anomaly Detection [7.829710051617368]
We introduce an unsupervised anomaly detection benchmark with data that shifts over time, built over Kyoto-2006+, a traffic dataset for network intrusion detection.
We first highlight the non-stationary nature of the data, using a basic per-feature analysis, t-SNE, and an Optimal Transport approach for measuring the overall distribution distances between years.
We validate the performance degradation over time with diverse models, ranging from classical approaches to deep learning.
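A small illustration of this kind of analysis (not the benchmark's own tooling): averaging per-feature 1-D Wasserstein (optimal transport) distances between two years of numeric traffic features. The data here are random stand-ins.

```python
# Illustrative only: average per-feature 1-D Wasserstein distance between two
# years of numeric traffic features, mirroring the per-feature plus optimal
# transport flavour of the analysis described above.
import numpy as np
from scipy.stats import wasserstein_distance

def yearly_shift(feats_year_a: np.ndarray, feats_year_b: np.ndarray) -> float:
    # Both arrays are (num_samples, num_features); features are compared
    # marginally and the per-feature distances are averaged.
    dists = [
        wasserstein_distance(feats_year_a[:, j], feats_year_b[:, j])
        for j in range(feats_year_a.shape[1])
    ]
    return float(np.mean(dists))

rng = np.random.default_rng(0)
year_a = rng.normal(0.0, 1.0, size=(5000, 20))
year_b = rng.normal(0.3, 1.2, size=(5000, 20))  # drifted toy data
print(yearly_shift(year_a, year_b))
```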
arXiv Detail & Related papers (2022-06-30T17:59:22Z)
- A unified framework for dataset shift diagnostics [2.449909275410288]
Supervised learning techniques typically assume training data originates from the target population.
Yet, dataset shift frequently arises and, if not adequately taken into account, may decrease the performance of the resulting predictors.
We propose a novel and flexible framework called DetectShift that quantifies and tests for multiple dataset shifts.
arXiv Detail & Related papers (2022-05-17T13:34:45Z)
- Deep learning model solves change point detection for multiple change types [69.77452691994712]
Change point detection aims to catch an abrupt disorder in the data distribution.
We propose an approach that works in the multiple-distributions scenario.
arXiv Detail & Related papers (2022-04-15T09:44:21Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- How to distribute data across tasks for meta-learning? [59.608652082495624]
We show that the optimal number of data points per task depends on the budget, but it converges to a unique constant value for large budgets.
Our results suggest a simple and efficient procedure for data collection.
arXiv Detail & Related papers (2021-03-15T15:38:47Z)
- Combat Data Shift in Few-shot Learning with Knowledge Graph [42.59886121530736]
In real-world applications, the few-shot learning paradigm often suffers from data shift.
Most existing few-shot learning approaches are not designed with data shift in mind.
We propose a novel metric-based meta-learning framework to extract task-specific representations and task-shared representations.
arXiv Detail & Related papers (2021-01-27T12:35:18Z)
- WILDS: A Benchmark of in-the-Wild Distribution Shifts [157.53410583509924]
Distribution shifts can substantially degrade the accuracy of machine learning systems deployed in the wild.
We present WILDS, a curated collection of 8 benchmark datasets that reflect a diverse range of distribution shifts.
We show that standard training results in substantially lower out-of-distribution than in-distribution performance.
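WILDS is distributed as a Python package; below is a minimal loading sketch based on its documented quickstart. The dataset choice and transforms are illustrative, and API details may differ across versions.

```python
# Minimal WILDS usage sketch (illustrative; follows the documented quickstart).
import torchvision.transforms as transforms
from wilds import get_dataset
from wilds.common.data_loaders import get_train_loader

# Download one of the benchmark datasets and build a standard train loader.
dataset = get_dataset(dataset="iwildcam", download=True)
train_data = dataset.get_subset(
    "train",
    transform=transforms.Compose([transforms.Resize((448, 448)),
                                  transforms.ToTensor()]),
)
train_loader = get_train_loader("standard", train_data, batch_size=16)

for x, y, metadata in train_loader:
    ...  # standard (in-distribution) training loop goes here
```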
arXiv Detail & Related papers (2020-12-14T11:14:56Z)