Related papers: An Information-theoretic Approach to Distribution Shifts

An Information-theoretic Approach to Distribution Shifts

URL: http://arxiv.org/abs/2106.03783v1
Date: Mon, 7 Jun 2021 16:44:21 GMT
Title: An Information-theoretic Approach to Distribution Shifts
Authors: Marco Federici, Ryota Tomioka, Patrick Forr\'e
Abstract summary: Safely deploying machine learning models to the real world is often a challenging process. Models trained with data obtained from a specific geographic location tend to fail when queried with data obtained elsewhere. neural networks that are fit to a subset of the population might carry some selection bias into their decision process.
Score: 9.475039534437332
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Safely deploying machine learning models to the real world is often a challenging process. Models trained with data obtained from a specific geographic location tend to fail when queried with data obtained elsewhere, agents trained in a simulation can struggle to adapt when deployed in the real world or novel environments, and neural networks that are fit to a subset of the population might carry some selection bias into their decision process. In this work, we describe the problem of data shift from a novel information-theoretic perspective by (i) identifying and describing the different sources of error, (ii) comparing some of the most promising objectives explored in the recent domain generalization, and fair classification literature. From our theoretical analysis and empirical evaluation, we conclude that the model selection procedure needs to be guided by careful considerations regarding the observed data, the factors used for correction, and the structure of the data-generating process.

Related papers

Out-of-Distribution Detection on Graphs: A Survey [58.47395497985277]
Graph out-of-distribution (GOOD) detection focuses on identifying graph data that deviates from the distribution seen during training. We categorize existing methods into four types: enhancement-based, reconstruction-based, information propagation-based, and classification-based approaches. We discuss practical applications and theoretical foundations, highlighting the unique challenges posed by graph data.
arXiv Detail & Related papers (2025-02-12T04:07:12Z)
A Survey of Deep Graph Learning under Distribution Shifts: from Graph Out-of-Distribution Generalization to Adaptation [59.14165404728197]
We provide an up-to-date and forward-looking review of deep graph learning under distribution shifts. Specifically, we cover three primary scenarios: graph OOD generalization, training-time graph OOD adaptation, and test-time graph OOD adaptation. To provide a better understanding of the literature, we introduce a systematic taxonomy that classifies existing methods into model-centric and data-centric approaches.
arXiv Detail & Related papers (2024-10-25T02:39:56Z)
Domain Generalization through Meta-Learning: A Survey [6.524870790082051]
Deep neural networks (DNNs) have revolutionized artificial intelligence but often lack performance when faced with out-of-distribution (OOD) data. This survey paper delves into the realm of meta-learning with a focus on its contribution to domain generalization.
arXiv Detail & Related papers (2024-04-03T14:55:17Z)
Supervised Algorithmic Fairness in Distribution Shifts: A Survey [17.826312801085052]
In real-world applications, machine learning models are often trained on a specific dataset but deployed in environments where the data distribution may shift. This shift can lead to unfair predictions, disproportionately affecting certain groups characterized by sensitive attributes, such as race and gender.
arXiv Detail & Related papers (2024-02-02T11:26:18Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
Improving Heterogeneous Model Reuse by Density Estimation [105.97036205113258]
This paper studies multiparty learning, aiming to learn a model using the private data of different participants. Model reuse is a promising solution for multiparty learning, assuming that a local model has been trained for each party.
arXiv Detail & Related papers (2023-05-23T09:46:54Z)
Identifying the Context Shift between Test Benchmarks and Production Data [1.2259552039796024]
There exists a performance gap between machine learning models' accuracy on dataset benchmarks and real-world production data. We outline two methods for identifying changes in context that lead to distribution shifts and model prediction errors. We present two case-studies to highlight the implicit assumptions underlying applied machine learning models that tend to lead to errors.
arXiv Detail & Related papers (2022-07-03T14:54:54Z)
DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data. We propose a general framework to solve the above two challenges simultaneously. We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class. This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples. Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
Learning Neural Models for Natural Language Processing in the Face of Distributional Shift [10.990447273771592]
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications. It builds upon the assumption that the data distribution is stationary, ie. that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. It is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime
arXiv Detail & Related papers (2021-09-03T14:29:20Z)
Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task. The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator. We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
Information-theoretic Evolution of Model Agnostic Global Explanations [10.921146104622972]
We present a novel model-agnostic approach that derives rules to globally explain the behavior of classification models trained on numerical and/or categorical data. Our approach has been deployed in a leading digital marketing suite of products.
arXiv Detail & Related papers (2021-05-14T16:52:16Z)
Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments. We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data. Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.