An Information-theoretic Approach to Distribution Shifts
- URL: http://arxiv.org/abs/2106.03783v1
- Date: Mon, 7 Jun 2021 16:44:21 GMT
- Title: An Information-theoretic Approach to Distribution Shifts
- Authors: Marco Federici, Ryota Tomioka, Patrick Forr\'e
- Abstract summary: Safely deploying machine learning models to the real world is often a challenging process.
Models trained with data obtained from a specific geographic location tend to fail when queried with data obtained elsewhere.
neural networks that are fit to a subset of the population might carry some selection bias into their decision process.
- Score: 9.475039534437332
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Safely deploying machine learning models to the real world is often a
challenging process. Models trained with data obtained from a specific
geographic location tend to fail when queried with data obtained elsewhere,
agents trained in a simulation can struggle to adapt when deployed in the real
world or novel environments, and neural networks that are fit to a subset of
the population might carry some selection bias into their decision process. In
this work, we describe the problem of data shift from a novel
information-theoretic perspective by (i) identifying and describing the
different sources of error, (ii) comparing some of the most promising
objectives explored in the recent domain generalization, and fair
classification literature. From our theoretical analysis and empirical
evaluation, we conclude that the model selection procedure needs to be guided
by careful considerations regarding the observed data, the factors used for
correction, and the structure of the data-generating process.
Related papers
- Supervised Algorithmic Fairness in Distribution Shifts: A Survey [17.826312801085052]
In real-world applications, machine learning models are often trained on a specific dataset but deployed in environments where the data distribution may shift.
This shift can lead to unfair predictions, disproportionately affecting certain groups characterized by sensitive attributes, such as race and gender.
arXiv Detail & Related papers (2024-02-02T11:26:18Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Variational $f$-Divergence and Derangements for Discriminative Mutual
Information Estimation [4.114444605090134]
We propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence.
Experiments on reference scenarios demonstrate that our approach outperforms state-of-the-art neural estimators both in terms of accuracy and complexity.
arXiv Detail & Related papers (2023-05-31T16:54:25Z) - Improving Heterogeneous Model Reuse by Density Estimation [105.97036205113258]
This paper studies multiparty learning, aiming to learn a model using the private data of different participants.
Model reuse is a promising solution for multiparty learning, assuming that a local model has been trained for each party.
arXiv Detail & Related papers (2023-05-23T09:46:54Z) - Identifying the Context Shift between Test Benchmarks and Production
Data [1.2259552039796024]
There exists a performance gap between machine learning models' accuracy on dataset benchmarks and real-world production data.
We outline two methods for identifying changes in context that lead to distribution shifts and model prediction errors.
We present two case-studies to highlight the implicit assumptions underlying applied machine learning models that tend to lead to errors.
arXiv Detail & Related papers (2022-07-03T14:54:54Z) - DRFLM: Distributionally Robust Federated Learning with Inter-client
Noise via Local Mixup [58.894901088797376]
federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z) - Selecting the suitable resampling strategy for imbalanced data
classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z) - Learning Neural Models for Natural Language Processing in the Face of
Distributional Shift [10.990447273771592]
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications.
It builds upon the assumption that the data distribution is stationary, ie. that the data is sampled from a fixed distribution both at training and test time.
This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information.
It is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime
arXiv Detail & Related papers (2021-09-03T14:29:20Z) - Learning Bias-Invariant Representation by Cross-Sample Mutual
Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z) - Information-theoretic Evolution of Model Agnostic Global Explanations [10.921146104622972]
We present a novel model-agnostic approach that derives rules to globally explain the behavior of classification models trained on numerical and/or categorical data.
Our approach has been deployed in a leading digital marketing suite of products.
arXiv Detail & Related papers (2021-05-14T16:52:16Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.