Multigroup Robustness
- URL: http://arxiv.org/abs/2405.00614v1
- Date: Wed, 1 May 2024 16:35:04 GMT
- Title: Multigroup Robustness
- Authors: Lunjia Hu, Charlotte Peale, Judy Hanwen Shen
- Abstract summary: We study multigroup robust algorithms whose robustness guarantees for each subpopulation only degrade with the amount of data corruption inside that subpopulation.
Our techniques establish a new connection between multigroup fairness and robustness.
- Score: 5.659543670443081
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data corruption that are localized to specific partitions of the training dataset. Motivated by critical applications where the learned model is deployed to make predictions about people from a rich collection of overlapping subpopulations, we initiate the study of multigroup robust algorithms whose robustness guarantees for each subpopulation only degrade with the amount of data corruption inside that subpopulation. When the data corruption is not distributed uniformly over subpopulations, our algorithms provide more meaningful robustness guarantees than standard guarantees that are oblivious to how the data corruption and the affected subpopulations are related. Our techniques establish a new connection between multigroup fairness and robustness.
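To make the distinction in the abstract concrete, the following minimal sketch compares the overall corruption rate of a dataset with the corruption rate inside each of several overlapping subpopulations. It is not the paper's algorithm; it only computes the per-group quantity that a multigroup robustness guarantee would depend on. The function name `corruption_rates`, the group names, and the toy data are all hypothetical.

```python
def corruption_rates(groups, corrupted):
    """Return (overall corruption rate, {group name: rate inside that group}).

    groups    -- dict mapping a group name to a set of example indices
                 (groups may overlap); hypothetical input format
    corrupted -- list of booleans, True where the example was corrupted
    """
    overall = sum(corrupted) / len(corrupted)
    per_group = {}
    for name, members in groups.items():
        if members:
            # Fraction of this group's examples that were corrupted.
            per_group[name] = sum(corrupted[i] for i in members) / len(members)
        else:
            # Convention chosen for this sketch: an empty group has rate 0.
            per_group[name] = 0.0
    return overall, per_group


# Toy data (assumed for illustration): 10 examples, the first 3 corrupted.
corrupted = [True, True, True, False, False, False, False, False, False, False]
groups = {
    "A": {0, 1, 2, 3},        # contains all of the corrupted examples
    "B": {3, 4, 5, 6, 7},     # overlaps with A at index 3, no corruption
}

overall, per_group = corruption_rates(groups, corrupted)
print(overall, per_group)
```

In this toy case the corruption is concentrated in group A while group B is untouched. A guarantee stated only in terms of the overall rate treats both groups alike, whereas a guarantee that degrades with the within-group rate can remain strong for B.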
Related papers
- Group Distributionally Robust Dataset Distillation with Risk Minimization [18.07189444450016]
We introduce an algorithm that combines clustering with the minimization of a risk measure on the loss to conduct dataset distillation (DD).
We demonstrate its effective generalization and robustness across subgroups through numerical experiments.
arXiv Detail & Related papers (2024-02-07T09:03:04Z)
- Mitigating Group Bias in Federated Learning for Heterogeneous Devices [1.181206257787103]
Federated Learning is emerging as a privacy-preserving model training approach in distributed edge applications.
Our work proposes a group-fair FL framework that minimizes group bias while preserving privacy and without resource utilization overhead.
arXiv Detail & Related papers (2023-09-13T16:53:48Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with Clustered Aggregation and Knowledge DIStilled Regularization [3.3711670942444014]
Federated learning enables edge devices to train a global model collaboratively without exposing their data.
We tackle a new type of Non-IID data, called cluster-skewed non-IID, discovered in actual data sets.
We propose an aggregation scheme that guarantees equality between clusters.
arXiv Detail & Related papers (2023-02-21T02:53:37Z)
- Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have been shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z)
- Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and the target labels wrongly direct the neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z)
- Robust Trimmed k-means [70.88503833248159]
We propose Robust Trimmed k-means (RTKM) that simultaneously identifies outliers and clusters points.
We show RTKM performs competitively with other methods on single membership data with outliers and multi-membership data without outliers.
arXiv Detail & Related papers (2021-08-16T15:49:40Z)
- Learning Deep Neural Networks under Agnostic Corrupted Supervision [37.441467641123026]
We present an efficient robust algorithm that achieves strong guarantees without any assumption on the type of corruption.
Our algorithm focuses on controlling the collective impact of data points on the average gradient.
Experiments on multiple benchmark datasets have demonstrated the robustness of our algorithm under different types of corruption.
arXiv Detail & Related papers (2021-02-12T19:36:04Z)
- Bayesian Semi-supervised Crowdsourcing [71.20185379303479]
Crowdsourcing has emerged as a powerful paradigm for efficiently labeling large datasets and performing various learning tasks.
This work deals with semi-supervised crowdsourced classification, under two regimes of semi-supervision.
arXiv Detail & Related papers (2020-12-20T23:18:51Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- CycleCluster: Modernising Clustering Regularisation for Deep Semi-Supervised Classification [0.0]
We propose a novel framework, CycleCluster, for deep semi-supervised classification.
Our core optimisation is driven by a new clustering-based regularisation along with graph-based pseudo-labels and a shared deep network.
arXiv Detail & Related papers (2020-01-15T13:34:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.