Mitigating Biases with Diverse Ensembles and Diffusion Models
- URL: http://arxiv.org/abs/2311.16176v3
- Date: Wed, 6 Mar 2024 21:57:06 GMT
- Title: Mitigating Biases with Diverse Ensembles and Diffusion Models
- Authors: Luca Scimeca, Alexander Rubinstein, Damien Teney, Seong Joon Oh,
Armand Mihai Nicolicioiu, Yoshua Bengio
- Abstract summary: We propose an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs).
We show that DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features.
We show that DPM-guided diversification is sufficient to remove dependence on primary shortcut cues, without a need for additional supervised signals.
- Score: 99.6100669122048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spurious correlations in the data, where multiple cues are predictive of the
target labels, often lead to a phenomenon known as shortcut learning, where a
model relies on erroneous, easy-to-learn cues while ignoring reliable ones. In
this work, we propose an ensemble diversification framework exploiting
Diffusion Probabilistic Models (DPMs) to mitigate this form of bias. We show
that at particular training intervals, DPMs can generate images with novel
feature combinations, even when trained on samples displaying correlated input
features. We leverage this crucial property to generate synthetic
counterfactuals to increase model diversity via ensemble disagreement. We show
that DPM-guided diversification is sufficient to remove dependence on primary
shortcut cues, without a need for additional supervised signals. We further
empirically quantify its efficacy on several diversification objectives, and
finally show improved generalization and diversification performance on par
with prior work that relies on auxiliary data collection.
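As a minimal sketch of the disagreement-based diversification idea described in the abstract, the snippet below trains an ensemble with cross-entropy on labeled data while a pairwise agreement penalty on DPM-generated counterfactuals pushes members apart. The model architecture, loss weighting, and the inner-product agreement penalty are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): each ensemble member fits
# the labeled data with cross-entropy, while a disagreement term pushes
# members apart on synthetic counterfactuals (here: any unlabeled batch).
import torch
import torch.nn as nn
import torch.nn.functional as F

def diversification_loss(models, x_labeled, y, x_counterfactual, lam=1.0):
    """Cross-entropy on real data + pairwise agreement penalty on counterfactuals."""
    ce = sum(F.cross_entropy(m(x_labeled), y) for m in models)

    # Softmax predictions of every member on the synthetic counterfactuals.
    probs = [F.softmax(m(x_counterfactual), dim=1) for m in models]

    # Penalize pairwise agreement: the inner product of two probability
    # vectors is maximal when two members predict the same class.
    agreement = 0.0
    n = len(models)
    for i in range(n):
        for j in range(i + 1, n):
            agreement = agreement + (probs[i] * probs[j]).sum(dim=1).mean()

    return ce + lam * agreement / (n * (n - 1) / 2)

# Toy usage with linear "models" on flattened 8x8 inputs.
models = [nn.Linear(64, 10) for _ in range(3)]
opt = torch.optim.Adam([p for m in models for p in m.parameters()], lr=1e-3)
x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))
x_cf = torch.randn(32, 64)  # stand-in for DPM-generated counterfactuals
loss = diversification_loss(models, x, y, x_cf)
opt.zero_grad(); loss.backward(); opt.step()
```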
Related papers
- PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning [49.60634126342945]
Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes.
Recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information.
We employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues.
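As a schematic illustration of that combination (not the PairCFR implementation), the sketch below adds a supervised contrastive term over a batch containing original/counterfactual pairs to a standard cross-entropy loss; `tau` and `lam` are hypothetical hyperparameters.

```python
# Schematic CE + supervised contrastive loss: the batch holds
# original/counterfactual pairs, and the contrastive term pulls
# same-label embeddings together so the model keeps global context
# instead of latching onto the edited spans alone.
import torch
import torch.nn.functional as F

def ce_plus_contrastive(logits, embeddings, labels, tau=0.1, lam=0.5):
    ce = F.cross_entropy(logits, labels)

    z = F.normalize(embeddings, dim=1)          # unit-norm embeddings
    sim = z @ z.t() / tau                       # pairwise similarities
    mask = torch.eye(len(z), dtype=torch.bool)  # exclude self-pairs
    sim = sim.masked_fill(mask, float('-inf'))

    # Same-label pairs (e.g., an original and any sample sharing its
    # class) are the positives; all others are negatives.
    pos = (labels[:, None] == labels[None, :]) & ~mask
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    contrastive = -(log_prob * pos).sum(dim=1) / pos.sum(dim=1).clamp(min=1)

    return ce + lam * contrastive.mean()
```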
arXiv Detail & Related papers (2024-06-09T07:29:55Z)
- Bayesian Joint Additive Factor Models for Multiview Learning [7.254731344123118]
A motivating application arises in the context of precision medicine where multi-omics data are collected to correlate with clinical outcomes.
We propose a joint additive factor regression model (JAFAR) with a structured additive design, accounting for shared and view-specific components.
Prediction of time-to-labor onset from immunome, metabolome, and proteome data illustrates performance gains against state-of-the-art competitors.
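The additive structure is easy to see in a toy simulation. The sketch below, with made-up dimensions and Gaussian loadings (it is not JAFAR's Bayesian inference procedure), generates several views from shared plus view-specific factors and an outcome that loads on the shared part, which is what makes the shared factors predictive across views.

```python
# Toy simulation of a shared-plus-view-specific additive factor
# structure (illustrative only; JAFAR is a Bayesian model with
# structured priors and a full inference scheme).
import numpy as np

rng = np.random.default_rng(0)
n, k_shared, k_view = 200, 2, 3   # samples, shared / view-specific factors
dims = {"immunome": 50, "metabolome": 40, "proteome": 60}  # illustrative views

eta = rng.normal(size=(n, k_shared))   # shared factors across all views
views = {}
for name, p in dims.items():
    Lam = rng.normal(size=(p, k_shared))   # shared-factor loadings
    Gam = rng.normal(size=(p, k_view))     # view-specific loadings
    phi = rng.normal(size=(n, k_view))     # view-specific factors
    views[name] = eta @ Lam.T + phi @ Gam.T + 0.1 * rng.normal(size=(n, p))

# The outcome loads on the shared factors only.
y = eta @ rng.normal(size=k_shared) + 0.1 * rng.normal(size=n)
```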
arXiv Detail & Related papers (2024-06-02T15:35:45Z)
- Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks [92.32670915472099]
We propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs).
We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.
arXiv Detail & Related papers (2023-10-03T17:37:52Z) - Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives [5.549794481031468]
Devising deep latent variable models for multi-modal data has been a long-standing theme in machine learning research.
In this work, we consider a variational objective that can tightly approximate the data log-likelihood.
We develop more flexible aggregation schemes that avoid the inductive biases in PoE or MoE approaches.
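A common way to obtain such flexible, order-insensitive aggregation is DeepSets-style pooling over per-modality encodings; the sketch below illustrates the general pattern under assumed dimensions and is not the paper's exact scheme. Summing transformed encodings is invariant to the order (and number) of modalities, unlike a fixed PoE/MoE combination rule.

```python
# Permutation-invariant aggregation of modality encodings (DeepSets
# style): sum pooling makes the output independent of modality order.
import torch
import torch.nn as nn

class PermutationInvariantAggregator(nn.Module):
    def __init__(self, enc_dim=32, latent_dim=16):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(enc_dim, 64), nn.ReLU(), nn.Linear(64, 64))
        self.rho = nn.Linear(64, 2 * latent_dim)  # posterior mean and log-variance

    def forward(self, modality_encodings):
        # modality_encodings: list of (batch, enc_dim) tensors, any order/subset.
        pooled = torch.stack([self.phi(e) for e in modality_encodings]).sum(dim=0)
        mu, logvar = self.rho(pooled).chunk(2, dim=1)
        return mu, logvar

agg = PermutationInvariantAggregator()
encs = [torch.randn(8, 32) for _ in range(3)]   # three modalities
mu, logvar = agg(encs)
assert torch.allclose(mu, agg(encs[::-1])[0], atol=1e-6)  # order does not matter
```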
arXiv Detail & Related papers (2023-09-01T10:32:21Z) - Diff-Instruct: A Universal Approach for Transferring Knowledge From
Pre-trained Diffusion Models [77.83923746319498]
We propose a framework called Diff-Instruct to instruct the training of arbitrary generative models.
We show that Diff-Instruct results in state-of-the-art single-step diffusion-based models.
Experiments on refining GAN models show that Diff-Instruct consistently improves pre-trained GAN generators.
arXiv Detail & Related papers (2023-05-29T04:22:57Z)
- Even Small Correlation and Diversity Shifts Pose Dataset-Bias Issues [19.4921353136871]
We study two types of distribution shifts: diversity shifts, which occur when test samples exhibit patterns unseen during training, and correlation shifts, which occur when test data present a different correlation between seen invariant and spurious features.
We propose an integrated protocol to analyze both types of shifts using datasets where they co-exist in a controllable manner.
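A controllable correlation level is easy to realize in a toy dataset. The sketch below is a colored-image construction of our own, not the paper's benchmark: it sets the train/test correlation between a spurious color cue and the label independently.

```python
# Toy dataset with a tunable correlation between a spurious cue
# (the color channel an image occupies) and its label.
import numpy as np

def make_correlated_split(images, labels, correlation, rng):
    """Color each image by its label with probability `correlation`,
    otherwise by a random label, so the correlation level of each
    split can be chosen independently."""
    n = len(images)
    colored = np.zeros((n, 3) + images.shape[1:], dtype=np.float32)
    for i in range(n):
        spurious = labels[i] if rng.random() < correlation else rng.integers(3)
        colored[i, spurious] = images[i]   # place image in one RGB channel
    return colored

rng = np.random.default_rng(0)
imgs = rng.random((100, 28, 28)).astype(np.float32)
labels = rng.integers(0, 3, size=100)
train = make_correlated_split(imgs, labels, correlation=0.95, rng=rng)  # strong shortcut
test = make_correlated_split(imgs, labels, correlation=1 / 3, rng=rng)  # decorrelated
```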
arXiv Detail & Related papers (2023-05-09T23:40:23Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when the bias lies only in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
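For reference, here is a minimal sketch of the group DRO objective mentioned above: the worst loss over annotated groups rather than the average loss. This is a simplified batch version; the full method of Sagawa et al. uses an online group-reweighting scheme.

```python
# Group DRO in its simplest form: optimize the worst-group loss.
# If no annotated group captures a given spurious correlation, this
# objective cannot protect against it -- the failure mode analyzed
# in the paper.
import torch
import torch.nn.functional as F

def group_dro_loss(logits, labels, group_ids, num_groups):
    losses = F.cross_entropy(logits, labels, reduction='none')
    group_losses = []
    for g in range(num_groups):
        mask = group_ids == g
        if mask.any():
            group_losses.append(losses[mask].mean())
    return torch.stack(group_losses).max()
```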
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Learning from demonstration using products of experts: applications to manipulation and task prioritization [12.378784643460474]
We show that the fusion of models in different task spaces can be expressed as a product of experts (PoE).
Multiple experiments are presented to show that learning the different models jointly in the PoE framework significantly improves the quality of the model.
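The PoE fusion has a convenient closed form for Gaussian experts, which the sketch below illustrates generically (it is not the paper's task-space models): the product of Gaussians is again Gaussian, with precision-weighted mean, so the fused estimate is pulled toward the more confident expert.

```python
# Closed-form fusion of Gaussian experts N(mu_i, Sigma_i):
#   Sigma = (sum_i Sigma_i^-1)^-1
#   mu    = Sigma * sum_i Sigma_i^-1 mu_i
import numpy as np

def gaussian_poe(means, covs):
    """Return the mean and covariance of the product of Gaussian experts."""
    precisions = [np.linalg.inv(S) for S in covs]
    fused_cov = np.linalg.inv(sum(precisions))
    fused_mean = fused_cov @ sum(P @ m for P, m in zip(precisions, means))
    return fused_mean, fused_cov

# Two experts, e.g., preferences expressed in two task spaces mapped
# to a common frame.
m1, S1 = np.array([0.0, 0.0]), np.eye(2) * 0.5
m2, S2 = np.array([1.0, 1.0]), np.eye(2) * 2.0
mu, Sigma = gaussian_poe([m1, m2], [S1, S2])  # pulled toward the tighter expert
```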
arXiv Detail & Related papers (2020-10-07T16:24:41Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
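One concrete way to implement such a pressure, sketched below as a hedged illustration rather than the paper's exact objective, is to penalize cross-correlation between the representations learned by different members, so that each encodes the task through different features.

```python
# Diversity penalty between two ensemble members' representations:
# drive the cross-correlation of their (batch, dim) features toward
# zero while each member is trained on the task loss separately.
import torch

def representation_diversity_penalty(feats_a, feats_b):
    """Mean squared cross-correlation between two feature sets."""
    a = (feats_a - feats_a.mean(0)) / (feats_a.std(0) + 1e-6)
    b = (feats_b - feats_b.mean(0)) / (feats_b.std(0) + 1e-6)
    corr = a.t() @ b / len(a)   # (dim_a, dim_b) cross-correlation matrix
    return corr.pow(2).mean()
```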
arXiv Detail & Related papers (2020-06-12T12:23:50Z)