Removing Bias in Multi-modal Classifiers: Regularization by Maximizing
Functional Entropies
- URL: http://arxiv.org/abs/2010.10802v1
- Date: Wed, 21 Oct 2020 07:40:33 GMT
- Authors: Itai Gat and Idan Schwartz and Alexander Schwing and Tamir Hazan
- Abstract summary: Some modalities can more easily contribute to the classification results than others.
We develop a method based on the log-Sobolev inequality, which bounds the functional entropy with the functional-Fisher-information.
On the two challenging multi-modal datasets VQA-CPv2 and SocialIQ, we obtain state-of-the-art results while more uniformly exploiting the modalities.
- Score: 88.0813215220342
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Many recent datasets contain a variety of different data modalities, for
instance, image, question, and answer data in visual question answering (VQA).
When training deep net classifiers on those multi-modal datasets, the
modalities get exploited at different scales, i.e., some modalities can more
easily contribute to the classification results than others. This is suboptimal
because the classifier is inherently biased towards a subset of the modalities.
To alleviate this shortcoming, we propose a novel regularization term based on
the functional entropy. Intuitively, this term encourages balancing the
contribution of each modality to the classification result. However,
regularization with the functional entropy is challenging. To address this, we
develop a method based on the log-Sobolev inequality, which bounds the
functional entropy with the functional-Fisher-information. Intuitively, this
maximizes the amount of information that the modalities contribute. On the two
challenging multi-modal datasets VQA-CPv2 and SocialIQ, we obtain
state-of-the-art results while more uniformly exploiting the modalities. In
addition, we demonstrate the efficacy of our method on Colored MNIST.
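The bound mentioned in the abstract can be checked numerically on a toy problem. The sketch below is not the authors' implementation. It assumes the standard Gaussian measure, for which the log-Sobolev inequality reads Ent(f) <= (1/2) E[||grad f||^2 / f], with Ent(f) = E[f log f] - E[f] log E[f] for a positive function f. The names `toy_f`, `toy_grad_sq_norm`, and `estimate` are hypothetical and exist only for illustration.

```python
import math
import random


def toy_f(z):
    """Hypothetical smooth positive function of a 2-D input (always >= 0.25)."""
    return 1.0 + 0.5 * math.sin(z[0]) + 0.25 * math.cos(z[1])


def toy_grad_sq_norm(z):
    """Squared Euclidean norm of the analytic gradient of toy_f at z."""
    g0 = 0.5 * math.cos(z[0])
    g1 = -0.25 * math.sin(z[1])
    return g0 * g0 + g1 * g1


def estimate(n_samples, seed):
    """Monte Carlo estimates of Ent(f) and of the bound (1/2) E[||grad f||^2 / f].

    Samples z from a standard 2-D Gaussian (an assumption of this sketch).
    """
    rng = random.Random(seed)
    sum_f = sum_f_log_f = sum_fisher = 0.0
    for _ in range(n_samples):
        z = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        f = toy_f(z)
        sum_f += f
        sum_f_log_f += f * math.log(f)
        sum_fisher += toy_grad_sq_norm(z) / f
    mean_f = sum_f / n_samples
    # Functional entropy: E[f log f] - E[f] log E[f]
    entropy = sum_f_log_f / n_samples - mean_f * math.log(mean_f)
    # Right-hand side of the Gaussian log-Sobolev inequality
    bound = 0.5 * sum_fisher / n_samples
    return entropy, bound


if __name__ == "__main__":
    entropy, bound = estimate(20000, seed=0)
    print(f"functional entropy ~ {entropy:.4f}, log-Sobolev bound ~ {bound:.4f}")
```

The entropy itself needs expectations of f log f, while the bound only needs gradients of f. In a deep net setting those gradients would presumably come from automatic differentiation rather than a hand-written formula.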
Related papers
- A Complete Decomposition of KL Error using Refined Information and Mode Interaction Selection [11.994525728378603]
We revisit the classical formulation of the log-linear model with a focus on higher-order mode interactions.
We find that our learned distributions are able to more efficiently use the finite amount of data which is available in practice.
arXiv Detail & Related papers (2024-10-15T18:08:32Z)
- AnyLoss: Transforming Classification Metrics into Loss Functions [21.34290540936501]
Evaluation metrics can be used to assess the performance of models in binary classification tasks.
Most metrics are derived from a confusion matrix in a non-differentiable form, making it difficult to generate a differentiable loss function that could directly optimize them.
We propose a general-purpose approach that transforms any confusion matrix-based metric into a loss function, AnyLoss, that can be used in optimization processes.
arXiv Detail & Related papers (2024-05-23T16:14:16Z)
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- Learning Unseen Modality Interaction [54.23533023883659]
Multimodal learning assumes all modality combinations of interest are available during training to learn cross-modal correspondences.
We pose the problem of unseen modality interaction and introduce a first solution.
It exploits a module that projects the multidimensional features of different modalities into a common space with rich information preserved.
arXiv Detail & Related papers (2023-06-22T10:53:10Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- Perceptual Score: What Data Modalities Does Your Model Perceive? [73.75255606437808]
We introduce the perceptual score, a metric that assesses the degree to which a model relies on the different subsets of the input features.
We find that recent, more accurate multi-modal models for visual question-answering tend to perceive the visual data less than their predecessors.
Using the perceptual score also helps to analyze model biases by decomposing the score into data subset contributions.
arXiv Detail & Related papers (2021-10-27T12:19:56Z)
- Learning to Transfer with von Neumann Conditional Divergence [14.926485055255942]
We introduce the recently proposed von Neumann conditional divergence to improve the transferability across multiple domains.
We design novel learning objectives assuming those source tasks are observed either simultaneously or sequentially.
In both scenarios, we obtain favorable performance against state-of-the-art methods in terms of smaller generalization error on new tasks and less catastrophic forgetting on source tasks (in the sequential setup).
arXiv Detail & Related papers (2021-08-07T22:18:23Z)
- Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing [83.78668073898001]
We introduce a family of entropy regularizers, which includes label smoothing as a special case.
We find that variance in model performance can be explained largely by the resulting entropy of the model.
We advise the use of other entropy regularization methods in place of label smoothing.
arXiv Detail & Related papers (2020-05-02T12:46:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.