Bridging the Projection Gap: Overcoming Projection Bias Through Parameterized Distance Learning
- URL: http://arxiv.org/abs/2309.01390v2
- Date: Tue, 2 Apr 2024 05:20:01 GMT
- Title: Bridging the Projection Gap: Overcoming Projection Bias Through Parameterized Distance Learning
- Authors: Chong Zhang, Mingyu Jin, Qinkai Yu, Haochen Xue, Shreyank N Gowda, Xiaobo Jin
- Abstract summary: Generalized zero-shot learning (GZSL) aims to recognize samples from both seen and unseen classes using only seen class samples for training.
GZSL methods are prone to bias towards seen classes during inference due to the projection function being learned from seen classes.
We address this projection bias by proposing to learn a parameterized Mahalanobis distance metric for robust inference.
- Score: 9.26015904497319
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generalized zero-shot learning (GZSL) aims to recognize samples from both seen and unseen classes using only seen class samples for training. However, GZSL methods are prone to bias towards seen classes during inference because the projection function is learned from seen classes. Most methods focus on learning an accurate projection, but some bias in the projection is inevitable. We address this projection bias by proposing to learn a parameterized Mahalanobis distance metric for robust inference. Our key insight is that the distance computation during inference is critical, even with a biased projection. We make two main contributions: (1) we extend the VAEGAN (Variational Autoencoder & Generative Adversarial Network) architecture with two branches that separately output the projection of samples from seen and unseen classes, enabling more robust distance learning; (2) we introduce a novel loss function to optimize the Mahalanobis distance representation and reduce projection bias. Extensive experiments on four datasets show that our approach outperforms state-of-the-art GZSL techniques, with improvements of up to 3.5% on the harmonic mean metric.
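The paper's central inference-time idea can be illustrated with a short sketch: score each class prototype with a learned Mahalanobis distance d(x, c) = (x - mu_c)^T M (x - mu_c) instead of a fixed Euclidean distance. The PyTorch module below is a minimal illustration under our own assumptions; the low-rank factorization M = LL^T + eps*I, the `MahalanobisScorer` name, and all hyperparameters are ours, not the paper's exact parameterization or loss.

```python
import torch
import torch.nn as nn

class MahalanobisScorer(nn.Module):
    """Learned Mahalanobis distance d(x, c) = (x - mu_c)^T M (x - mu_c),
    with M = L L^T + eps * I kept positive semi-definite by construction.
    A hypothetical sketch; the paper's parameterization may differ."""

    def __init__(self, dim: int, rank: int = 32, eps: float = 1e-4):
        super().__init__()
        self.L = nn.Parameter(torch.randn(dim, rank) * 0.01)  # learnable low-rank factor
        self.eps = eps

    def forward(self, x: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) projected samples; prototypes: (num_classes, dim)
        diff = x.unsqueeze(1) - prototypes.unsqueeze(0)   # (batch, num_classes, dim)
        proj = diff @ self.L                              # (batch, num_classes, rank)
        # squared Mahalanobis term plus a small isotropic term for stability
        return (proj ** 2).sum(-1) + self.eps * (diff ** 2).sum(-1)

# Inference: assign each sample to the nearest prototype under the learned metric,
# where prototypes for seen and unseen classes would come from the two VAEGAN branches.
scorer = MahalanobisScorer(dim=512)
x = torch.randn(8, 512)             # projected test features
prototypes = torch.randn(50, 512)   # class prototypes (seen + unseen)
pred = scorer(x, prototypes).argmin(dim=1)
```

Factorizing M as LL^T keeps the metric positive semi-definite without explicit constraints, a common device in metric learning; how the paper actually parameterizes and regularizes M is specified by its proposed loss.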
Related papers
- SIGMA: Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-Guided Masked Video Modeling (SIGMA) is a novel video pretraining method.
We distribute features of space-time tubes evenly across a limited number of learnable clusters.
Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
arXiv Detail & Related papers (2024-07-22T08:04:09Z) - Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier training in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z) - Revealing the Proximate Long-Tail Distribution in Compositional Zero-Shot Learning [20.837664254230567]
Compositional Zero-Shot Learning (CZSL) aims to transfer knowledge from seen state-object pairs to novel pairs.
Visual bias caused by predictions of state-object combinations blurs their visual features, hindering the learning of distinguishable class prototypes.
We mathematically deduce the role of the class prior within the long-tailed distribution in CZSL.
Building upon this insight, we incorporate visual bias caused by compositions into the classifier's training and inference by estimating it as a proximate class prior.
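One natural reading of "estimating it as a proximate class prior" is a logit-adjustment-style correction, where classifier scores are offset by the log of an estimated composition prior. The sketch below is our illustrative assumption, not the paper's exact formulation; `prior_adjusted_logits`, the temperature `tau`, and the count-based prior estimate are all hypothetical.

```python
import torch

def prior_adjusted_logits(logits: torch.Tensor,
                          class_counts: torch.Tensor,
                          tau: float = 1.0) -> torch.Tensor:
    """Subtract a scaled log class prior from raw logits so that
    frequent (head) compositions are not favored at inference.
    A generic logit-adjustment sketch, not the paper's exact method."""
    prior = class_counts.float() / class_counts.sum()
    return logits - tau * torch.log(prior + 1e-12)

logits = torch.randn(4, 10)              # scores over 10 state-object compositions
counts = torch.randint(1, 1000, (10,))   # estimated composition frequencies
pred = prior_adjusted_logits(logits, counts).argmax(dim=1)
```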
arXiv Detail & Related papers (2023-12-26T07:35:02Z) - Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases [13.837927115198308]
We propose an adversarial training-inspired two-stage debiasing model using Contrastive learning and Continuous Prompt Augmentation.
Our approach guides the model to achieve stronger debiasing performance by adding difficulty to the training process.
arXiv Detail & Related papers (2023-07-04T09:35:03Z) - Federated Zero-Shot Learning for Visual Recognition [55.65879596326147]
We propose a novel Federated Zero-Shot Learning (FedZSL) framework.
FedZSL learns a central model from the decentralized data residing on edge devices.
The effectiveness and robustness of FedZSL are demonstrated by extensive experiments conducted on three zero-shot benchmark datasets.
arXiv Detail & Related papers (2022-09-05T14:49:34Z) - Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian [8.597165738132617]
We propose an Anisotropic Spherical Gaussian (ASG)-based label distribution learning (LDL) approach for facial pose estimation.
In particular, our approach adopts a spherical Gaussian distribution on the unit sphere, which consistently yields an unbiased expectation.
Our method sets new state-of-the-art records on AFLW2000 and BIWI datasets.
arXiv Detail & Related papers (2022-08-19T02:12:36Z) - A Gating Model for Bias Calibration in Generalized Zero-shot Learning [18.32369721322249]
Generalized zero-shot learning (GZSL) aims at training a model that can generalize to unseen class data by only using auxiliary information.
One of the main challenges in GZSL is a biased model prediction toward seen classes caused by overfitting on only available seen class data during training.
We propose a two-stream autoencoder-based gating model for GZSL.
arXiv Detail & Related papers (2022-03-08T16:41:06Z) - Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, the two contrastive losses successfully constrain the clustering results of mini-batch samples at both the sample and class levels.
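The construction described above can be sketched by contrasting rows (per-sample class distributions) and columns (per-class sample distributions) of the predicted probability matrices for two augmented views. The InfoNCE-style sketch below rests on our own assumptions; the temperature, L2 normalization, and function names are illustrative rather than DCDC's exact loss.

```python
import torch
import torch.nn.functional as F

def info_nce(a: torch.Tensor, b: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Contrast matching rows of a and b against all non-matching rows."""
    a, b = F.normalize(a, dim=1), F.normalize(b, dim=1)
    logits = a @ b.t() / tau               # (n, n) pairwise similarities
    targets = torch.arange(a.size(0))      # row i of a matches row i of b
    return F.cross_entropy(logits, targets)

def doubly_contrastive_loss(p1: torch.Tensor, p2: torch.Tensor) -> torch.Tensor:
    """p1, p2: (batch, classes) softmax outputs for two augmented views.
    Sample view contrasts rows; class view contrasts columns."""
    sample_loss = info_nce(p1, p2)         # per-sample class distributions
    class_loss = info_nce(p1.t(), p2.t())  # per-class sample distributions
    return sample_loss + class_loss

p1 = torch.softmax(torch.randn(64, 10), dim=1)  # predictions for view 1
p2 = torch.softmax(torch.randn(64, 10), dim=1)  # predictions for view 2
loss = doubly_contrastive_loss(p1, p2)
```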
arXiv Detail & Related papers (2021-03-09T15:15:32Z) - Entropy-Based Uncertainty Calibration for Generalized Zero-Shot Learning [49.04790688256481]
The goal of generalized zero-shot learning (GZSL) is to recognise both seen and unseen classes.
Most GZSL methods typically learn to synthesise visual representations from semantic information on the unseen classes.
We propose a novel framework that leverages dual variational autoencoders with a triplet loss to learn discriminative latent features.
arXiv Detail & Related papers (2021-01-09T05:21:27Z) - Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning [59.58381904522967]
We propose a novel embedding based generative model with a tight visual-semantic coupling constraint.
We learn a unified latent space that calibrates the embedded parametric distributions of both visual and semantic spaces.
Our method can be easily extended to transductive ZSL setting by generating labels for unseen images.
arXiv Detail & Related papers (2020-09-16T03:54:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.