Revealing the Proximate Long-Tail Distribution in Compositional
Zero-Shot Learning
- URL: http://arxiv.org/abs/2312.15923v1
- Date: Tue, 26 Dec 2023 07:35:02 GMT
- Title: Revealing the Proximate Long-Tail Distribution in Compositional
Zero-Shot Learning
- Authors: Chenyi Jiang, Haofeng Zhang
- Abstract summary: Compositional Zero-Shot Learning (CZSL) aims to transfer knowledge from seen state-object pairs to novel pairs.
Visual bias caused by predictions of combinations of state-objects blurs their visual features hindering learning of distinguishable class prototypes.
We mathematically deduce the role of class prior within the long-tailed distribution in CZSL.
Building upon this insight, we incorporate visual bias caused by compositions into the classifier's training and inference by estimating it as a proximate class prior.
- Score: 20.837664254230567
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compositional Zero-Shot Learning (CZSL) aims to transfer knowledge from seen
state-object pairs to novel unseen pairs. In this process, visual bias caused
by the diverse interrelationship of state-object combinations blurs their
visual features, hindering the learning of distinguishable class prototypes.
Prevailing methods concentrate on disentangling states and objects directly
from visual features, disregarding potential enhancements that could arise from
a data viewpoint. Experimentally, we unveil the results caused by the above
problem closely approximate the long-tailed distribution. As a solution, we
transform CZSL into a proximate class imbalance problem. We mathematically
deduce the role of class prior within the long-tailed distribution in CZSL.
Building upon this insight, we incorporate visual bias caused by compositions
into the classifier's training and inference by estimating it as a proximate
class prior. This enhancement encourages the classifier to acquire more
discernible class prototypes for each composition, thereby achieving more
balanced predictions. Experimental results demonstrate that our approach
elevates the model's performance to the state-of-the-art level, without
introducing additional parameters. Our code is available at
\url{https://github.com/LanchJL/ProLT-CZSL}.
Related papers
- Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning [23.757252768668497]
Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known attribute-object pairs.
The primary challenge in CZSL tasks lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object.
We propose a model-agnostic and Primitive-Based Adversarial training (PBadv) method to deal with this problem.
arXiv Detail & Related papers (2024-06-21T08:18:30Z) - Hierarchical Visual Primitive Experts for Compositional Zero-Shot
Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize compositions with prior knowledge of known primitives (attribute and object)
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
arXiv Detail & Related papers (2023-08-08T03:24:21Z) - Prompting Language-Informed Distribution for Compositional Zero-Shot Learning [73.49852821602057]
Compositional zero-shot learning (CZSL) task aims to recognize unseen compositional visual concepts.
We propose a model by prompting the language-informed distribution, aka., PLID, for the task.
Experimental results on MIT-States, UT-Zappos, and C-GQA datasets show the superior performance of the PLID to the prior arts.
arXiv Detail & Related papers (2023-05-23T18:00:22Z) - Mutual Balancing in State-Object Components for Compositional Zero-Shot
Learning [0.0]
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen compositions from seen states and objects.
We propose a novel method called MUtual balancing in STate-object components (MUST) for CZSL, which provides a balancing inductive bias for the model.
Our approach significantly outperforms the state-of-the-art on MIT-States, UT-Zappos, and C-GQA when combined with the basic CZSL frameworks.
arXiv Detail & Related papers (2022-11-19T10:21:22Z) - ProCC: Progressive Cross-primitive Compatibility for Open-World
Compositional Zero-Shot Learning [29.591615811894265]
Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space.
We propose a novel method, termed Progressive Cross-primitive Compatibility (ProCC), to mimic the human learning process for OW-CZSL tasks.
arXiv Detail & Related papers (2022-11-19T10:09:46Z) - Constructing Balance from Imbalance for Long-tailed Image Recognition [50.6210415377178]
The imbalance between majority (head) classes and minority (tail) classes severely skews the data-driven deep neural networks.
Previous methods tackle with data imbalance from the viewpoints of data distribution, feature space, and model design.
We propose a concise paradigm by progressively adjusting label space and dividing the head classes and tail classes.
Our proposed model also provides a feature evaluation method and paves the way for long-tailed feature learning.
arXiv Detail & Related papers (2022-08-04T10:22:24Z) - CoSSL: Co-Learning of Representation and Classifier for Imbalanced
Semi-Supervised Learning [98.89092930354273]
We propose a novel co-learning framework (CoSSL) with decoupled representation learning and classifier learning for imbalanced SSL.
To handle the data imbalance, we devise Tail-class Feature Enhancement (TFE) for classifier learning.
In experiments, we show that our approach outperforms other methods over a large range of shifted distributions.
arXiv Detail & Related papers (2021-12-08T20:13:13Z) - Improving Tail-Class Representation with Centroid Contrastive Learning [145.73991900239017]
We propose interpolative centroid contrastive learning (ICCL) to improve long-tailed representation learning.
ICCL interpolates two images from a class-agnostic sampler and a class-aware sampler, and trains the model such that the representation of the ICCL can be used to retrieve the centroids for both source classes.
Our result shows a significant accuracy gain of 2.8% on the iNaturalist 2018 dataset with a real-world long-tailed distribution.
arXiv Detail & Related papers (2021-10-19T15:24:48Z) - Class-Balanced Distillation for Long-Tailed Visual Recognition [100.10293372607222]
Real-world imagery is often characterized by a significant imbalance of the number of images per class, leading to long-tailed distributions.
In this work, we introduce a new framework, by making the key observation that a feature representation learned with instance sampling is far from optimal in a long-tailed setting.
Our main contribution is a new training method, that leverages knowledge distillation to enhance feature representations.
arXiv Detail & Related papers (2021-04-12T08:21:03Z) - Transductive Zero-Shot Learning by Decoupled Feature Generation [30.664199050468472]
We focus on the transductive setting, in which unlabelled visual data from unseen classes is available.
We propose to decouple tasks of generating realistic visual features and translating semantic attributes into visual cues.
We present a detailed ablation study to dissect the effect of our proposed decoupling approach, while demonstrating its superiority over the related state-of-the-art.
arXiv Detail & Related papers (2021-02-05T16:17:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.