ProCC: Progressive Cross-primitive Compatibility for Open-World
Compositional Zero-Shot Learning
- URL: http://arxiv.org/abs/2211.12417v4
- Date: Fri, 15 Dec 2023 11:50:32 GMT
- Title: ProCC: Progressive Cross-primitive Compatibility for Open-World
Compositional Zero-Shot Learning
- Authors: Fushuo Huo, Wenchao Xu, Song Guo, Jingcai Guo, Haozhao Wang, Ziming
Liu, Xiaocheng Lu
- Abstract summary: Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space.
We propose a novel method, termed Progressive Cross-primitive Compatibility (ProCC), to mimic the human learning process for OW-CZSL tasks.
- Score: 29.591615811894265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel
compositions of state and object primitives in images with no priors on the
compositional space, which induces a tremendously large output space containing
all possible state-object compositions. Existing works either learn the joint
compositional state-object embedding or predict simple primitives with separate
classifiers. However, the former relies heavily on external word embedding
methods, while the latter ignores the interactions between interdependent
primitives. In this paper, we revisit the primitive prediction approach and
propose a novel method, termed Progressive Cross-primitive Compatibility
(ProCC), to mimic the human learning process for OW-CZSL tasks. Specifically,
the cross-primitive compatibility module explicitly learns to model the
interactions of state and object features with the trainable memory units,
which efficiently acquires cross-primitive visual attention to reason about
high-feasibility compositions, without the aid of external knowledge. Moreover,
considering the partial-supervision setting (pCZSL) as well as the imbalance
issue of multiple task prediction, we design a progressive training paradigm to
enable the primitive classifiers to interact to obtain discriminative
information in an easy-to-hard manner. Extensive experiments on three widely
used benchmark datasets demonstrate that our method outperforms other
representative methods on both OW-CZSL and pCZSL settings by large margins.
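The open-world output space and the separate-classifier baseline mentioned in the abstract can be illustrated with a small sketch. This is not ProCC itself: the toy vocabulary, the probability values, and the choice to score a pair by the product of independent primitive probabilities are all assumptions made for illustration.

```python
from itertools import product


def open_world_space(states, objects):
    """In the open-world setting, every state-object pair is a candidate label."""
    return list(product(states, objects))


def predict_composition(state_probs, object_probs):
    """Score each pair by the product of independent primitive probabilities
    (an illustrative assumption) and return the best pair plus all scores."""
    scores = {
        (s, o): ps * po
        for (s, ps), (o, po) in product(state_probs.items(), object_probs.items())
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy vocabulary and classifier outputs.
states = ["wet", "dry", "sliced"]
objects = ["dog", "apple"]
space = open_world_space(states, objects)

state_probs = {"wet": 0.2, "dry": 0.1, "sliced": 0.7}
object_probs = {"dog": 0.4, "apple": 0.6}
best, scores = predict_composition(state_probs, object_probs)
print(len(space), best)
```

With independent scoring, an implausible pair such as ("sliced", "dog") still receives a score of about 0.28, because nothing links the state prediction to the object prediction. This is the kind of missing interaction between primitives that the abstract identifies in the separate-classifier approach.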
Related papers
- Attention Based Simple Primitives for Open World Compositional Zero-Shot Learning [12.558701595138928]
Compositional Zero-Shot Learning (CZSL) aims to predict unknown compositions made up of attribute and object pairs.
We are exploring Open World Compositional Zero-Shot Learning (OW-CZSL) in this study, where our test space encompasses all potential combinations of attributes and objects.
Our approach involves utilizing the self-attention mechanism between attributes and objects to achieve better generalization from seen to unseen compositions.
arXiv Detail & Related papers (2024-07-18T17:11:29Z)
- Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning [23.757252768668497]
Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known attribute-object pairs.
The primary challenge in CZSL tasks lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object.
We propose a model-agnostic and Primitive-Based Adversarial training (PBadv) method to deal with this problem.
arXiv Detail & Related papers (2024-06-21T08:18:30Z)
- Supervised Stochastic Neighbor Embedding Using Contrastive Learning [4.560284382063488]
Clusters of samples belonging to the same class are pulled together in low-dimensional embedding space.
We extend the self-supervised contrastive approach to the fully-supervised setting, allowing us to effectively leverage label information.
arXiv Detail & Related papers (2023-09-15T00:26:21Z)
- Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize compositions with prior knowledge of known primitives (attribute and object).
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
arXiv Detail & Related papers (2023-08-08T03:24:21Z)
- Distilled Reverse Attention Network for Open-world Compositional Zero-Shot Learning [42.138756191997295]
Open-World Compositional Zero-Shot Learning (OW-CZSL) aims to recognize new compositions of seen attributes and objects.
OW-CZSL methods built on the conventional closed-world setting degrade severely due to the unconstrained OW test space.
We propose a novel Distilled Reverse Attention Network to address the challenges.
arXiv Detail & Related papers (2023-03-01T10:52:20Z)
- Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning [86.5258816031722]
The task of Compositional Zero-Shot Learning (CZSL) is to recognize images of novel state-object compositions that are absent during the training stage.
Previous methods of learning compositional embedding have shown effectiveness in closed-world CZSL.
In Open-World CZSL (OW-CZSL), their performance tends to degrade significantly due to the large cardinality of possible compositions.
arXiv Detail & Related papers (2022-11-05T12:57:06Z)
- Siamese Contrastive Embedding Network for Compositional Zero-Shot Learning [76.13542095170911]
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen compositions formed from states and objects seen during training.
We propose a novel Siamese Contrastive Embedding Network (SCEN) for unseen composition recognition.
Our method significantly outperforms the state-of-the-art approaches on three challenging benchmark datasets.
arXiv Detail & Related papers (2022-06-29T09:02:35Z)
- KG-SP: Knowledge Guided Simple Primitives for Open World Compositional Zero-Shot Learning [52.422873819371276]
The goal of open-world compositional zero-shot learning (OW-CZSL) is to recognize compositions of states and objects in images.
Here, we revisit a simple CZSL baseline and predict the primitives, i.e. states and objects, independently.
We estimate the feasibility of each composition through external knowledge, using this prior to remove unfeasible compositions from the output space.
Our model, Knowledge-Guided Simple Primitives (KG-SP), achieves state of the art in both OW-CZSL and pCZSL.
arXiv Detail & Related papers (2022-05-13T17:18:15Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Pairwise Similarity Knowledge Transfer for Weakly Supervised Object Localization [53.99850033746663]
We study the problem of learning a localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of a pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.