Learning Primitive Relations for Compositional Zero-Shot Learning
- URL: http://arxiv.org/abs/2501.14308v1
- Date: Fri, 24 Jan 2025 08:10:05 GMT
- Title: Learning Primitive Relations for Compositional Zero-Shot Learning
- Authors: Insu Lee, Jiseob Kim, Kyuhong Shim, Byonghyo Shim
- Abstract summary: We propose a novel framework, learning primitive relations (LPR), designed to probabilistically capture the relationships between states and objects. LPR considers the dependencies between states and objects, enabling the model to infer the likelihood of unseen compositions.
- Score: 26.35330980336384
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Compositional Zero-Shot Learning (CZSL) aims to identify unseen state-object compositions by leveraging knowledge learned from seen compositions. Existing approaches often independently predict states and objects, overlooking their relationships. In this paper, we propose a novel framework, learning primitive relations (LPR), designed to probabilistically capture the relationships between states and objects. By employing the cross-attention mechanism, LPR considers the dependencies between states and objects, enabling the model to infer the likelihood of unseen compositions. Experimental results demonstrate that LPR outperforms state-of-the-art methods on all three CZSL benchmark datasets in both closed-world and open-world settings. Through qualitative analysis, we show that LPR leverages state-object relationships for unseen composition prediction.
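The abstract describes the mechanism only at a high level, so the snippet below is a minimal PyTorch sketch of how cross-attention between state and object embeddings could yield pairwise composition scores; the module structure, dimensions, and scoring head are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' code): cross-attention lets each
# state embedding attend to all object embeddings, so the score for a composition
# (s, o) reflects their dependency rather than two independent predictions.
import torch
import torch.nn as nn


class CrossPrimitiveScorer(nn.Module):
    def __init__(self, num_states, num_objects, dim=512):
        super().__init__()
        self.state_emb = nn.Embedding(num_states, dim)    # one embedding per state
        self.object_emb = nn.Embedding(num_objects, dim)  # one embedding per object
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.img_proj = nn.Linear(dim, dim)

    def forward(self, img_feat):
        # img_feat: (B, dim) image features from any visual backbone
        states = self.state_emb.weight.unsqueeze(0)    # (1, S, dim)
        objects = self.object_emb.weight.unsqueeze(0)  # (1, O, dim)
        # States query the objects: each state representation is refined by
        # the objects it tends to co-occur with.
        states_ctx, _ = self.cross_attn(states, objects, objects)  # (1, S, dim)
        img = self.img_proj(img_feat)                               # (B, dim)
        # Composition score = image-state affinity + image-object affinity,
        # where the state side already carries object context.
        s_logits = img @ states_ctx.squeeze(0).T                    # (B, S)
        o_logits = img @ objects.squeeze(0).T                       # (B, O)
        return s_logits.unsqueeze(2) + o_logits.unsqueeze(1)        # (B, S, O)


# Usage: scores over every state-object pair, including unseen compositions.
scorer = CrossPrimitiveScorer(num_states=115, num_objects=245)
pair_scores = scorer(torch.randn(4, 512))  # (4, 115, 245)
```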
Related papers
- Learning by Imagining: Debiased Feature Augmentation for Compositional Zero-Shot Learning [23.380192229142924]
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen attribute-object compositions by learning prior knowledge of seen primitives. We propose a novel approach called Debiased Feature Augmentation (DeFA) to address these challenges.
arXiv Detail & Related papers (2025-09-16T06:05:31Z)
- Feasibility with Language Models for Open-World Compositional Zero-Shot Learning [96.6544564242316]
In Open-World Compositional Zero-Shot Learning, all possible state-object combinations are considered as unseen classes. Our work focuses on using external auxiliary knowledge to determine the feasibility of state-object combinations.
arXiv Detail & Related papers (2025-05-16T12:37:08Z)
- Unified Framework for Open-World Compositional Zero-shot Learning [39.521304311470146]
Open-World Compositional Zero-Shot Learning (OW-CZSL) addresses the challenge of recognizing novel compositions of known primitives and entities. We introduce a novel module aimed at alleviating the computational burden associated with exhaustive exploration of all possible compositions during the inference stage. Our proposed model achieves state-of-the-art results in OW-CZSL on three datasets, while surpassing large vision-language models (LVLMs) on two datasets.
arXiv Detail & Related papers (2024-12-05T11:36:37Z)
- Attention Based Simple Primitives for Open World Compositional Zero-Shot Learning [12.558701595138928]
Compositional Zero-Shot Learning (CZSL) aims to predict unknown compositions made up of attribute and object pairs.
In this study, we explore Open-World Compositional Zero-Shot Learning (OW-CZSL), where the test space encompasses all potential combinations of attributes and objects.
Our approach uses self-attention between attributes and objects to achieve better generalization from seen to unseen compositions.
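As a rough illustration of the idea described above, the following hypothetical PyTorch snippet applies self-attention jointly over an attribute token and an object token so that each primitive's representation is conditioned on the other; the shapes and layer choices are assumptions, not the paper's code.

```python
# Minimal sketch (assumed, not the paper's implementation): stack the attribute
# and object tokens and run self-attention so each is contextualized by the other.
import torch
import torch.nn as nn


class AttrObjSelfAttention(nn.Module):
    def __init__(self, dim=300, heads=6):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, attr_tok, obj_tok):
        # attr_tok, obj_tok: (B, dim) per-image attribute / object features
        x = torch.stack([attr_tok, obj_tok], dim=1)  # (B, 2, dim)
        ctx, _ = self.attn(x, x, x)                  # joint attention over both tokens
        x = self.norm(x + ctx)                       # residual + norm
        return x[:, 0], x[:, 1]                      # contextualized attribute, object


attr, obj = AttrObjSelfAttention()(torch.randn(4, 300), torch.randn(4, 300))
```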
arXiv Detail & Related papers (2024-07-18T17:11:29Z)
- Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning [23.757252768668497]
Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known attribute-object pairs.
The primary challenge in CZSL tasks lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object.
We propose a model-agnostic, Primitive-Based Adversarial training (PBadv) method to address this problem.
arXiv Detail & Related papers (2024-06-21T08:18:30Z)
- Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition [61.956088652094515]
Vision and language models (VLMs) have showcased remarkable zero-shot recognition abilities.
But they face challenges in visio-linguistic compositionality, particularly in linguistic comprehension and fine-grained image-text alignment.
This paper explores the intricate relationship between compositionality and recognition.
arXiv Detail & Related papers (2024-06-13T17:58:39Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning [15.406125901927004]
We propose a novel framework termed Decomposed Fusion with Soft Prompt (DFSP), which leverages vision-language models (VLMs) for unseen composition recognition.
Specifically, DFSP constructs a vector combination of learnable soft prompts with state and object to establish their joint representation.
In addition, a cross-modal fusion module is designed between the language and image branches, which decomposes state and object among language features instead of image features.
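A minimal sketch of the soft-prompt construction described above, under assumed shapes and names (this is not the DFSP implementation): a learnable prefix is concatenated with state and object word vectors to form the composition prompt that would later be fed to a text encoder.

```python
# Hedged sketch (assumptions throughout): build a composition prompt from a
# learnable prefix plus state and object token embeddings.
import torch
import torch.nn as nn


class SoftCompositionPrompt(nn.Module):
    def __init__(self, num_states, num_objects, dim=512, prefix_len=3):
        super().__init__()
        self.prefix = nn.Parameter(torch.randn(prefix_len, dim) * 0.02)  # learnable "a photo of"-style tokens
        self.state_emb = nn.Embedding(num_states, dim)
        self.object_emb = nn.Embedding(num_objects, dim)

    def forward(self, state_ids, object_ids):
        # state_ids, object_ids: (N,) indices of the compositions to encode
        s = self.state_emb(state_ids)                                  # (N, dim)
        o = self.object_emb(object_ids)                                # (N, dim)
        prefix = self.prefix.unsqueeze(0).expand(s.size(0), -1, -1)    # (N, P, dim)
        # Prompt tokens: [learnable prefix, state token, object token]
        return torch.cat([prefix, s.unsqueeze(1), o.unsqueeze(1)], dim=1)  # (N, P+2, dim)
```

In a full system, the resulting token sequence would be passed to a (typically frozen) text encoder such as CLIP's, and the text features decomposed back into state and object parts for the cross-modal fusion the abstract mentions.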
arXiv Detail & Related papers (2022-11-19T12:29:12Z)
- ProCC: Progressive Cross-primitive Compatibility for Open-World Compositional Zero-Shot Learning [29.591615811894265]
Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space.
We propose a novel method, termed Progressive Cross-primitive Compatibility (ProCC), to mimic the human learning process for OW-CZSL tasks.
arXiv Detail & Related papers (2022-11-19T10:09:46Z)
- Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning [86.5258816031722]
The task of Compositional Zero-Shot Learning (CZSL) is to recognize images of novel state-object compositions that are absent during the training stage.
Previous methods that learn compositional embeddings have shown effectiveness in closed-world CZSL.
In Open-World CZSL (OW-CZSL), their performance tends to degrade significantly due to the large cardinality of possible compositions.
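To make the cardinality argument concrete, here is a back-of-the-envelope count using the commonly reported MIT-States primitive counts (roughly 115 states and 245 objects); treat the numbers as illustrative, not as the paper's statistics.

```python
# Rough illustration: in the open world, the label space is the full Cartesian
# product of primitives, most of which are never seen (or even feasible).
num_states, num_objects = 115, 245
open_world_compositions = num_states * num_objects
print(open_world_compositions)  # 28175 candidate state-object pairs
```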
arXiv Detail & Related papers (2022-11-05T12:57:06Z)
- Learning Attention Propagation for Compositional Zero-Shot Learning [71.55375561183523]
We propose a novel method called Compositional Attention Propagated Embedding (CAPE).
CAPE learns to identify this structure and propagates knowledge between them to learn class embedding for all seen and unseen compositions.
We show that our method outperforms previous baselines to set a new state-of-the-art on three publicly available benchmarks.
arXiv Detail & Related papers (2022-10-20T19:44:11Z)
- Siamese Contrastive Embedding Network for Compositional Zero-Shot Learning [76.13542095170911]
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen compositions formed from states and objects seen during training.
We propose a novel Siamese Contrastive Embedding Network (SCEN) for unseen composition recognition.
Our method significantly outperforms the state-of-the-art approaches on three challenging benchmark datasets.
arXiv Detail & Related papers (2022-06-29T09:02:35Z)
- KG-SP: Knowledge Guided Simple Primitives for Open World Compositional Zero-Shot Learning [52.422873819371276]
The goal of open-world compositional zero-shot learning (OW-CZSL) is to recognize compositions of states and objects in images.
Here, we revisit a simple CZSL baseline and predict the primitives, i.e. states and objects, independently.
We estimate the feasibility of each composition through external knowledge, using this prior to remove unfeasible compositions from the output space.
Our model, Knowledge-Guided Simple Primitives (KG-SP), achieves state-of-the-art results in both OW-CZSL and pCZSL; a minimal sketch of this predict-then-filter pipeline follows this entry.
arXiv Detail & Related papers (2022-05-13T17:18:15Z)
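The KG-SP pipeline described in the entry above (independent primitive prediction plus a feasibility prior) can be sketched as follows; this is an assumed, simplified rendering in PyTorch, not the released code, and the mask construction is hypothetical.

```python
# Minimal sketch (assumptions, not the official KG-SP code): predict states and
# objects independently, combine their probabilities, and zero out compositions
# judged unfeasible by an external-knowledge prior.
import torch
import torch.nn as nn


class IndependentPrimitivePredictor(nn.Module):
    def __init__(self, feat_dim, num_states, num_objects):
        super().__init__()
        self.state_head = nn.Linear(feat_dim, num_states)    # state classifier
        self.object_head = nn.Linear(feat_dim, num_objects)  # object classifier

    def forward(self, feats, feasibility_mask):
        # feats: (B, feat_dim) image features; feasibility_mask: (S, O) in {0, 1},
        # e.g. derived from word-embedding similarity or a knowledge base.
        p_state = self.state_head(feats).softmax(dim=-1)      # (B, S)
        p_object = self.object_head(feats).softmax(dim=-1)    # (B, O)
        joint = p_state.unsqueeze(2) * p_object.unsqueeze(1)  # (B, S, O), independence assumption
        return joint * feasibility_mask                       # unfeasible pairs get zero score


# Usage with a hypothetical mask keeping ~10% of pairs as feasible.
model = IndependentPrimitivePredictor(feat_dim=512, num_states=115, num_objects=245)
mask = (torch.rand(115, 245) < 0.1).float()
scores = model(torch.randn(2, 512), mask)  # (2, 115, 245)
pred = scores.flatten(1).argmax(dim=1)     # best feasible composition per image
```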
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.