Learning Attention Propagation for Compositional Zero-Shot Learning
- URL: http://arxiv.org/abs/2210.11557v1
- Date: Thu, 20 Oct 2022 19:44:11 GMT
- Title: Learning Attention Propagation for Compositional Zero-Shot Learning
- Authors: Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc Van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal
- Abstract summary: We propose a novel method called Compositional Attention Propagated Embedding (CAPE).
CAPE learns to identify the dependency structure between compositions and propagates knowledge across it to learn class embeddings for all seen and unseen compositions.
We show that our method outperforms previous baselines to set a new state-of-the-art on three publicly available benchmarks.
- Score: 71.55375561183523
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Compositional zero-shot learning aims to recognize unseen compositions of
seen visual primitives of object classes and their states. While all primitives
(states and objects) are observable during training in some combination, their
complex interaction makes this task especially hard. For example, wet changes
the visual appearance of a dog very differently from a bicycle. Furthermore, we
argue that relationships between compositions go beyond shared states or
objects. A cluttered office can contain a busy table; even though these
compositions do not share a state or object, the presence of a busy table can
guide the recognition of a cluttered office. We propose a novel method called
Compositional Attention Propagated Embedding (CAPE) as a solution. The key
intuition behind our method is that a rich dependency structure exists between
compositions, arising from complex interactions of primitives, in addition to
other dependencies between compositions. CAPE learns to identify this structure
and propagates knowledge between compositions to learn class embeddings for all
seen and unseen compositions. In the challenging generalized compositional zero-shot
setting, we show that our method outperforms previous baselines to set a new
state-of-the-art on three publicly available benchmarks.
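The listing carries no code. As a rough illustration of the abstract's core mechanism (attention-driven knowledge propagation between composition embeddings), here is a minimal PyTorch sketch; the module, shapes, and plain self-attention layer are assumptions for illustration, not the authors' implementation:

```python
import torch
import torch.nn as nn

class AttentionPropagation(nn.Module):
    """Minimal stand-in for CAPE's idea: propagate knowledge between
    composition embeddings with self-attention, so unseen compositions
    can borrow information from related seen ones."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, comp_emb: torch.Tensor) -> torch.Tensor:
        # comp_emb: (1, num_compositions, dim), one embedding per
        # state-object pair; the attention weights act as a learned
        # dependency structure over compositions.
        out, _ = self.attn(comp_emb, comp_emb, comp_emb)
        return self.norm(comp_emb + out)

# Toy usage: refine embeddings for every state-object pair, then classify
# an image feature by cosine similarity against the refined embeddings.
num_states, num_objects, dim = 10, 20, 256
pair_emb = torch.randn(1, num_states * num_objects, dim)
refined = AttentionPropagation(dim)(pair_emb)               # (1, S*O, dim)
img_feat = torch.randn(1, dim)                              # e.g. a CNN feature
scores = torch.cosine_similarity(img_feat.unsqueeze(1), refined, dim=-1)
prediction = scores.argmax(dim=-1)                          # best-scoring composition
```

In CAPE itself the dependency structure is learned jointly with the class embeddings; the single self-attention layer above only stands in for that mechanism.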
Related papers
- Cross-composition Feature Disentanglement for Compositional Zero-shot Learning [49.919635694894204]
Disentanglement of visual features of primitives (i.e., attributes and objects) has shown exceptional results in Compositional Zero-shot Learning (CZSL).
We propose the solution of cross-composition feature disentanglement, which takes multiple primitive-sharing compositions as inputs and constrains the disentangled primitive features to be general across these compositions.
arXiv Detail & Related papers (2024-08-19T08:23:09Z)
- Im-Promptu: In-Context Composition from Image Prompts [10.079743487034762]
We investigate whether analogical reasoning can enable in-context composition over composable elements of visual stimuli.
We use Im-Promptu to train agents with different levels of compositionality, including vector representations, patch representations, and object slots.
Our experiments reveal tradeoffs between extrapolation abilities and the degree of compositionality, with non-compositional representations extending learned composition rules to unseen domains but performing poorly on tasks.
arXiv Detail & Related papers (2023-05-26T21:10:11Z)
- Siamese Contrastive Embedding Network for Compositional Zero-Shot Learning [76.13542095170911]
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen compositions formed from states and objects seen during training.
We propose a novel Siamese Contrastive Embedding Network (SCEN) for unseen composition recognition.
Our method significantly outperforms the state-of-the-art approaches on three challenging benchmark datasets.
arXiv Detail & Related papers (2022-06-29T09:02:35Z)
- Learning to Compose Visual Relations [100.45138490076866]
We propose to represent each relation as an unnormalized density (an energy-based model); a toy sketch of composing such energies appears after this list.
We show that such a factorized decomposition allows the model to both generate and edit scenes with multiple sets of relations more faithfully.
arXiv Detail & Related papers (2021-11-17T18:51:29Z)
- Constellation: Learning relational abstractions over objects for compositional imagination [64.99658940906917]
We introduce Constellation, a network that learns relational abstractions of static visual scenes.
This work is a first step in the explicit representation of visual relationships and using them for complex cognitive procedures.
arXiv Detail & Related papers (2021-07-23T11:59:40Z)
- Learning Graph Embeddings for Compositional Zero-shot Learning [73.80007492964951]
In compositional zero-shot learning, the goal is to recognize unseen compositions of observed visual primitives (states and objects).
We propose a novel graph formulation called Compositional Graph Embedding (CGE) that learns image features and latent representations of visual primitives in an end-to-end manner.
By learning a joint compatibility that encodes semantics between concepts, our model allows for generalization to unseen compositions without relying on an external knowledge base like WordNet.
arXiv Detail & Related papers (2021-02-03T10:11:03Z)
- A causal view of compositional zero-shot recognition [42.63916938252048]
People easily recognize visual categories that are new combinations of known components.
This compositional generalization capacity is critical for learning in real-world domains like vision and language.
Here we describe an approach for compositional generalization that builds on causal ideas.
arXiv Detail & Related papers (2020-06-25T17:51:22Z)
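As promised above, here is a toy sketch of the energy-summing idea from "Learning to Compose Visual Relations": each relation is scored by an energy function, the composed scene has low total energy, and samples are drawn with Langevin dynamics. The stand-in relation energies, shapes, and hyperparameters are hypothetical illustrations, not that paper's code:

```python
import torch

def combined_energy(x, energy_fns):
    # Factorized composition: the density of a scene under several
    # relations is proportional to exp(-sum of per-relation energies).
    return sum(E(x) for E in energy_fns)

def langevin_sample(energy_fns, shape, steps=60, step_size=0.01, noise=0.005):
    # Approximate sampling from the composed density by noisy gradient
    # descent on the summed energy (unadjusted Langevin dynamics).
    x = torch.randn(shape, requires_grad=True)
    for _ in range(steps):
        energy = combined_energy(x, energy_fns).sum()
        grad, = torch.autograd.grad(energy, x)
        x = (x - step_size * grad + noise * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()

# Stand-in "relation" energies over 2-D object positions (hypothetical):
left_of = lambda x: (x[:, 0] + 1.0) ** 2   # low energy when x[:, 0] is near -1
above   = lambda x: (x[:, 1] - 1.0) ** 2   # low energy when x[:, 1] is near +1

samples = langevin_sample([left_of, above], shape=(8, 2))
# samples cluster around (-1, +1): the region satisfying both relations.
```

Because the energies simply add, relations can be recombined at test time without retraining, which is what the entry means by a factorized decomposition enabling more faithful generation and editing.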
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.