A causal view of compositional zero-shot recognition
- URL: http://arxiv.org/abs/2006.14610v2
- Date: Sun, 1 Nov 2020 17:26:29 GMT
- Title: A causal view of compositional zero-shot recognition
- Authors: Yuval Atzmon, Felix Kreuk, Uri Shalit, Gal Chechik
- Abstract summary: People easily recognize new visual categories that are novel combinations of known components.
This compositional generalization capacity is critical for learning in real-world domains like vision and language.
Here we describe an approach for compositional generalization that builds on causal ideas.
- Score: 42.63916938252048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: People easily recognize new visual categories that are novel combinations of known components. This compositional generalization capacity is critical for
learning in real-world domains like vision and language because the long tail
of new combinations dominates the distribution. Unfortunately, learning systems
struggle with compositional generalization because they often build on features
that are correlated with class labels even if they are not "essential" for the
class. This leads to consistent misclassification of samples from a new
distribution, like new combinations of known components.
Here we describe an approach for compositional generalization that builds on
causal ideas. First, we describe compositional zero-shot learning from a causal
perspective, and propose to view zero-shot inference as finding "which
intervention caused the image?". Second, we present a causal-inspired embedding
model that learns disentangled representations of elementary components of
visual objects from correlated (confounded) training data. We evaluate this
approach on two datasets for predicting new combinations of attribute-object
pairs: a well-controlled dataset of synthesized images and a real-world dataset of fine-grained shoe types. We show improvements over strong baselines.
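To make the inference view concrete, here is a minimal sketch of zero-shot recognition as scoring candidate attribute-object "interventions" against an image embedding. The additive composition, the cosine score, and all names are illustrative assumptions, not the paper's learned disentangled model:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def predict_pair(img_emb, attr_embs, obj_embs, compose):
    """Score every (attribute, object) pair as a candidate 'intervention'
    that could have generated the image, and return the best-scoring one.

    img_emb   : (d,) image embedding
    attr_embs : dict attribute name -> (d,) embedding
    obj_embs  : dict object name    -> (d,) embedding
    compose   : function combining an attribute and an object embedding
    """
    best, best_score = None, -np.inf
    for a, ea in attr_embs.items():
        for o, eo in obj_embs.items():
            score = cosine(img_emb, compose(ea, eo))
            if score > best_score:
                best, best_score = (a, o), score
    return best, best_score

# Toy usage with a simple additive composition (an assumption, not the
# paper's model): disentangled attribute and object embeddings are
# combined and matched against the image embedding.
rng = np.random.default_rng(0)
d = 16
attrs = {a: rng.normal(size=d) for a in ["red", "sliced"]}
objs = {o: rng.normal(size=d) for o in ["apple", "shoe"]}
image = attrs["sliced"] + objs["apple"] + 0.1 * rng.normal(size=d)
print(predict_pair(image, attrs, objs, compose=lambda ea, eo: ea + eo))
```

With disentangled embeddings, the argmax over pairs recovers unseen combinations such as "sliced apple" even when that pair never appeared in training.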
Related papers
- DXAI: Explaining Classification by Image Decomposition [4.013156524547072]
We propose a new way to visualize neural network classification through decomposition-based explainable AI (DXAI).
Instead of providing an explanation heatmap, our method yields a decomposition of the image into class-agnostic and class-distinct parts.
arXiv Detail & Related papers (2023-12-30T20:52:20Z)
- Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize unseen compositions with prior knowledge of known primitives (attributes and objects).
We propose a simple and scalable framework called Composition Transformer (CoT) to address this task.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
arXiv Detail & Related papers (2023-08-08T03:24:21Z)
- Learning Attention Propagation for Compositional Zero-Shot Learning [71.55375561183523]
We propose a novel method called Compositional Attention Propagated Embedding (CAPE).
CAPE learns to identify the structure shared between seen and unseen compositions and propagates knowledge between them to learn class embeddings for all seen and unseen compositions.
We show that our method outperforms previous baselines to set a new state-of-the-art on three publicly available benchmarks.
arXiv Detail & Related papers (2022-10-20T19:44:11Z)
- Self-Supervised Visual Representation Learning with Semantic Grouping [50.14703605659837]
We tackle the problem of learning visual representations from unlabeled scene-centric data.
We propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
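The grouping step described here can be illustrated with a short sketch; the soft-assignment formulation and every name below are assumptions for illustration, not SlotCon's exact architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def group_into_slots(features, prototypes, temperature=0.1):
    """Softly assign dense features to slot prototypes and pool per slot.

    features   : (n, d) per-location feature vectors
    prototypes : (k, d) learned slot prototypes
    returns    : (k, d) slot representations, (n, k) soft assignments
    """
    logits = features @ prototypes.T / temperature   # (n, k) similarities
    assign = softmax(logits, axis=1)                 # soft semantic grouping
    slots = (assign.T @ features) / (assign.sum(0)[:, None] + 1e-8)
    return slots, assign

# In SlotCon-style training, slots pooled from two augmented views of the
# same image would serve as positive pairs in a contrastive loss; here we
# only show the grouping step on random features.
rng = np.random.default_rng(1)
feats = rng.normal(size=(64, 32))
protos = rng.normal(size=(8, 32))
slots, assign = group_into_slots(feats, protos)
print(slots.shape, assign.shape)  # (8, 32) (64, 8)
```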
arXiv Detail & Related papers (2022-05-30T17:50:59Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- Independent Prototype Propagation for Zero-Shot Compositionality [1.2676356746752893]
We propose ProtoProp, a novel prototype propagation graph method.
First, we learn prototypical representations of objects that are conditionally independent.
Next, we propagate the independent prototypes through a compositional graph.
We show that we outperform state-of-the-art methods in the generalized compositional zero-shot setting.
arXiv Detail & Related papers (2021-06-01T08:24:09Z)
- Learning Graph Embeddings for Open World Compositional Zero-Shot Learning [47.09665742252187]
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen compositions of state and object visual primitives seen during training.
We propose a new approach, Compositional Cosine Graph Embeddings (Co-CGE).
Co-CGE models the dependency between states, objects and their compositions through a graph convolutional neural network.
arXiv Detail & Related papers (2021-05-03T17:08:21Z)
- Learning Graph Embeddings for Compositional Zero-shot Learning [73.80007492964951]
In compositional zero-shot learning, the goal is to recognize unseen compositions of observed visual primitives (states and objects).
We propose a novel graph formulation called Compositional Graph Embedding (CGE) that learns image features and latent representations of visual primitives in an end-to-end manner.
By learning a joint compatibility that encodes semantics between concepts, our model allows for generalization to unseen compositions without relying on an external knowledge base like WordNet.
arXiv Detail & Related papers (2021-02-03T10:11:03Z)
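As a closing illustration of the graph-based compatibility idea shared by CGE and Co-CGE, here is a minimal sketch: states, objects, and compositions are nodes of a graph, a propagation step lets a composition node mix information from its primitives, and recognition scores image-composition compatibility. The single averaging step and all names are assumptions for illustration, not the papers' architectures:

```python
import numpy as np

def propagate(node_embs, adj):
    """One simple message-passing step: average each node with its
    neighbors (row-normalized adjacency), a crude stand-in for a GCN layer."""
    deg = adj.sum(1, keepdims=True) + 1e-8
    return (adj @ node_embs) / deg

def compatibility(img_emb, comp_emb):
    """Cosine compatibility between an image and a composition node."""
    return float(img_emb @ comp_emb /
                 (np.linalg.norm(img_emb) * np.linalg.norm(comp_emb) + 1e-8))

# Tiny graph: nodes 0-1 are states, 2-3 are objects, node 4 is the
# composition (state 0, object 2). Edges connect the composition to its
# primitives, letting unseen compositions inherit information from seen
# neighbors through propagation.
rng = np.random.default_rng(2)
embs = rng.normal(size=(5, 8))
adj = np.zeros((5, 5))
for i, j in [(4, 0), (4, 2)]:
    adj[i, j] = adj[j, i] = 1.0
adj += np.eye(5)                      # self-loops keep each node's own embedding
embs = propagate(embs, adj)           # composition node mixes its primitives
img = rng.normal(size=8)
print(compatibility(img, embs[4]))
```

At inference, the image would be scored against every composition node, including unseen ones whose embeddings were shaped entirely by propagation from their primitives.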
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.