COMPAS: Representation Learning with Compositional Part Sharing for
Few-Shot Classification
- URL: http://arxiv.org/abs/2101.11878v1
- Date: Thu, 28 Jan 2021 09:16:21 GMT
- Title: COMPAS: Representation Learning with Compositional Part Sharing for
Few-Shot Classification
- Authors: Ju He, Adam Kortylewski, Alan Yuille
- Abstract summary: Few-shot image classification consists of two consecutive learning processes.
Inspired by the compositional representation of objects in humans, we train a neural network architecture that explicitly represents objects as a set of parts.
We demonstrate the value of our compositional learning framework for few-shot classification on miniImageNet, tieredImageNet, CIFAR-FS, and FC100.
- Score: 10.718573053194742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot image classification consists of two consecutive learning processes:
1) In the meta-learning stage, the model acquires a knowledge base from a set
of training classes. 2) During meta-testing, the acquired knowledge is used to
recognize unseen classes from very few examples. Inspired by the compositional
representation of objects in humans, we train a neural network architecture
that explicitly represents objects as a set of parts and their spatial
composition. In particular, during meta-learning, we train a knowledge base
that consists of a dictionary of part representations and a dictionary of part
activation maps that encode frequent spatial activation patterns of parts. The
elements of both dictionaries are shared among the training classes. During
meta-testing, the representation of unseen classes is learned using the part
representations and the part activation maps from the knowledge base. Finally,
an attention mechanism is used to strengthen those parts that are most
important for each category. We demonstrate the value of our compositional
learning framework for few-shot classification using miniImageNet,
tieredImageNet, CIFAR-FS, and FC100, where we achieve state-of-the-art
performance.
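To make the pipeline concrete, the sketch below shows how a shared dictionary of part representations, a dictionary of part activation maps, and per-part attention could interact at meta-test time. It is a minimal illustration, not the authors' released implementation: the module name, tensor shapes, cosine-similarity matching, and pooling are all assumptions.
```python
# Illustrative sketch of a compositional knowledge base (shapes and
# pooling choices are assumptions, not the authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CompositionalKnowledgeBase(nn.Module):
    def __init__(self, feat_dim=64, num_parts=64, num_maps=128, fmap_size=5):
        super().__init__()
        # Dictionary of part representations, shared across training classes.
        self.parts = nn.Parameter(torch.randn(num_parts, feat_dim))
        # Dictionary of part activation maps: frequent spatial patterns
        # of part activity (one spatial map per part channel).
        self.act_maps = nn.Parameter(
            torch.randn(num_maps, num_parts, fmap_size, fmap_size))

    def forward(self, feats, part_attention):
        # feats: (B, C, H, W) backbone features; part_attention: (B, num_parts)
        f = F.normalize(feats, dim=1)
        p = F.normalize(self.parts, dim=1)
        # Part similarity maps: cosine similarity of each location to each part.
        sim = torch.einsum('bchw,kc->bkhw', f, p)            # (B, K, H, W)
        # Strengthen the parts most important for the query's category.
        sim = sim * part_attention[:, :, None, None]
        # Score each stored spatial activation pattern against the image.
        m = F.normalize(self.act_maps.flatten(1), dim=1)     # (M, K*H*W)
        scores = F.normalize(sim.flatten(1), dim=1) @ m.t()  # (B, M)
        return scores

kb = CompositionalKnowledgeBase()
feats = torch.randn(2, 64, 5, 5)
attn = torch.softmax(torch.randn(2, 64), dim=-1)
print(kb(feats, attn).shape)  # torch.Size([2, 128])
```
Because both dictionaries are parameters shared among the training classes, a novel class only needs to learn which stored parts and spatial patterns to attend to, which is what makes adaptation from very few examples feasible.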
Related papers
- HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot
Learning with Hopfield Network and Soft Mixture of Experts [25.930021907054797]
We propose a novel framework that combines the Modern Hopfield Network with a soft Mixture of Experts to classify previously unseen compositions of objects.
Our approach achieves state-of-the-art performance on several benchmarks, including MIT-States and UT-Zappos.
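The summary names two building blocks, a Modern Hopfield Network and a soft Mixture of Experts. A generic sketch of how such a pair could be chained is shown below; the wiring, single-step retrieval, sizes, and module names are assumptions rather than HOMOE's actual architecture.
```python
# Generic sketch of the two named ingredients: a modern Hopfield
# retrieval step followed by a soft mixture-of-experts head.
import torch
import torch.nn as nn

class HopfieldMoE(nn.Module):
    def __init__(self, dim=128, num_memories=256, num_experts=4,
                 num_classes=100, beta=2.0):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_memories, dim))  # stored patterns
        self.beta = beta
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            [nn.Linear(dim, num_classes) for _ in range(num_experts)])

    def forward(self, x):  # x: (B, dim)
        # One-step modern Hopfield retrieval: attend over stored patterns.
        attn = torch.softmax(self.beta * x @ self.memory.t(), dim=-1)
        retrieved = attn @ self.memory                     # (B, dim)
        # Soft mixture of experts on the retrieved representation.
        g = torch.softmax(self.gate(retrieved), dim=-1)    # (B, E)
        logits = torch.stack([e(retrieved) for e in self.experts], dim=1)
        return (g.unsqueeze(-1) * logits).sum(dim=1)       # (B, num_classes)

model = HopfieldMoE()
print(model(torch.randn(2, 128)).shape)  # torch.Size([2, 100])
```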
arXiv Detail & Related papers (2023-11-23T07:32:20Z)
- Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition [57.86960990337986]
One-shot skeleton-based action recognition poses unique challenges in learning a transferable representation from base classes to novel classes.
We propose a part-aware prototypical representation for one-shot skeleton-based action recognition.
We demonstrate the effectiveness of our method on two public skeleton-based action recognition datasets.
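A minimal sketch of the part-aware prototype idea, assuming one embedding per body part: class prototypes are stored per part, and a query is scored by its summed per-part distances. The paper's graph network and its specific part definitions are omitted.
```python
# Part-aware prototype matching sketch; the per-part embeddings and
# shapes below are assumptions, not the paper's pipeline.
import torch

def classify_by_part_prototypes(support, support_labels, query, num_classes):
    # support: (N, P, D), one embedding per body part; query: (Q, P, D)
    N, P, D = support.shape
    protos = torch.zeros(num_classes, P, D)
    for c in range(num_classes):
        protos[c] = support[support_labels == c].mean(dim=0)  # per-part mean
    # Distance of each query to each class = sum of per-part distances.
    dists = ((query[:, None] - protos[None]) ** 2).sum(dim=-1).sum(dim=-1)
    return (-dists).softmax(dim=-1)                           # (Q, num_classes)

support = torch.randn(5, 4, 32)   # 5-way 1-shot, 4 body parts
labels = torch.arange(5)
query = torch.randn(3, 4, 32)
print(classify_by_part_prototypes(support, labels, query, 5).shape)  # (3, 5)
```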
arXiv Detail & Related papers (2022-08-19T04:54:56Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- Rectifying the Shortcut Learning of Background: Shared Object Concentration for Few-Shot Image Recognition [101.59989523028264]
Few-shot image classification aims to utilize knowledge pretrained on a large-scale dataset to tackle a series of downstream classification tasks.
We propose COSOC, a novel few-shot learning framework, to automatically figure out foreground objects at both the pretraining and evaluation stages.
arXiv Detail & Related papers (2021-07-16T07:46:41Z)
- Learning Graph Embeddings for Compositional Zero-shot Learning [73.80007492964951]
In compositional zero-shot learning, the goal is to recognize unseen compositions of observed visual primitives (states and objects).
We propose a novel graph formulation called Compositional Graph Embedding (CGE) that learns image features and latent representations of visual primitives in an end-to-end manner.
By learning a joint compatibility that encodes semantics between concepts, our model allows for generalization to unseen compositions without relying on an external knowledge base like WordNet.
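A sketch of the joint-compatibility idea: an image and a state-object composition are scored by a dot product in a shared embedding space, so unseen pairs of seen primitives remain scoreable. The single averaging step below stands in for CGE's end-to-end graph propagation and is an assumption.
```python
# Joint compatibility sketch; the averaging of primitive embeddings
# replaces CGE's graph propagation and is an assumed simplification.
import torch
import torch.nn as nn

class CompatibilityScorer(nn.Module):
    def __init__(self, img_dim, emb_dim, num_states, num_objects, pairs):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, emb_dim)
        self.state_emb = nn.Embedding(num_states, emb_dim)
        self.obj_emb = nn.Embedding(num_objects, emb_dim)
        self.pairs = pairs  # (num_compositions, 2): state idx, object idx

    def forward(self, img_feats):  # (B, img_dim)
        # Composition embedding: aggregate its state and object nodes.
        comp = 0.5 * (self.state_emb(self.pairs[:, 0])
                      + self.obj_emb(self.pairs[:, 1]))   # (P, emb_dim)
        return self.img_proj(img_feats) @ comp.t()        # (B, P) scores

pairs = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]])
scorer = CompatibilityScorer(512, 64, num_states=2, num_objects=2, pairs=pairs)
print(scorer(torch.randn(3, 512)).shape)  # torch.Size([3, 4])
```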
arXiv Detail & Related papers (2021-02-03T10:11:03Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
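One way such unit analysis is commonly operationalized is to threshold a unit's activation map and measure its overlap with a concept segmentation mask; the sketch below illustrates that recipe, with the quantile threshold being an assumed choice rather than the paper's exact procedure.
```python
# Unit-to-concept matching sketch: IoU between a thresholded unit
# activation map and a concept mask (threshold scheme is an assumption).
import torch

def unit_concept_iou(activation, concept_mask, quantile=0.99):
    # activation: (H, W) unit activations; concept_mask: (H, W) boolean
    thresh = torch.quantile(activation.flatten(), quantile)
    unit_mask = activation > thresh
    inter = (unit_mask & concept_mask).sum().float()
    union = (unit_mask | concept_mask).sum().float()
    return (inter / union.clamp(min=1)).item()

act = torch.randn(56, 56)
mask = torch.zeros(56, 56, dtype=torch.bool)
mask[20:40, 20:40] = True
print(round(unit_concept_iou(act, mask), 4))
```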
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
- Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets a new state of the art in all these settings, demonstrating its efficacy and generalizability.
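A minimal sketch of cross-image co-attention under simple assumptions: a bilinear affinity couples the feature maps of two images of the same class so that each location in one image aggregates its matches in the other. The classifier head and the complementary attention over cross-image differences are omitted.
```python
# Cross-image co-attention sketch; the bilinear weight and the single
# attention direction shown here are assumptions.
import torch
import torch.nn as nn

class CoAttention(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.w = nn.Parameter(torch.eye(dim))  # bilinear affinity weight

    def forward(self, fa, fb):
        # fa, fb: (B, C, H, W) features of two images sharing a class.
        B, C, H, W = fa.shape
        a = fa.flatten(2)                           # (B, C, HW)
        b = fb.flatten(2)
        affinity = a.transpose(1, 2) @ self.w @ b   # (B, HW_a, HW_b)
        # Each location in image A aggregates the locations of image B
        # it matches, highlighting the co-occurring object pattern.
        fa_att = b @ affinity.softmax(dim=-1).transpose(1, 2)  # (B, C, HW_a)
        return fa_att.view(B, C, H, W)

coatt = CoAttention()
x, y = torch.randn(2, 256, 7, 7), torch.randn(2, 256, 7, 7)
print(coatt(x, y).shape)  # torch.Size([2, 256, 7, 7])
```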
arXiv Detail & Related papers (2020-07-03T21:53:46Z)
- Group Based Deep Shared Feature Learning for Fine-grained Image Classification [31.84610555517329]
We present a new deep network architecture that explicitly models shared features and removes their effect to achieve enhanced classification results.
We call this framework Group based deep Shared Feature Learning (GSFL) and the resulting learned network GSFL-Net.
A key benefit of our specialized autoencoder is its versatility: it can be combined with state-of-the-art fine-grained feature extraction models and trained together with them to directly improve their performance.
arXiv Detail & Related papers (2020-04-04T00:01:11Z)
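A sketch of the shared-versus-specific decomposition behind GSFL-Net: an autoencoder splits each feature into a shared code and a discriminative code, reconstructs the input from both, and classifies from the discriminative code alone. Layer sizes, losses, and the grouping mechanism are assumptions.
```python
# Shared/specific autoencoder sketch; sizes and losses are assumptions.
import torch
import torch.nn as nn

class SharedFeatureAE(nn.Module):
    def __init__(self, in_dim=512, code=128, num_classes=200):
        super().__init__()
        self.enc_shared = nn.Linear(in_dim, code)    # features common to the group
        self.enc_specific = nn.Linear(in_dim, code)  # features that discriminate
        self.dec = nn.Linear(2 * code, in_dim)
        self.cls = nn.Linear(code, num_classes)

    def forward(self, x):
        s, p = self.enc_shared(x), self.enc_specific(x)
        recon = self.dec(torch.cat([s, p], dim=-1))  # both parts reconstruct x
        return recon, self.cls(p)                    # classify without shared part

model = SharedFeatureAE()
recon, logits = model(torch.randn(4, 512))
print(recon.shape, logits.shape)  # torch.Size([4, 512]) torch.Size([4, 200])
```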
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.