Compositional Fine-Grained Low-Shot Learning
- URL: http://arxiv.org/abs/2105.10438v1
- Date: Fri, 21 May 2021 16:18:24 GMT
- Title: Compositional Fine-Grained Low-Shot Learning
- Authors: Dat Huynh and Ehsan Elhamifar
- Abstract summary: We develop a novel compositional generative model for zero- and few-shot learning to recognize fine-grained classes with a few or no training samples.
We propose a feature composition framework that learns to extract attribute features from training samples and combines them to construct fine-grained features for rare and unseen classes.
- Score: 58.53111180904687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a novel compositional generative model for zero- and few-shot
learning to recognize fine-grained classes with a few or no training samples.
Our key observation is that generating holistic features for fine-grained
classes fails to capture small attribute differences between classes.
Therefore, we propose a feature composition framework that learns to extract
attribute features from training samples and combines them to construct
fine-grained features for rare and unseen classes. Feature composition allows
us to not only selectively compose features of every class from only relevant
training samples, but also obtain diversity among composed features via
changing samples used for the composition. In addition, instead of building
holistic features for classes, we use our attribute features to form dense
representations capable of capturing fine-grained attribute details of classes.
We propose a training scheme that uses a discriminative model to construct
features that are subsequently used to train the model itself. Therefore, we
directly train the discriminative model on the composed features without
learning a separate generative model. We conduct experiments on four popular
datasets of DeepFashion, AWA2, CUB, and SUN, showing the effectiveness of our
method.
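The composition idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `compose_class_features`, the array layouts, the absolute-difference relevance measure, and the `top_k` random selection are all assumptions made for illustration. For each attribute of a target class, it picks a training sample whose attribute value is close to the target's and copies that sample's attribute feature, so that varying the picked samples yields diverse dense features.

```python
import numpy as np


def compose_class_features(attr_feats, sample_attrs, target_attrs,
                           n_compositions=4, top_k=3, seed=0):
    """Hypothetical sketch of attribute-wise feature composition.

    attr_feats:   (n_samples, n_attrs, dim) attribute features of training samples
    sample_attrs: (n_samples, n_attrs) attribute values of each sample's class
    target_attrs: (n_attrs,) attribute description of a rare or unseen class
    Returns an array of shape (n_compositions, n_attrs, dim).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_attrs, dim = attr_feats.shape

    # Assumed relevance measure: a sample is relevant for an attribute when
    # its attribute value is close to the target class's value.
    distance = np.abs(sample_attrs - target_attrs[None, :])   # (n_samples, n_attrs)

    # Keep the top_k most relevant samples per attribute.
    candidates = np.argsort(distance, axis=0)[:top_k]         # (k, n_attrs)
    k = candidates.shape[0]

    attr_index = np.arange(n_attrs)
    composed = np.empty((n_compositions, n_attrs, dim), dtype=attr_feats.dtype)
    for c in range(n_compositions):
        # Changing which candidate is used per attribute gives diversity.
        picks = rng.integers(0, k, size=n_attrs)
        chosen = candidates[picks, attr_index]                # (n_attrs,)
        composed[c] = attr_feats[chosen, attr_index]          # (n_attrs, dim)
    return composed
```

The output keeps one feature vector per attribute rather than a single pooled vector, mirroring the dense representation the abstract describes. In the paper the composed features are then used to train the discriminative model directly, a step this sketch does not cover.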
Related papers
- Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification [78.52704557647438]
We propose a novel FIne-grained Representation and Recomposition (FIRe$^{2}$) framework to tackle both limitations without any auxiliary annotation or data.
Experiments demonstrate that FIRe$^{2}$ can achieve state-of-the-art performance on five widely-used cloth-changing person Re-ID benchmarks.
arXiv Detail & Related papers (2023-08-21T12:59:48Z)
- Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize compositions with prior knowledge of known primitives (attribute and object).
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
arXiv Detail & Related papers (2023-08-08T03:24:21Z)
- Generalization Bounds for Few-Shot Transfer Learning with Pretrained Classifiers [26.844410679685424]
We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes.
We show that the few-shot error of the learned feature map on new classes is small in the case of class-feature-variability collapse.
arXiv Detail & Related papers (2022-12-23T18:46:05Z)
- Feature Weaken: Vicinal Data Augmentation for Classification [1.7013938542585925]
We use Feature Weaken to construct the vicinal data distribution with the same cosine similarity for model training.
This work can not only improve the classification performance and generalization of the model, but also stabilize the model training and accelerate the model convergence.
arXiv Detail & Related papers (2022-11-20T11:00:23Z)
- Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup [14.37428912254029]
Mixup is a data augmentation technique that relies on training using random convex combinations of data points and their labels.
We focus on classification problems in which each class may have multiple associated features (or views) that can be used to predict the class correctly.
Our main theoretical results demonstrate that, for a non-trivial class of data distributions with two features per class, training a 2-layer convolutional network using empirical risk minimization can lead to learning only one feature for almost all classes while training with a specific instantiation of Mixup succeeds in learning both features for every class.
arXiv Detail & Related papers (2022-10-24T18:11:37Z)
- Boosting Generative Zero-Shot Learning by Synthesizing Diverse Features with Attribute Augmentation [21.72622601533585]
We propose a novel framework to boost Zero-Shot Learning (ZSL) by synthesizing diverse features.
This method uses augmented semantic attributes to train the generative model, so as to simulate the real distribution of visual features.
We evaluate the proposed model on four benchmark datasets, observing significant performance improvement against the state-of-the-art.
arXiv Detail & Related papers (2021-12-23T14:32:51Z)
- Partner-Assisted Learning for Few-Shot Image Classification [54.66864961784989]
Few-shot Learning has been studied to mimic human visual capabilities and learn effective models without the need of exhaustive human annotation.
In this paper, we focus on the design of training strategy to obtain an elemental representation such that the prototype of each novel class can be estimated from a few labeled samples.
We propose a two-stage training scheme, which first trains a partner encoder to model pair-wise similarities and extract features serving as soft-anchors, and then trains a main encoder by aligning its outputs with soft-anchors while attempting to maximize classification performance.
arXiv Detail & Related papers (2021-09-15T22:46:19Z)
- Semi-Supervised Few-Shot Classification with Deep Invertible Hybrid Models [4.189643331553922]
We propose a deep invertible hybrid model which integrates discriminative and generative learning at a latent space level for semi-supervised few-shot classification.
Our main originality lies in our integration of these components at a latent space level, which is effective in preventing overfitting.
arXiv Detail & Related papers (2021-05-22T05:55:16Z)
- CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition [52.66360172784038]
We propose a clustering-based model, which considers all training samples at once, instead of optimizing for each instance individually.
We call the proposed method CLASTER and observe that it consistently improves over the state-of-the-art in all standard datasets.
arXiv Detail & Related papers (2021-01-18T12:46:24Z)
- A Few-Shot Sequential Approach for Object Counting [63.82757025821265]
We introduce a class attention mechanism that sequentially attends to objects in the image and extracts their relevant features.
The proposed technique is trained on point-level annotations and uses a novel loss function that disentangles class-dependent and class-agnostic aspects of the model.
We present our results on a variety of object-counting/detection datasets, including FSOD and MS COCO.
arXiv Detail & Related papers (2020-07-03T18:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.