Dual-View Data Hallucination with Semantic Relation Guidance for
Few-Shot Image Recognition
- URL: http://arxiv.org/abs/2401.07061v1
- Date: Sat, 13 Jan 2024 12:32:29 GMT
- Title: Dual-View Data Hallucination with Semantic Relation Guidance for
Few-Shot Image Recognition
- Authors: Hefeng Wu, Guangzhi Ye, Ziyang Zhou, Ling Tian, Qing Wang, Liang Lin
- Abstract summary: We propose a framework that exploits semantic relations to guide dual-view data hallucination for few-shot image recognition.
An instance-view data hallucination module hallucinates each sample of a novel class to generate new data.
A prototype-view data hallucination module exploits a semantic-aware measure to estimate the prototype of a novel class.
- Score: 52.19737194653999
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning to recognize novel concepts from just a few image samples is very
challenging, as the learned model easily overfits the few available data and
generalizes poorly. One promising but underexplored solution is to compensate
for the scarcity of novel-class data by generating plausible samples. However,
most existing works along this line exploit visual information only, leaving
the generated data susceptible to challenging factors contained in the few
available samples. Aware of the semantic information in the textual modality
that reflects human concepts, this work proposes a novel framework that
exploits semantic relations to guide dual-view data hallucination for few-shot
image recognition. The proposed framework generates more diverse and plausible
data samples for novel classes through effective information transfer from base
classes. Specifically, an instance-view data hallucination module hallucinates
each sample of a novel class to generate new data by employing local semantic
correlated attention and global semantic feature fusion derived from base
classes. Meanwhile, a prototype-view data hallucination module exploits a
semantic-aware measure to estimate the prototype of a novel class and the
associated distribution from the few samples, thereby yielding the prototype as
a more stable sample and enabling the resampling of a large number of samples.
We conduct extensive experiments and comparisons with state-of-the-art methods
on several popular few-shot benchmarks to verify the effectiveness of the
proposed framework.
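The prototype-view idea of treating an estimated class prototype and distribution as a source of extra training samples can be sketched as follows. This is a minimal illustration in the spirit of distribution calibration, not the paper's semantic-relation-guided procedure: the function name, the nearest-base-class heuristic, and the Gaussian sampling assumption are ours, not the authors'.

```python
import numpy as np

def hallucinate_prototype_samples(novel_feats, base_means, base_covs,
                                  k=2, n_samples=100, alpha=0.1, rng=None):
    """Estimate a novel-class prototype and distribution from a few feature
    vectors, borrowing statistics from the k nearest base classes, then
    resample synthetic features from the calibrated distribution.

    Illustrative sketch only; the paper uses a semantic-aware measure
    rather than this Euclidean nearest-base-class heuristic.
    """
    rng = np.random.default_rng(rng)
    proto = novel_feats.mean(axis=0)                      # naive prototype
    # Find the k base classes whose mean features lie closest to the prototype.
    dists = np.linalg.norm(base_means - proto, axis=1)
    nearest = np.argsort(dists)[:k]
    # Calibrate the novel-class mean and covariance with base-class statistics;
    # alpha adds a small diagonal term to keep the covariance well-conditioned.
    mean = (base_means[nearest].sum(axis=0) + proto) / (k + 1)
    cov = base_covs[nearest].mean(axis=0) + alpha * np.eye(proto.shape[0])
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    return mean, samples
```

The resampled features can then be fed, together with the few real samples, to a simple classifier (e.g. logistic regression) trained in feature space.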
Related papers
- Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural languages.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for pre-trained model application and experiment on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z) - Diversified in-domain synthesis with efficient fine-tuning for few-shot
classification [64.86872227580866]
Few-shot image classification aims to learn an image classifier using only a small set of labeled examples per class.
We propose DISEF, a novel approach which addresses the generalization challenge in few-shot learning using synthetic data.
We validate our method on ten different benchmarks, consistently outperforming baselines and establishing a new state of the art for few-shot classification.
arXiv Detail & Related papers (2023-12-05T17:18:09Z) - VDC: Versatile Data Cleanser based on Visual-Linguistic Inconsistency by Multimodal Large Language Models [46.72546879204724]
In the real-world, datasets may contain dirty samples, such as poisoned samples from backdoor attack, noisy labels in crowdsourcing, and even hybrids of them.
Existing detectors focus only on detecting poisoned samples or noisy labels, and are often prone to weak generalization when dealing with dirty samples from other domains.
We propose a versatile data cleanser (VDC) leveraging the superior capabilities of multimodal large language models (MLLMs) in cross-modal alignment and reasoning.
arXiv Detail & Related papers (2023-09-28T07:37:18Z) - Semantic Prompt for Few-Shot Image Recognition [76.68959583129335]
We propose a novel Semantic Prompt (SP) approach for few-shot learning.
The proposed approach achieves promising results, improving the 1-shot learning accuracy by 3.67% on average.
arXiv Detail & Related papers (2023-03-24T16:32:19Z) - Uncertainty-Aware Multi-View Representation Learning [53.06828186507994]
We devise a novel unsupervised multi-view learning approach, termed Dynamic Uncertainty-Aware Networks (DUA-Nets).
Guided by the uncertainty of data estimated from the generation perspective, intrinsic information from multiple views is integrated to obtain noise-free representations.
Our model achieves superior performance in extensive experiments and shows robustness to noisy data.
arXiv Detail & Related papers (2022-01-15T07:16:20Z) - Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method via optimizing and fast adapting the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z) - UniT: Unified Knowledge Transfer for Any-shot Object Detection and
Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z) - Interpretable Time-series Classification on Few-shot Samples [27.05851877375113]
This paper proposes an interpretable neural-based framework, namely Dual Prototypical Shapelet Networks (DPSN), for few-shot time-series classification.
DPSN interprets the model from dual granularity: 1) global overview using representative time series samples, and 2) local highlights using discriminative shapelets.
We have derived 18 few-shot TSC datasets from public benchmark datasets and evaluated the proposed method by comparing it with baselines.
arXiv Detail & Related papers (2020-06-03T03:47:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.