Few-shot Learning with Global Relatedness Decoupled-Distillation
- URL: http://arxiv.org/abs/2107.05583v1
- Date: Mon, 12 Jul 2021 17:01:11 GMT
- Title: Few-shot Learning with Global Relatedness Decoupled-Distillation
- Authors: Yuan Zhou and Yanrong Guo and Shijie Hao and Richang Hong and Zhenjun Zha and Meng Wang
- Abstract summary: We propose a new Global Relatedness Decoupled-Distillation (GRDD) method using the global category knowledge and the Relatedness Decoupled-Distillation (RDD) strategy.
Our GRDD learns new visual concepts quickly by mimicking how humans learn, i.e. from the deep knowledge distilled by a teacher.
- Score: 47.78903405454224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the success that metric learning based approaches have achieved in
few-shot learning, recent works reveal the ineffectiveness of their episodic
training mode. In this paper, we point out two potential reasons for this
problem: 1) the random episodic labels can only provide limited supervision
information, while the relatedness information between the query and support
samples is not fully exploited; 2) the meta-learner is usually constrained by
the limited contextual information of the local episode. To overcome these
problems, we propose a new Global Relatedness Decoupled-Distillation (GRDD)
method using the global category knowledge and the Relatedness
Decoupled-Distillation (RDD) strategy. Our GRDD learns new visual concepts
quickly by mimicking how humans learn, i.e. from the deep knowledge
distilled by a teacher. More specifically, we first train a global learner
on the entire base subset using category labels as supervision to leverage the
global context information of the categories. Then, the well-trained global
learner is used to model the query-support relatedness under global
dependencies. Finally, the distilled global query-support relatedness is
explicitly used to train the meta-learner using the RDD strategy, with the goal
of making the meta-learner more discriminative. The RDD strategy decouples
the dense query-support relatedness into groups of sparse decoupled
relatedness, where each group considers only the relatedness of a single
support sample with all the query samples. By distilling the
sparse decoupled relatedness group by group, sharper relatedness can be
effectively distilled to the meta-learner, thereby facilitating the learning of
a discriminative meta-learner. We conduct extensive experiments on the
miniImageNet and CIFAR-FS datasets, which demonstrate the state-of-the-art performance
of our GRDD method.
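To make the training objective concrete, here is a minimal PyTorch sketch of the RDD distillation step as the abstract describes it. This is not the authors' released code: the cosine-similarity relatedness, the temperature `tau`, and the KL-divergence group distillation are assumptions filled in from the abstract, with the teacher embeddings assumed to come from the global learner trained on the base subset and one sparse group formed per support sample.

```python
import torch
import torch.nn.functional as F

def rdd_loss(student_q, student_s, teacher_q, teacher_s, tau=4.0):
    """Relatedness Decoupled-Distillation, sketched from the abstract.

    student_*/teacher_*: query (Nq, D) and support (Ns, D) embeddings from
    the meta-learner (student) and the global learner (teacher). The dense
    (Ns, Nq) relatedness matrix is decoupled into Ns sparse groups, one per
    support sample, and distilled group by group.
    """
    # Dense query-support relatedness (cosine similarity is an assumption).
    rel_t = F.normalize(teacher_s, dim=1) @ F.normalize(teacher_q, dim=1).T
    rel_s = F.normalize(student_s, dim=1) @ F.normalize(student_q, dim=1).T

    loss = rel_t.new_zeros(())
    for i in range(rel_t.size(0)):  # one group: support i vs. all queries
        p_t = F.softmax(rel_t[i] / tau, dim=0)           # teacher group
        log_p_s = F.log_softmax(rel_s[i] / tau, dim=0)   # student group
        loss = loss + F.kl_div(log_p_s, p_t, reduction="sum") * tau ** 2
    return loss / rel_t.size(0)
```

In a full training step this term would presumably be combined with the episodic classification loss of the meta-learner, with the global learner's parameters kept frozen.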
Related papers
- GLRT-Based Metric Learning for Remote Sensing Object Retrieval [19.210692452537007]
Existing CBRSOR methods fail to exploit global statistical information during both the training and test stages.
Inspired by the Neyman-Pearson theorem, we propose a generalized likelihood ratio test-based metric learning (GLRTML) approach.
arXiv Detail & Related papers (2024-10-08T07:53:30Z)
- Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge.
We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks.
Specifically, we use pre-trained language models (PLMs) to generate semantic targets for each class, which are frozen and serve as supervision signals (a minimal sketch follows this entry).
arXiv Detail & Related papers (2024-03-24T12:41:58Z)
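As an illustration of the frozen-semantic-target idea in the entry above, here is a minimal PyTorch sketch. The `encode_class_names` stub stands in for a real PLM text encoder (it returns fixed random vectors here), and the cosine loss is one plausible way to use the frozen targets as supervision; both are assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def encode_class_names(class_names, dim=512):
    # Hypothetical stand-in for a real PLM text encoder (in practice you
    # would embed each class name with, e.g., BERT or a CLIP text tower).
    g = torch.Generator().manual_seed(0)
    return F.normalize(torch.randn(len(class_names), dim, generator=g), dim=1)

class_names = ["cat", "dog", "car"]
semantic_targets = encode_class_names(class_names)  # frozen, never updated

def language_guided_loss(image_features, labels):
    # Pull each image feature toward its class's frozen semantic target.
    f = F.normalize(image_features, dim=1)
    return (1.0 - (f * semantic_targets[labels]).sum(dim=1)).mean()
```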
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch (a minimal sketch follows this entry).
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
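A minimal sketch of the memory bank-based MMD alignment mentioned in the entry above, assuming an RBF kernel and a simple FIFO bank; the actual DaC loss and bank mechanics may differ.

```python
import torch

def rbf_mmd(x, y, sigma=1.0):
    # Biased MMD^2 estimate between two feature sets with an RBF kernel.
    def k(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

class MemoryBank:
    # FIFO queue of past source-like features (size/init are assumptions).
    def __init__(self, size, dim):
        self.buf = torch.zeros(size, dim)
        self.ptr = 0

    @torch.no_grad()
    def update(self, feats):
        n = feats.size(0)
        idx = (self.ptr + torch.arange(n)) % self.buf.size(0)
        self.buf[idx] = feats
        self.ptr = (self.ptr + n) % self.buf.size(0)

# Alignment term: loss = rbf_mmd(bank.buf, target_specific_features)
```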
- Distilling Object Detectors With Global Knowledge [30.67375886569278]
Existing methods regard the knowledge as per-instance features or their relations, i.e., instance-level knowledge drawn only from the teacher model.
A more intrinsic approach is to measure the representations of instances w.r.t. a group of common basis vectors in the two feature spaces of the teacher and the student detectors.
We show that our method achieves the best performance for distilling object detectors across various datasets and backbones.
arXiv Detail & Related papers (2022-10-17T12:44:33Z)
- The Role of Global Labels in Few-Shot Classification and How to Infer Them [55.64429518100676]
Few-shot learning is a central problem in meta-learning, where learners must quickly adapt to new tasks.
We propose Meta Label Learning (MeLa), a novel algorithm that infers global labels and obtains robust few-shot models via standard classification.
arXiv Detail & Related papers (2021-08-09T14:07:46Z)
- Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points.
The difficulty lies in that limited data from new classes not only lead to significant overfitting issues but also exacerbate the notorious catastrophic forgetting problems.
We propose the Continually Evolved Classifier (CEC), which employs a graph model to propagate context information between classifiers for adaptation (a minimal sketch follows this entry).
arXiv Detail & Related papers (2021-04-07T10:54:51Z)
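One way to read "a graph model that propagates context information between classifiers" in the entry above is attention over the classifier prototypes. The sketch below uses a single self-attention layer with a residual update; this is an assumption for illustration, not the CEC architecture itself.

```python
import torch
import torch.nn as nn

class ClassifierGraph(nn.Module):
    # Propagates context among classifier prototypes via self-attention;
    # a sketch of the idea, not the paper's exact graph model.
    def __init__(self, dim, heads=4):  # dim must be divisible by heads
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, prototypes):           # (num_classes, dim)
        x = prototypes.unsqueeze(0)          # (1, num_classes, dim)
        ctx, _ = self.attn(x, x, x)          # every class attends to all
        return (x + ctx).squeeze(0)          # residual update
```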
- Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination [68.83098015578874]
We integrate between-instance similarity into contrastive learning, not directly by instance grouping, but by cross-level discrimination.
CLD effectively brings unsupervised learning closer to natural data and real-world applications.
It sets a new state of the art on self-supervision, semi-supervision, and transfer learning benchmarks, and beats MoCo v2 and SimCLR on every reported metric.
arXiv Detail & Related papers (2020-08-09T21:13:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all generated summaries) and is not responsible for any consequences arising from its use.