Learning From Multiple Experts: Self-paced Knowledge Distillation for
Long-tailed Classification
- URL: http://arxiv.org/abs/2001.01536v3
- Date: Mon, 21 Sep 2020 02:44:16 GMT
- Title: Learning From Multiple Experts: Self-paced Knowledge Distillation for
Long-tailed Classification
- Authors: Liuyu Xiang, Guiguang Ding and Jungong Han
- Abstract summary: We propose a self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME).
We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.
We conduct extensive experiments and demonstrate that our method achieves superior performance compared to state-of-the-art methods.
- Score: 106.08067870620218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In real-world scenarios, data tends to exhibit a long-tailed distribution,
which increases the difficulty of training deep networks. In this paper, we
propose a novel self-paced knowledge distillation framework, termed Learning
From Multiple Experts (LFME). Our method is inspired by the observation that
networks trained on less imbalanced subsets of the distribution often yield
better performance than their jointly-trained counterparts. We refer to these
models as 'Experts', and the proposed LFME framework aggregates the knowledge
from multiple 'Experts' to learn a unified student model. Specifically, the
proposed framework involves two levels of adaptive learning schedules:
Self-paced Expert Selection and Curriculum Instance Selection, so that the
knowledge is adaptively transferred to the 'Student'. We conduct extensive
experiments and demonstrate that our method achieves superior
performance compared to state-of-the-art methods. We also show that our method
can be easily plugged into state-of-the-art long-tailed classification
algorithms for further improvements.
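The two-level schedule the abstract describes can be pictured as a weighted distillation loss over several experts. The following is a minimal stdlib-only sketch under stated assumptions: the linear weight-decay rule in `self_paced_weights` and all function names here are illustrative, not the paper's exact formulation, and curriculum instance selection is omitted.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions given as lists.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def self_paced_weights(student_acc, expert_accs):
    # Illustrative self-paced expert selection: an expert's weight decays
    # linearly as the student's accuracy approaches that expert's accuracy,
    # so the student gradually stops imitating experts it has caught up with.
    return [max(0.0, 1.0 - student_acc / acc) for acc in expert_accs]

def multi_expert_distillation_loss(student_logits, expert_logits_list,
                                   expert_weights, temperature=2.0):
    # Weighted sum of per-expert KL terms: experts the student still lags
    # behind (higher weight) contribute more to the distillation signal.
    q = softmax(student_logits, temperature)
    loss = 0.0
    for logits, w in zip(expert_logits_list, expert_weights):
        p = softmax(logits, temperature)
        loss += w * kl_divergence(p, q)
    return loss
```

For example, a student at 40% accuracy distilling from experts at 50% and 80% accuracy would, under this illustrative schedule, weight them 0.2 and 0.5 respectively, leaning harder on the stronger expert.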
Related papers
- Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching.
An anchor branch is first trained to provide insights into the data properties.
A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
arXiv Detail & Related papers (2024-04-28T08:44:28Z)
- Reinforcement Learning Based Multi-modal Feature Fusion Network for Novel Class Discovery [47.28191501836041]
In this paper, we employ a Reinforcement Learning framework to simulate the cognitive processes of humans.
We also deploy a Member-to-Leader Multi-Agent framework to extract and fuse features from multi-modal information.
We demonstrate the performance of our approach in both the 3D and 2D domains on the OS-MN40, OS-MN40-Miss, and CIFAR-10 datasets.
arXiv Detail & Related papers (2023-08-26T07:55:32Z)
- Propheter: Prophetic Teacher Guided Long-Tailed Distribution Learning [44.947984354108094]
We propose an innovative long-tailed learning paradigm that breaks the bottleneck by guiding the learning of deep networks with external prior knowledge.
The proposed prophetic paradigm acts as a promising solution to the challenge of limited class knowledge in long-tailed datasets.
arXiv Detail & Related papers (2023-04-09T02:02:19Z)
- Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation [31.294947552032088]
Data-free Knowledge Distillation (DFKD) has gained popularity recently, with the fundamental idea of carrying out knowledge transfer from a Teacher to a Student neural network in the absence of training data.
We propose a meta-learning inspired framework by treating the task of Knowledge-Acquisition (learning from newly generated samples) and Knowledge-Retention (retaining knowledge on previously met samples) as meta-train and meta-test.
arXiv Detail & Related papers (2023-02-28T03:50:56Z)
- Semi-supervised Semantic Segmentation with Mutual Knowledge Distillation [20.741353967123366]
We propose a new consistency regularization framework, termed mutual knowledge distillation (MKD).
We use the pseudo-labels generated by a mean teacher to supervise the student network to achieve a mutual knowledge distillation between the two branches.
Our framework outperforms previous state-of-the-art (SOTA) methods under various semi-supervised settings.
arXiv Detail & Related papers (2022-08-24T12:47:58Z)
- Guided Deep Metric Learning [0.9786690381850356]
We propose a novel approach to DML that we call Guided Deep Metric Learning.
The proposed method achieves better manifold generalization and representation, with up to 40% improvement.
arXiv Detail & Related papers (2022-06-04T17:34:11Z)
- Class-Balanced Distillation for Long-Tailed Visual Recognition [100.10293372607222]
Real-world imagery is often characterized by a significant imbalance of the number of images per class, leading to long-tailed distributions.
In this work, we introduce a new framework based on the key observation that a feature representation learned with instance sampling is far from optimal in a long-tailed setting.
Our main contribution is a new training method that leverages knowledge distillation to enhance feature representations.
arXiv Detail & Related papers (2021-04-12T08:21:03Z)
- Knowledge Distillation Meets Self-Supervision [109.6400639148393]
Knowledge distillation involves extracting "dark knowledge" from a teacher network to guide the learning of a student network.
We show that the seemingly different self-supervision task can serve as a simple yet powerful solution.
By exploiting the similarity between those self-supervision signals as an auxiliary task, one can effectively transfer the hidden information from the teacher to the student.
arXiv Detail & Related papers (2020-06-12T12:18:52Z)
- Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach [55.83558520598304]
We propose a brand new solution to reuse experiences and transfer value functions among multiple students via model distillation.
We also describe how to design an efficient communication protocol to exploit heterogeneous knowledge.
Our proposed framework, namely Learning and Teaching Categorical Reinforcement, shows promising performance in stabilizing and accelerating learning progress.
arXiv Detail & Related papers (2020-02-06T11:31:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.