Learning Knowledge Representation with Meta Knowledge Distillation for
Single Image Super-Resolution
- URL: http://arxiv.org/abs/2207.08356v1
- Date: Mon, 18 Jul 2022 02:41:04 GMT
- Title: Learning Knowledge Representation with Meta Knowledge Distillation for
Single Image Super-Resolution
- Authors: Han Zhu, Zhenzhong Chen, Shan Liu
- Abstract summary: We propose a model-agnostic meta knowledge distillation method under the teacher-student architecture for the single image super-resolution task.
Experiments conducted on various single image super-resolution datasets demonstrate that our proposed method outperforms existing distillation methods based on predefined knowledge representations.
- Score: 82.89021683451432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation (KD), which can efficiently transfer knowledge from a
cumbersome network (teacher) to a compact network (student), has demonstrated
its advantages in some computer vision applications. The representation of knowledge is vital for knowledge transfer and student learning, yet it is generally defined in hand-crafted ways or taken directly from intermediate features.
In this paper, we propose a model-agnostic meta knowledge distillation method under the teacher-student architecture for the single image super-resolution task. It provides a more flexible and accurate way for the teacher to transmit knowledge in accordance with the abilities of the student via knowledge representation networks (KRNets) with learnable parameters.
To make the knowledge representation more responsive to the student's requirements, we learn the transformation from intermediate outputs to transferred knowledge inside the KRNets, using the student features and the correlation between teacher and student. Specifically, texture-aware dynamic kernels are generated and used to extract the texture features to be improved together with the corresponding teacher guidance, which decomposes the distillation problem into texture-wise supervision and further promotes the recovery of high-frequency details.
In addition, the KRNets are optimized in a meta-learning manner to ensure that knowledge transfer and student learning both improve the reconstruction quality of the student. Experiments conducted on various single image super-resolution datasets demonstrate that our proposed method outperforms existing distillation methods based on predefined knowledge representations, and helps super-resolution algorithms achieve better reconstruction quality without introducing any additional inference cost.
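Since the abstract only sketches the components, the following is a minimal, hedged PyTorch sketch of the overall idea, not the authors' implementation: a hypothetical KRNet generates texture-aware dynamic kernels from the student features and a crude teacher-student correlation and applies them to the teacher features to form a learnable distillation target, and a simplified meta step updates the KRNet so that a virtual student update actually improves reconstruction. The module names (TinySRNet, KRNet), the correlation term, the kernel-generation head, and the one-step meta schedule are all illustrative assumptions.

```python
# Illustrative sketch only -- not the authors' released code. The KRNet design,
# the correlation term, and the meta-update schedule below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call


class TinySRNet(nn.Module):
    """Toy x2 SR backbone used only to make the sketch self-contained."""
    def __init__(self, channels: int = 32, scale: int = 2):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
        self.tail = nn.Sequential(
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1), nn.PixelShuffle(scale))

    def forward(self, x):
        feat = self.body(self.head(x))
        return feat, self.tail(feat)          # intermediate features, SR output


class KRNet(nn.Module):
    """Hypothetical knowledge-representation network: generates texture-aware
    dynamic kernels from the student features and a simple teacher-student
    correlation, then applies them to the teacher features to produce the
    transferred knowledge (a learnable distillation target)."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.k = kernel_size
        self.kernel_head = nn.Sequential(
            nn.Conv2d(channels * 2, channels, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, kernel_size * kernel_size, 1))

    def forward(self, f_teacher, f_student):
        corr = f_teacher * f_student                                  # element-wise correlation
        kernels = F.softmax(self.kernel_head(torch.cat([f_student, corr], 1)), dim=1)
        b, c, h, w = f_teacher.shape
        patches = F.unfold(f_teacher, self.k, padding=self.k // 2)
        patches = patches.view(b, c, self.k * self.k, h, w)
        return (patches * kernels.unsqueeze(1)).sum(dim=2)            # (B, C, H, W) target


def meta_distillation_step(student, teacher, krnet, opt_student, opt_krnet,
                           lr_batch, hr_batch, inner_lr=1e-4):
    with torch.no_grad():
        f_t, _ = teacher(lr_batch)

    # Inner (virtual) student update driven by the KRNet-defined knowledge.
    params = dict(student.named_parameters())
    f_s, sr = functional_call(student, params, (lr_batch,))
    inner_loss = F.l1_loss(sr, hr_batch) + F.l1_loss(f_s, krnet(f_t, f_s))
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    updated = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}

    # Outer step: update the KRNet so that the *updated* student reconstructs better.
    _, sr_after = functional_call(student, updated, (lr_batch,))
    opt_krnet.zero_grad()
    F.l1_loss(sr_after, hr_batch).backward()   # gradient reaches the KRNet through the inner step
    opt_krnet.step()

    # Real student update using the refreshed KRNet (target detached).
    f_s, sr = student(lr_batch)
    loss = F.l1_loss(sr, hr_batch) + F.l1_loss(f_s, krnet(f_t, f_s).detach())
    opt_student.zero_grad()
    loss.backward()
    opt_student.step()
    return loss.item()
```

In this toy setup the KRNet would be built with the shared feature width, e.g. krnet = KRNet(32) for two TinySRNet(32) models; when teacher and student widths differ, a 1x1 adapter (not shown) would be needed, and the paper's actual texture decomposition and optimization schedule are more involved than this single-step sketch.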
Related papers
- Student-Oriented Teacher Knowledge Refinement for Knowledge Distillation [11.754014876977422]
This paper introduces a novel student-oriented perspective, refining the teacher's knowledge to better align with the student's needs.
We present the Student-Oriented Knowledge Distillation (SoKD), which incorporates a learnable feature augmentation strategy during training.
We also deploy the Distinctive Area Detection Module (DAM) to identify areas of mutual interest between the teacher and student.
arXiv Detail & Related papers (2024-09-27T14:34:08Z)
- Impact of a DCT-driven Loss in Attention-based Knowledge-Distillation for Scene Recognition [64.29650787243443]
We propose and analyse the use of a 2D frequency transform of the activation maps before transferring them.
This strategy enhances knowledge transferability in tasks such as scene recognition.
We publicly release the training and evaluation framework used along this paper at http://www.vpu.eps.uam.es/publications/DCTBasedKDForSceneRecognition.
arXiv Detail & Related papers (2022-05-04T11:05:18Z)
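The DCT-driven entry above only states that activation maps are moved to a 2D frequency domain before transfer; the snippet below is a hedged sketch of that idea (an orthonormal DCT-II implemented as matrix products, applied to channel-averaged attention maps before an L2 matching loss). The attention pooling, loss choice, and normalisation are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: compare teacher and student activation maps in the 2D-DCT
# domain instead of the spatial domain. Pooling and loss are assumptions.
import math
import torch
import torch.nn.functional as F


def dct_matrix(n: int, device=None, dtype=torch.float32) -> torch.Tensor:
    """Orthonormal DCT-II basis matrix of size (n, n)."""
    k = torch.arange(n, device=device, dtype=dtype).unsqueeze(1)   # frequency index
    i = torch.arange(n, device=device, dtype=dtype).unsqueeze(0)   # sample index
    basis = torch.cos(math.pi * (i + 0.5) * k / n) * math.sqrt(2.0 / n)
    basis[0] /= math.sqrt(2.0)                                     # orthonormal scaling of DC row
    return basis


def dct2d(x: torch.Tensor) -> torch.Tensor:
    """2D DCT over the last two dimensions of a (B, C, H, W) tensor."""
    ch = dct_matrix(x.shape[-2], x.device, x.dtype)
    cw = dct_matrix(x.shape[-1], x.device, x.dtype)
    return ch @ x @ cw.t()


def dct_kd_loss(f_student: torch.Tensor, f_teacher: torch.Tensor) -> torch.Tensor:
    """Distillation loss on frequency-domain, channel-averaged attention maps.
    Assumes the two feature maps share the same spatial resolution."""
    a_s = dct2d(f_student.pow(2).mean(dim=1, keepdim=True))
    a_t = dct2d(f_teacher.pow(2).mean(dim=1, keepdim=True))
    return F.mse_loss(a_s, a_t)
```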
- Knowledge Distillation By Sparse Representation Matching [107.87219371697063]
We propose Sparse Representation Matching (SRM) to transfer intermediate knowledge from one Convolutional Neural Network (CNN) to another by utilizing sparse representations.
We formulate SRM as a neural processing block, which can be efficiently optimized using gradient descent and integrated into any CNN in a plug-and-play manner.
Our experiments demonstrate that SRM is robust to architectural differences between the teacher and student networks, and outperforms other KD techniques across several datasets.
arXiv Detail & Related papers (2021-03-31T11:47:47Z)
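The SRM summary above gives only the high-level idea; the sketch below shows one plausible reading: a small differentiable block encodes features into sparse codes over a learned dictionary via soft-thresholding, the student's codes are matched to the teacher's, and a reconstruction term keeps the codes informative. The dictionary size, single-iteration encoder, and loss composition are assumptions made for illustration, not the paper's actual block.

```python
# Hedged sketch of sparse-representation-based feature matching; all design
# choices (dictionary size, one soft-thresholding step, loss weights) are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseCoder(nn.Module):
    """Differentiable block mapping (B, C, H, W) features to (B, K, H, W) sparse
    codes over a learned dictionary, plus a decoded reconstruction."""
    def __init__(self, in_channels: int, dict_size: int = 128, threshold: float = 0.1):
        super().__init__()
        self.encode = nn.Conv2d(in_channels, dict_size, 1, bias=False)
        self.decode = nn.Conv2d(dict_size, in_channels, 1, bias=False)   # dictionary atoms
        self.threshold = nn.Parameter(torch.tensor(threshold))

    def forward(self, feats):
        z = self.encode(feats)
        z = torch.sign(z) * F.relu(z.abs() - self.threshold)   # soft-threshold -> sparse codes
        return z, self.decode(z)


def srm_style_loss(coder_t, coder_s, f_teacher, f_student):
    """Student codes follow the (detached) teacher codes; reconstruction terms
    prevent the codes from collapsing to zero."""
    z_t, rec_t = coder_t(f_teacher)
    z_s, rec_s = coder_s(f_student)
    match = F.l1_loss(z_s, z_t.detach())
    recon = F.mse_loss(rec_t, f_teacher) + F.mse_loss(rec_s, f_student)
    return match + recon
```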
- Deep Ensemble Collaborative Learning by using Knowledge-transfer Graph for Fine-grained Object Classification [9.49864824780503]
The performance of ensembles of networks that have undergone mutual learning does not improve significantly over that of normal ensembles without mutual learning.
This may be due to the relationship between the knowledge in mutual learning and the individuality of the networks in the ensemble.
We propose an ensemble method using knowledge transfer to improve the accuracy of ensembles by introducing a loss design that promotes diversity among networks in mutual learning.
arXiv Detail & Related papers (2021-03-27T08:56:00Z)
- Learning Student-Friendly Teacher Networks for Knowledge Distillation [50.11640959363315]
We propose a novel knowledge distillation approach to facilitate the transfer of dark knowledge from a teacher to a student.
Contrary to most of the existing methods that rely on effective training of student models given pretrained teachers, we aim to learn the teacher models that are friendly to students.
arXiv Detail & Related papers (2021-02-12T07:00:17Z)
- Point Adversarial Self Mining: A Simple Method for Facial Expression Recognition [79.75964372862279]
We propose Point Adversarial Self Mining (PASM) to improve the recognition accuracy in facial expression recognition.
PASM uses a point adversarial attack method and a trained teacher network to locate the most informative position related to the target task.
The adaptive learning-material generation and the teacher/student updates can be conducted multiple times, improving the network capability iteratively.
arXiv Detail & Related papers (2020-08-26T06:39:24Z)
- Knowledge Distillation Meets Self-Supervision [109.6400639148393]
Knowledge distillation involves extracting "dark knowledge" from a teacher network to guide the learning of a student network.
We show that the seemingly different self-supervision task can serve as a simple yet powerful solution.
By exploiting the similarity between those self-supervision signals as an auxiliary task, one can effectively transfer the hidden information from the teacher to the student.
arXiv Detail & Related papers (2020-06-12T12:18:52Z)
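For the last entry, a hedged sketch of the similarity-transfer idea: embeddings of original and augmented views are produced by both networks, and the student is trained to mimic the teacher's pairwise similarity distribution. The cosine similarity, temperature, and KL objective below are illustrative assumptions; the paper's full method is richer than this single loss term.

```python
# Hedged sketch of transferring self-supervision similarity structure from
# teacher to student; temperature and similarity choice are assumptions.
import torch
import torch.nn.functional as F


def self_supervised_kd_loss(emb_teacher: torch.Tensor,
                            emb_student: torch.Tensor,
                            temperature: float = 4.0) -> torch.Tensor:
    """emb_* : (N, D) embeddings of the same N inputs (original + augmented views).
    Returns a KL divergence between softened teacher and student similarity rows."""
    t = F.normalize(emb_teacher, dim=1)
    s = F.normalize(emb_student, dim=1)
    sim_t = t @ t.t() / temperature          # (N, N) teacher cosine similarities
    sim_s = s @ s.t() / temperature          # (N, N) student cosine similarities
    # Down-weight trivial self-similarity so each row compares against the other views.
    mask = torch.eye(sim_t.size(0), dtype=torch.bool, device=sim_t.device)
    sim_t = sim_t.masked_fill(mask, -1e4)
    sim_s = sim_s.masked_fill(mask, -1e4)
    p_t = F.softmax(sim_t, dim=1)
    log_p_s = F.log_softmax(sim_s, dim=1)
    return F.kl_div(log_p_s, p_t, reduction='batchmean') * temperature ** 2
```

In practice the embeddings would come from small projection heads on top of each backbone (not shown), and this auxiliary loss would be combined with the ordinary distillation and task losses.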