Feature-domain Adaptive Contrastive Distillation for Efficient Single
Image Super-Resolution
- URL: http://arxiv.org/abs/2211.15951v2
- Date: Fri, 24 Mar 2023 06:05:59 GMT
- Authors: HyeonCheol Moon, JinWoo Jeong, SungJei Kim
- Abstract summary: CNN-based SISR requires numerous parameters and high computational cost to achieve better performance.
Knowledge Distillation (KD) transfers a teacher's useful knowledge to a student.
We propose a feature-domain adaptive contrastive distillation (FACD) method for efficiently training lightweight student SISR networks.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recently, CNN-based SISR has required numerous parameters and high
computational cost to achieve better performance, limiting its applicability to
resource-constrained devices such as mobile phones. As one way to make the
network efficient, Knowledge Distillation (KD), which transfers a teacher's
useful knowledge to a student, is currently being studied. More recently, KD for
SISR utilizes Feature Distillation (FD) to minimize the Euclidean distance loss
of feature maps between teacher and student networks, but it does not
sufficiently consider how to effectively and meaningfully deliver knowledge
from teacher to improve the student performance at given network capacity
constraints. In this paper, we propose a feature-domain adaptive contrastive
distillation (FACD) method for efficiently training lightweight student SISR
networks. We show the limitations of the existing FD methods using Euclidean
distance loss, and propose a feature-domain contrastive loss that makes a
student network learn richer information from the teacher's representation in
the feature domain. In addition, we propose an adaptive distillation that
selectively applies distillation depending on the conditions of the training
patches. The experimental results show that the student EDSR and RCAN networks
with the proposed FACD scheme improve not only the PSNR performance across all
benchmark datasets and scales, but also the subjective image quality, compared
to conventional FD approaches.
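The two ingredients described in the abstract, a feature-domain contrastive loss and adaptive (selective) distillation, can be sketched as follows. This is a minimal illustration in the spirit of the abstract, not the paper's exact loss: each student patch feature is pulled toward the teacher feature of the same patch and pushed away from the teacher features of the other patches in the batch, and a mask distills only where the teacher reconstructs a patch better than the student. The function names, the temperature `tau`, and the masking criterion are all illustrative assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_fd_loss(student_feats, teacher_feats, tau=0.1):
    """InfoNCE-style feature-domain loss: for each patch i, the positive is
    the teacher feature of the same patch, and the teacher features of all
    other patches in the batch serve as negatives."""
    loss, n = 0.0, len(student_feats)
    for i in range(n):
        logits = [cosine(student_feats[i], teacher_feats[j]) / tau
                  for j in range(n)]
        m = max(logits)  # stabilized log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += -(logits[i] - log_denom)
    return loss / n

def adaptive_mask(teacher_errors, student_errors):
    """Distill only on patches where the teacher reconstructs better than
    the student -- one plausible 'condition of the training patches'; the
    paper's actual criterion may differ."""
    return [t < s for t, s in zip(teacher_errors, student_errors)]
```

When student and teacher features agree, the positive logit dominates and the loss approaches zero; mismatched features drive it up, which is the behavior a distillation signal needs.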
Related papers
- Relative Difficulty Distillation for Semantic Segmentation
We propose a pixel-level KD paradigm for semantic segmentation named Relative Difficulty Distillation (RDD).
RDD allows the teacher network to provide effective guidance on learning focus without additional optimization goals.
Our research showcases that RDD can integrate with existing KD methods to improve their upper performance bound.
arXiv Detail & Related papers (2024-07-04T08:08:25Z)
- Adaptive Teaching with Shared Classifier for Knowledge Distillation
Knowledge distillation (KD) is a technique used to transfer knowledge from a teacher network to a student network.
We propose adaptive teaching with a shared classifier (ATSC).
Our approach achieves state-of-the-art results on the CIFAR-100 and ImageNet datasets in both single-teacher and multi-teacher scenarios.
arXiv Detail & Related papers (2024-06-12T08:51:08Z)
- AICSD: Adaptive Inter-Class Similarity Distillation for Semantic Segmentation
This paper proposes a novel method called Inter-Class Similarity Distillation (ICSD) for the purpose of knowledge distillation.
The proposed method transfers high-order relations from the teacher network to the student network by independently computing intra-class distributions for each class from network outputs.
Experiments conducted on two well-known datasets for semantic segmentation, Cityscapes and Pascal VOC 2012, validate the effectiveness of the proposed method.
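The intra-class distribution idea in this summary can be sketched in a few lines: for each ground-truth class, average the per-pixel softmax outputs of a network, then match the student's per-class distributions to the teacher's with a KL divergence. This is one plausible, hypothetical reading of the summary; the paper's exact formulation (and the adaptive weighting implied by the AICSD title) may differ.

```python
import math

def intra_class_distributions(probs, labels, num_classes):
    """Average the per-pixel softmax outputs over all pixels of each
    ground-truth class, yielding one distribution per class (None if
    a class has no pixels)."""
    sums = [[0.0] * num_classes for _ in range(num_classes)]
    counts = [0] * num_classes
    for p, y in zip(probs, labels):
        counts[y] += 1
        for k in range(num_classes):
            sums[y][k] += p[k]
    return [[s / counts[c] for s in sums[c]] if counts[c] else None
            for c in range(num_classes)]

def icsd_loss(student_probs, teacher_probs, labels, num_classes):
    """KL divergence between teacher and student intra-class distributions,
    averaged over the classes present in the batch."""
    sd = intra_class_distributions(student_probs, labels, num_classes)
    td = intra_class_distributions(teacher_probs, labels, num_classes)
    loss, n = 0.0, 0
    for s, t in zip(sd, td):
        if s is None or t is None:
            continue
        loss += sum(ti * math.log(ti / si) for ti, si in zip(t, s) if ti > 0)
        n += 1
    return loss / max(n, 1)
```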
arXiv Detail & Related papers (2023-08-08T13:17:20Z)
- BD-KD: Balancing the Divergences for Online Knowledge Distillation
We propose BD-KD: Balancing of Divergences for online Knowledge Distillation.
We show that adaptively balancing between the reverse and forward divergences shifts the focus of the training strategy to the compact student network.
We demonstrate that, by performing this balancing design at the level of the student distillation loss, we improve upon both performance accuracy and calibration of the compact student network.
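The balancing described here can be illustrated with a convex combination of the forward KL divergence (teacher to student) and the reverse KL divergence (student to teacher). BD-KD adapts this balance during training; the fixed weight `alpha` below is a simplification for illustration, and the function names are assumptions.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def bd_kd_loss(student_probs, teacher_probs, alpha=0.5):
    """Sketch of a balanced-divergence distillation loss: a weighted sum of
    the forward KL (teacher -> student) and reverse KL (student -> teacher).
    alpha = 1 recovers the standard forward-KL distillation objective."""
    forward = kl(teacher_probs, student_probs)
    reverse = kl(student_probs, teacher_probs)
    return alpha * forward + (1 - alpha) * reverse
```

At `alpha = 0.5` the loss is symmetric in student and teacher; shifting `alpha` shifts the training focus between mode-covering (forward) and mode-seeking (reverse) behavior of the compact student.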
arXiv Detail & Related papers (2022-12-25T22:27:32Z)
- Learning Knowledge Representation with Meta Knowledge Distillation for Single Image Super-Resolution
We propose a model-agnostic meta knowledge distillation method under the teacher-student architecture for the single image super-resolution task.
Experiments conducted on various single image super-resolution datasets demonstrate that our proposed method outperforms existing defined knowledge representation related distillation methods.
arXiv Detail & Related papers (2022-07-18T02:41:04Z)
- Parameter-Efficient and Student-Friendly Knowledge Distillation
We present a parameter-efficient and student-friendly knowledge distillation method, namely PESF-KD, to achieve efficient and sufficient knowledge transfer.
Experiments on a variety of benchmarks show that PESF-KD can significantly reduce the training cost while obtaining competitive results compared to advanced online distillation methods.
arXiv Detail & Related papers (2022-05-28T16:11:49Z)
- Impact of a DCT-driven Loss in Attention-based Knowledge-Distillation for Scene Recognition
We propose and analyse the use of a 2D frequency transform of the activation maps before transferring them.
This strategy enhances knowledge transferability in tasks such as scene recognition.
We publicly release the training and evaluation framework used along this paper at http://www.vpu.eps.uam.es/publications/DCTBasedKDForSceneRecognition.
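The frequency-domain transfer this entry describes can be sketched by taking a 2D DCT-II of each activation map and penalizing the coefficient difference between teacher and student. The names `dct2` and `dct_kd_loss` are illustrative, and the naive O(n^4) transform is written for clarity, not speed; a real implementation would use a fast DCT.

```python
import math

def dct2(x):
    """Naive 2D DCT-II of a square activation map (list of lists),
    without normalization factors."""
    n = len(x)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for i in range(n):
                for j in range(n):
                    s += (x[i][j]
                          * math.cos(math.pi * (2 * i + 1) * u / (2 * n))
                          * math.cos(math.pi * (2 * j + 1) * v / (2 * n)))
            out[u][v] = s
    return out

def dct_kd_loss(student_map, teacher_map):
    """Mean squared error between the DCT coefficients of the student's
    and teacher's activation maps."""
    ds, dt = dct2(student_map), dct2(teacher_map)
    n = len(ds)
    return sum((ds[i][j] - dt[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)
```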
arXiv Detail & Related papers (2022-05-04T11:05:18Z)
- Local-Selective Feature Distillation for Single Image Super-Resolution
We propose a novel feature distillation (FD) method which is suitable for single image super-resolution (SISR)
We show the limitations of the existing FitNet-based FD method that it suffers in the SISR task, and propose to modify the existing FD algorithm to focus on local feature information.
We call our method local-selective feature distillation (LSFD) and verify that our method outperforms conventional FD methods in SISR problems.
arXiv Detail & Related papers (2021-11-22T05:05:37Z)
- Knowledge Distillation By Sparse Representation Matching
We propose Sparse Representation Matching (SRM) to transfer intermediate knowledge from one Convolutional Network (CNN) to another by utilizing sparse representation.
We formulate SRM as a neural processing block, which can be efficiently optimized using gradient descent and integrated into any CNN in a plug-and-play manner.
Our experiments demonstrate that SRM is robust to architectural differences between the teacher and student networks, and outperforms other KD techniques across several datasets.
arXiv Detail & Related papers (2021-03-31T11:47:47Z)
- Spirit Distillation: Precise Real-time Prediction with Insufficient Data
We propose a new training framework named Spirit Distillation (SD).
It extends the ideas of fine-tuning-based transfer learning (FTT) and feature-based knowledge distillation.
Results demonstrate boosts in segmentation performance (mIOU) and high-precision accuracy of 1.4% and 8.2%, respectively.
arXiv Detail & Related papers (2021-03-25T10:23:30Z)
- Deep Adaptive Inference Networks for Single Image Super-Resolution
Single image super-resolution (SISR) has witnessed tremendous progress in recent years owing to the deployment of deep convolutional neural networks (CNNs).
In this paper, we take a step forward to address this issue by leveraging the adaptive inference networks for deep SISR (AdaDSR)
Our AdaDSR involves an SISR model as backbone and a lightweight adapter module which takes image features and resource constraint as input and predicts a map of local network depth.
arXiv Detail & Related papers (2020-04-08T10:08:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.