Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution
- URL: http://arxiv.org/abs/2404.02573v1
- Date: Wed, 3 Apr 2024 08:47:40 GMT
- Title: Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution
- Authors: Simiao Li, Yun Zhang, Wei Li, Hanting Chen, Wenjia Wang, Bingyi Jing, Shaohui Lin, Jie Hu,
- Abstract summary: This work presents MiPKD, a multi-granularity mixture-of-priors knowledge distillation (KD) framework that facilitates efficient image super-resolution models.
Experiments demonstrate the effectiveness of the proposed MiPKD method.
- Score: 25.558550480342614
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Knowledge distillation (KD) is a promising yet challenging model compression technique that transfers rich learning representations from a well-performing but cumbersome teacher model to a compact student model. Previous methods for image super-resolution (SR) mostly compare the feature maps directly or after standardizing the dimensions with basic algebraic operations (e.g. average, dot-product). However, the intrinsic semantic differences among feature maps are overlooked, which are caused by the disparate expressive capacity between the networks. This work presents MiPKD, a multi-granularity mixture-of-priors KD framework, to facilitate efficient SR models through feature mixture in a unified latent space and stochastic network block mixture. Extensive experiments demonstrate the effectiveness of the proposed MiPKD method.
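The abstract's baseline, comparing feature maps after standardizing their dimensions, can be illustrated with a minimal sketch. This is not the authors' MiPKD implementation; the linear projection `proj` and the function name are hypothetical, standing in for whatever alignment operation a concrete KD method would learn.

```python
import numpy as np

def feature_distillation_loss(student_feat, teacher_feat, proj):
    """MSE between teacher features and linearly projected student features.

    student_feat: (N, Cs) array of student feature vectors.
    teacher_feat: (N, Ct) array of teacher feature vectors.
    proj: (Cs, Ct) projection that standardizes the channel dimension.
    """
    aligned = student_feat @ proj  # map student channels into teacher space
    return float(np.mean((aligned - teacher_feat) ** 2))

rng = np.random.default_rng(0)
s = rng.standard_normal((4, 8))    # student features: 4 positions, 8 channels
t = rng.standard_normal((4, 16))   # teacher features: 4 positions, 16 channels
p = rng.standard_normal((8, 16))   # projection (learned in practice; fixed here)
loss = feature_distillation_loss(s, t, p)
```

MiPKD's point is that such direct comparison ignores the semantic gap between networks of different capacity; the sketch only shows the baseline it improves upon.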
Related papers
- Is Contrastive Distillation Enough for Learning Comprehensive 3D Representations? [55.99654128127689]
Cross-modal contrastive distillation has recently been explored for learning effective 3D representations.
Existing methods focus primarily on modality-shared features, neglecting the modality-specific features during the pre-training process.
We propose a new framework, namely CMCR, to address these shortcomings.
arXiv Detail & Related papers (2024-12-12T06:09:49Z) - Active Data Curation Effectively Distills Large-Scale Multimodal Models [66.23057263509027]
Knowledge distillation (KD) is the de facto standard for compressing large-scale models into smaller ones.
In this work we explore an alternative, yet simple approach -- active data curation as effective distillation for contrastive multimodal pretraining.
Our simple online batch selection method, ACID, outperforms strong KD baselines across various model-, data- and compute-configurations.
arXiv Detail & Related papers (2024-11-27T18:50:15Z) - Learning Diffusion Model from Noisy Measurement using Principled Expectation-Maximization Method [9.173055778539641]
We propose a principled expectation-maximization (EM) framework that iteratively learns diffusion models from noisy data with arbitrary corruption types.
Our framework employs a plug-and-play Monte Carlo method to accurately estimate clean images from noisy measurements, followed by training the diffusion model using the reconstructed images.
arXiv Detail & Related papers (2024-10-15T03:54:59Z) - One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z) - MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution [6.983043882738687]
We propose a novel Multi-Teacher Knowledge Distillation (MTKD) framework specifically for image super-resolution.
It exploits the advantages of multiple teachers by combining and enhancing the outputs of these teacher models.
We fully evaluate the effectiveness of the proposed method by comparing it to five commonly used KD methods for image super-resolution.
arXiv Detail & Related papers (2024-04-15T08:32:41Z) - Deep Unfolding Convolutional Dictionary Model for Multi-Contrast MRI Super-resolution and Reconstruction [23.779641808300596]
We propose a multi-contrast convolutional dictionary (MC-CDic) model under the guidance of the optimization algorithm.
We employ the proximal gradient algorithm to optimize the model and unroll the iterative steps into a deep CDic model.
Experimental results demonstrate the superior performance of the proposed MC-CDic model against existing SOTA methods.
arXiv Detail & Related papers (2023-09-03T13:18:59Z) - Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z) - Dense Depth Distillation with Out-of-Distribution Simulated Images [30.79756881887895]
We study data-free knowledge distillation (KD) for monocular depth estimation (MDE).
KD learns a lightweight model for real-world depth perception by compressing knowledge from a trained teacher model when training data in the target domain is unavailable.
We show that our method outperforms the baseline KD by a good margin, and achieves slightly better performance even with as few as 1/6 of the training images.
arXiv Detail & Related papers (2022-08-26T07:10:01Z) - Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z) - Collaborative Distillation for Ultra-Resolution Universal Style Transfer [71.18194557949634]
We present a new knowledge distillation method (named Collaborative Distillation) for encoder-decoder based neural style transfer.
We achieve ultra-resolution (over 40 megapixels) universal style transfer on a 12GB GPU for the first time.
arXiv Detail & Related papers (2020-03-18T18:59:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.