Related papers: Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition

URL: http://arxiv.org/abs/2112.09532v2
Date: Wed, 10 Jul 2024 12:49:41 GMT
Title: Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition
Authors: Guangyu Guo, Dingwen Zhang, Longfei Han, Nian Liu, Ming-Ming Cheng, Junwei Han
Abstract summary: We propose Pixel Distillation that extends knowledge distillation into the input level while simultaneously breaking architecture constraints. Such a scheme can achieve flexible cost control for deployment, as it allows the system to adjust both network architecture and image quality according to the overall requirement of resources.
Score: 124.80263629921498
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Previous knowledge distillation (KD) methods mostly focus on compressing network architectures, which is not thorough enough in deployment as some costs like transmission bandwidth and imaging equipment are related to the image size. Therefore, we propose Pixel Distillation that extends knowledge distillation into the input level while simultaneously breaking architecture constraints. Such a scheme can achieve flexible cost control for deployment, as it allows the system to adjust both network architecture and image quality according to the overall requirement of resources. Specifically, we first propose an input spatial representation distillation (ISRD) mechanism to transfer spatial knowledge from large images to student's input module, which can facilitate stable knowledge transfer between CNN and ViT. Then, a Teacher-Assistant-Student (TAS) framework is further established to disentangle pixel distillation into the model compression stage and input compression stage, which significantly reduces the overall complexity of pixel distillation and the difficulty of distilling intermediate knowledge. Finally, we adapt pixel distillation to object detection via an aligned feature for preservation (AFP) strategy for TAS, which aligns output dimensions of detectors at each stage by manipulating features and anchors of the assistant. Comprehensive experiments on image classification and object detection demonstrate the effectiveness of our method. Code is available at https://github.com/gyguo/PixelDistillation.
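As a rough illustration of the setting the abstract describes (a teacher that sees the full-resolution image and a student that sees a smaller one), the sketch below uses the generic temperature-softened distillation loss, not the paper's ISRD, TAS, or AFP components. The function names, the average-pooling downsampler, and the logit values are all hypothetical and chosen only for illustration; the authors' actual implementation is in the repository linked above.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is the generic soft-label distillation loss, used here as a
    stand-in; it is not the loss defined in the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return temperature ** 2 * kl


def downsample(image, factor):
    """Average-pool a 2D image (list of rows) by an integer factor.

    Hypothetical stand-in for producing the student's low-resolution input.
    """
    height, width = len(image), len(image[0])
    pooled = []
    for r in range(0, height - height % factor, factor):
        row = []
        for c in range(0, width - width % factor, factor):
            block = [image[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


if __name__ == "__main__":
    high_res = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
    low_res = downsample(high_res, 2)  # what the student would receive

    # Made-up logits standing in for teacher(high_res) and student(low_res).
    teacher_logits = [2.0, 0.5, -1.0]
    student_logits = [1.2, 0.8, -0.5]
    print("student input:", low_res)
    print("distillation loss:", kd_loss(student_logits, teacher_logits))
```

In this sketch the only link between the two resolutions is the output-level loss; the abstract indicates that the paper additionally transfers spatial knowledge into the student's input module and inserts an assistant model between teacher and student.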

Related papers

Soft Knowledge Distillation with Multi-Dimensional Cross-Net Attention for Image Restoration Models Compression [0.0]
Transformer-based encoder-decoder models have achieved remarkable success in image-to-image transfer tasks. However, their high computational complexity, manifested in elevated FLOPs and parameter counts, limits their application in real-world scenarios. We propose a Soft Knowledge Distillation (SKD) strategy that incorporates a Multi-dimensional Cross-net Attention (MCA) mechanism for compressing image restoration models.
arXiv Detail & Related papers (2025-01-16T06:25:56Z)
One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts. Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation. We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z)
Resource Efficient Perception for Vision Systems [0.0]
Our study introduces a framework aimed at mitigating these challenges by leveraging memory efficient patch based processing for high resolution images. It incorporates a global context representation alongside local patch information, enabling a comprehensive understanding of the image content. We demonstrate the effectiveness of our method through superior performance on 7 different benchmarks across classification, object detection, and segmentation.
arXiv Detail & Related papers (2024-05-12T05:33:00Z)
EPNet: An Efficient Pyramid Network for Enhanced Single-Image Super-Resolution with Reduced Computational Requirements [12.439807086123983]
Single-image super-resolution (SISR) has seen significant advancements through the integration of deep learning. This paper introduces a new Efficient Pyramid Network (EPNet) that harmoniously merges an Edge Split Pyramid Module (ESPM) with a Panoramic Feature Extraction Module (PFEM) to overcome the limitations of existing methods.
arXiv Detail & Related papers (2023-12-20T19:56:53Z)
Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers [3.8831062015253055]
We introduce a Single-Image Super-Resolution (SISR) approach to enhance the detection of structural and textural features in surveillance images. Our approach incorporates sub-pixel convolution layers and a loss function that uses an Optical Character Recognition (OCR) model for feature extraction. Our results show that our approach for reconstructing these low-resolution synthesized images outperforms existing ones in both quantitative and qualitative measures.
arXiv Detail & Related papers (2023-05-27T00:17:19Z)
Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images [64.84260544255477]
Transformer-based models have demonstrated outstanding potential for learning high-order contextual relationships from natural images with general resolution (224x224 pixels). We propose a complex self-attention (CSA) mechanism to model the high-order contextual information with less than half the computation of naive SA. By stacking various layers of CSA blocks, we propose the Fourier Complex Transformer (FCT) model to learn global contextual information from VHR aerial images.
arXiv Detail & Related papers (2022-10-28T08:13:33Z)
Rich Feature Distillation with Feature Affinity Module for Efficient Image Dehazing [1.1470070927586016]
This work introduces a simple, lightweight, and efficient framework for single-image haze removal. We exploit rich "dark-knowledge" information from a lightweight pre-trained super-resolution model via the notion of heterogeneous knowledge distillation. Our experiments are carried out on the RESIDE-Standard dataset to demonstrate the robustness of our framework to the synthetic and real-world domains.
arXiv Detail & Related papers (2022-07-13T18:32:44Z)
Rank-Enhanced Low-Dimensional Convolution Set for Hyperspectral Image Denoising [50.039949798156826]
This paper tackles the challenging problem of hyperspectral (HS) image denoising. We propose a rank-enhanced low-dimensional convolution set (Re-ConvSet). We then incorporate Re-ConvSet into the widely used U-Net architecture to construct an HS image denoising method.
arXiv Detail & Related papers (2022-07-09T13:35:12Z)
Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations. We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.