Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text
Spotting
- URL: http://arxiv.org/abs/2207.06694v2
- Date: Fri, 15 Jul 2022 01:59:00 GMT
- Title: Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text
Spotting
- Authors: Ying Chen, Liang Qiao, Zhanzhan Cheng, Shiliang Pu, Yi Niu and Xi Li
- Abstract summary: We propose a novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting framework.
It aims to infer images in different small but recognizable resolutions and achieve a better balance between accuracy and efficiency.
The proposed method can be optimized end-to-end and adopted by any current text spotting framework to improve its practicality.
- Score: 49.33891486324731
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: End-to-end text spotting has attracted great attention recently due to its
benefits for global optimization and high maintainability in real applications.
However, the input scale has always been a tough trade-off since recognizing a
small text instance usually requires enlarging the whole image, which brings
high computational costs. In this paper, to address this problem, we propose a
novel cost-efficient Dynamic Low-resolution Distillation (DLD) text spotting
framework, which aims to infer images in different small but recognizable
resolutions and achieve a better balance between accuracy and efficiency.
Concretely, we adopt a resolution selector to dynamically decide the input
resolutions for different images, which is constrained by both inference
accuracy and computational cost. In addition, a sequential knowledge distillation
strategy is applied to the text recognition branch, enabling the low-resolution input
to achieve performance comparable to that of a high-resolution image. The proposed
method can be optimized end-to-end and adopted by any current text spotting framework
to improve its practicality. Extensive experiments on several text spotting
benchmarks show that the proposed method vastly improves the usability of
low-res models. The code is available at
https://github.com/hikopensource/DAVAR-Lab-OCR/.
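The abstract describes two components: a resolution selector that dynamically picks a small but recognizable input size under an accuracy/cost constraint, and a knowledge distillation step that aligns the low-resolution recognition branch with a high-resolution teacher. The sketch below illustrates these two ideas in PyTorch; the module names, candidate scales, Gumbel-softmax relaxation, and loss weights are assumptions for illustration only and are not taken from the released DAVAR-Lab-OCR code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical set of candidate down-sampling ratios; the paper's actual
# candidate resolutions may differ.
CANDIDATE_SCALES = [0.5, 0.75, 1.0]


class ResolutionSelector(nn.Module):
    """Predicts a distribution over candidate input resolutions from a small thumbnail."""

    def __init__(self, num_scales: int = len(CANDIDATE_SCALES)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, num_scales),
        )

    def forward(self, thumbnail: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        logits = self.net(thumbnail)
        # Soft (differentiable) choice during training, hard one-hot choice at inference.
        return F.gumbel_softmax(logits, tau=tau, hard=not self.training)


def recognition_distillation_loss(student_feats, teacher_feats,
                                  student_logits, teacher_logits, T: float = 2.0):
    """Feature + soft-label distillation from the high-res teacher to the low-res student."""
    feat_loss = F.mse_loss(student_feats, teacher_feats.detach())
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits.detach() / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return feat_loss + kd_loss


def cost_aware_selection_loss(scale_probs, flops_per_scale, task_loss, lam: float = 0.1):
    """Trades off the task loss against the expected compute of the chosen resolution."""
    expected_cost = (scale_probs * flops_per_scale).sum(dim=-1).mean()
    return task_loss + lam * expected_cost
```

At inference time the selector's hard choice determines which down-sampled version of the image is fed to the detector and recognizer, so per-image compute scales with the chosen resolution.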
Related papers
- DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution [19.33582308829547]
This paper proposes to leverage degradation-aligned language prompt for accurate, fine-grained, and high-fidelity image restoration.
The proposed method achieves a new state-of-the-art perceptual quality level.
arXiv Detail & Related papers (2024-06-24T09:30:36Z)
- Text-guided Explorable Image Super-resolution [14.83045604603449]
We propose two approaches for zero-shot text-guided super-resolution.
We show that the proposed approaches result in diverse solutions that match the semantic meaning provided by the text prompt.
arXiv Detail & Related papers (2024-03-02T08:10:54Z)
- One-stage Low-resolution Text Recognition with High-resolution Knowledge Transfer [53.02254290682613]
Current solutions for low-resolution text recognition typically rely on a two-stage pipeline.
We propose an efficient and effective knowledge distillation framework to achieve multi-level knowledge transfer.
Experiments show that the proposed one-stage pipeline significantly outperforms super-resolution based two-stage frameworks.
arXiv Detail & Related papers (2023-08-05T02:33:45Z)
- ESTISR: Adapting Efficient Scene Text Image Super-resolution for Real-Scenes [25.04435367653037]
Scene text image super-resolution (STISR) has yielded remarkable improvements in accurately recognizing scene text.
We propose a novel Efficient Scene Text Image Super-resolution (ESTISR) Network for resource-limited deployment platform.
ESTISR consistently outperforms current methods in terms of actual running time and peak memory consumption.
arXiv Detail & Related papers (2023-06-04T19:14:44Z)
- Rethinking Resolution in the Context of Efficient Video Recognition [49.957690643214576]
Cross-resolution KD (ResKD) is a simple but effective method to boost recognition accuracy on low-resolution frames.
We extensively demonstrate its effectiveness over state-of-the-art architectures, i.e., 3D-CNNs and Video Transformers.
arXiv Detail & Related papers (2022-09-26T15:50:44Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Single Image Internal Distribution Measurement Using Non-Local Variational Autoencoder [11.985083962982909]
This paper proposes a novel image-specific solution, namely a non-local variational autoencoder (NLVAE).
NLVAE is introduced as a self-supervised strategy that reconstructs high-resolution images using disentangled information from the non-local neighbourhood.
Experimental results on seven benchmark datasets demonstrate the effectiveness of the NLVAE model.
arXiv Detail & Related papers (2022-04-02T18:43:55Z)
- High Quality Segmentation for Ultra High-resolution Images [72.97958314291648]
We propose the Continuous Refinement Model for the ultra high-resolution segmentation refinement task.
Our proposed method is fast and effective on image segmentation refinement.
arXiv Detail & Related papers (2021-11-29T11:53:06Z)
- Scene Text Image Super-Resolution in the Wild [112.90416737357141]
Low-resolution text images are often seen in natural scenes such as documents captured by mobile phones.
Previous single image super-resolution (SISR) methods are trained on synthetic low-resolution images.
We propose a real scene text SR dataset, termed TextZoom.
It contains paired real low-resolution and high-resolution images captured by cameras with different focal lengths in the wild.
arXiv Detail & Related papers (2020-05-07T09:18:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.