Related papers: Text Recognition in the Wild: A Survey

Text Recognition in the Wild: A Survey

URL: http://arxiv.org/abs/2005.03492v3
Date: Thu, 3 Dec 2020 07:06:27 GMT
Title: Text Recognition in the Wild: A Survey
Authors: Xiaoxue Chen, Lianwen Jin, Yuanzhi Zhu, Canjie Luo, and Tianwei Wang
Abstract summary: This literature review attempts to present the entire picture of the field of scene text recognition. It provides a comprehensive reference for people entering this field, and could be helpful to inspire future research.
Score: 33.22076515689926
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The history of text can be traced back over thousands of years. Rich and precise semantic information carried by text is important in a wide range of vision-based application scenarios. Therefore, text recognition in natural scenes has been an active research field in computer vision and pattern recognition. In recent years, with the rise and development of deep learning, numerous methods have shown promising in terms of innovation, practicality, and efficiency. This paper aims to (1) summarize the fundamental problems and the state-of-the-art associated with scene text recognition; (2) introduce new insights and ideas; (3) provide a comprehensive review of publicly available resources; (4) point out directions for future work. In summary, this literature review attempts to present the entire picture of the field of scene text recognition. It provides a comprehensive reference for people entering this field, and could be helpful to inspire future research. Related resources are available at our Github repository: https://github.com/HCIILAB/Scene-Text-Recognition.

Related papers

Visual Text Processing: A Comprehensive Review and Unified Evaluation [99.57846940547171]
We present a comprehensive, multi-perspective analysis of recent advancements in visual text processing. Our aim is to establish this work as a fundamental resource that fosters future exploration and innovation in the dynamic field of visual text processing.
arXiv Detail & Related papers (2025-04-30T14:19:29Z)
Towards Visual Grounding: A Survey [87.37662490666098]
Since 2021, visual grounding has witnessed significant advancements, with emerging new concepts such as grounded pre-training. This survey is designed to be suitable for both beginners and experienced researchers, serving as an invaluable resource for understanding key concepts and tracking the latest research developments.
arXiv Detail & Related papers (2024-12-28T16:34:35Z)
Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing [4.057550183467041]
The field of visual text processing has experienced a surge in research, driven by the advent of fundamental generative models. We present a comprehensive, multi-perspective analysis of recent advancements in this field.
arXiv Detail & Related papers (2024-02-05T15:13:20Z)
Automatic and Human-AI Interactive Text Generation [27.05024520190722]
This tutorial aims to provide an overview of the state-of-the-art natural language generation research. Text-to-text generation tasks are more constrained in terms of semantic consistency and targeted language styles.
arXiv Detail & Related papers (2023-10-05T20:26:15Z)
Deep Learning for Visual Speech Analysis: A Survey [54.53032361204449]
This paper presents a review of recent progress in deep learning methods on visual speech analysis. We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance.
arXiv Detail & Related papers (2022-05-22T14:44:53Z)
From Show to Tell: A Survey on Image Captioning [48.98681267347662]
Connecting Vision and Language plays an essential role in Generative Intelligence. Research in image captioning has not reached a conclusive answer yet. This work aims at providing a comprehensive overview and categorization of image captioning approaches.
arXiv Detail & Related papers (2021-07-14T18:00:54Z)
Deep Learning for Text Style Transfer: A Survey [71.8870854396927]
Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text. We present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017. We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data.
arXiv Detail & Related papers (2020-11-01T04:04:43Z)
Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, translation etc. We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, decoding and the key challenges outstanding in the field in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)
A Survey of Knowledge-Enhanced Text Generation [81.24633231919137]
The goal of text generation is to make machines express in human language. Various neural encoder-decoder models have been proposed to achieve the goal by learning to map input text to output text. To address this issue, researchers have considered incorporating various forms of knowledge beyond the input text into the generation models.
arXiv Detail & Related papers (2020-10-09T06:46:46Z)
Deep learning for scene recognition from visual data: a survey [2.580765958706854]
This work aims to be a review of the state-of-the-art in scene recognition with deep learning models from visual data. Scene recognition is still an emerging field in computer vision, which has been addressed from a single image and dynamic image perspective.
arXiv Detail & Related papers (2020-07-03T16:53:18Z)
On Vocabulary Reliance in Scene Text Recognition [79.21737876442253]
Methods perform well on images with words within vocabulary but generalize poorly to images with words outside vocabulary. We call this phenomenon "vocabulary reliance" We propose a simple yet effective mutual learning strategy to allow models of two families to learn collaboratively.
arXiv Detail & Related papers (2020-05-08T11:16:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.