Text Recognition in the Wild: A Survey
- URL: http://arxiv.org/abs/2005.03492v3
- Date: Thu, 3 Dec 2020 07:06:27 GMT
- Title: Text Recognition in the Wild: A Survey
- Authors: Xiaoxue Chen, Lianwen Jin, Yuanzhi Zhu, Canjie Luo, and Tianwei Wang
- Abstract summary: This literature review attempts to present the entire picture of the field of scene text recognition.
It provides a comprehensive reference for people entering this field, and could be helpful to inspire future research.
- Score: 33.22076515689926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The history of text can be traced back over thousands of years. Rich and
precise semantic information carried by text is important in a wide range of
vision-based application scenarios. Therefore, text recognition in natural
scenes has been an active research field in computer vision and pattern
recognition. In recent years, with the rise and development of deep learning,
numerous methods have shown promise in terms of innovation, practicality, and
efficiency. This paper aims to (1) summarize the fundamental problems and the
state-of-the-art associated with scene text recognition; (2) introduce new
insights and ideas; (3) provide a comprehensive review of publicly available
resources; (4) point out directions for future work. In summary, this
literature review attempts to present the entire picture of the field of scene
text recognition. It provides a comprehensive reference for people entering
this field, and could be helpful to inspire future research. Related resources
are available at our GitHub repository:
https://github.com/HCIILAB/Scene-Text-Recognition.
Related papers
- Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing [4.057550183467041]
The field of visual text processing has experienced a surge in research, driven by the advent of fundamental generative models.
We present a comprehensive, multi-perspective analysis of recent advancements in this field.
arXiv Detail & Related papers (2024-02-05T15:13:20Z)
- Automatic and Human-AI Interactive Text Generation [27.05024520190722]
This tutorial aims to provide an overview of the state-of-the-art natural language generation research.
Text-to-text generation tasks are more constrained in terms of semantic consistency and targeted language styles.
arXiv Detail & Related papers (2023-10-05T20:26:15Z)
- Deep Learning for Visual Speech Analysis: A Survey [54.53032361204449]
This paper presents a review of recent progress in deep learning methods on visual speech analysis.
We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance.
arXiv Detail & Related papers (2022-05-22T14:44:53Z)
- From Show to Tell: A Survey on Image Captioning [48.98681267347662]
Connecting Vision and Language plays an essential role in Generative Intelligence.
Research in image captioning has not reached a conclusive answer yet.
This work aims at providing a comprehensive overview and categorization of image captioning approaches.
arXiv Detail & Related papers (2021-07-14T18:00:54Z)
- Deep Learning for Text Style Transfer: A Survey [71.8870854396927]
Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text.
We present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017.
We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data.
arXiv Detail & Related papers (2020-11-01T04:04:43Z)
- Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, translation etc.
We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, decoding and the key challenges outstanding in the field in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)
- A Survey of Knowledge-Enhanced Text Generation [81.24633231919137]
The goal of text generation is to enable machines to express themselves in human language.
Various neural encoder-decoder models have been proposed to achieve the goal by learning to map input text to output text.
To address this issue, researchers have considered incorporating various forms of knowledge beyond the input text into the generation models.
arXiv Detail & Related papers (2020-10-09T06:46:46Z)
- Deep learning for scene recognition from visual data: a survey [2.580765958706854]
This work aims to be a review of the state-of-the-art in scene recognition with deep learning models from visual data.
Scene recognition is still an emerging field in computer vision, which has been addressed from a single image and dynamic image perspective.
arXiv Detail & Related papers (2020-07-03T16:53:18Z)
- On Vocabulary Reliance in Scene Text Recognition [79.21737876442253]
Methods perform well on images with words within vocabulary but generalize poorly to images with words outside vocabulary.
We call this phenomenon "vocabulary reliance".
We propose a simple yet effective mutual learning strategy to allow models of two families to learn collaboratively.
arXiv Detail & Related papers (2020-05-08T11:16:58Z)