Text Detection and Recognition in the Wild: A Review
- URL: http://arxiv.org/abs/2006.04305v2
- Date: Tue, 30 Jun 2020 22:23:08 GMT
- Title: Text Detection and Recognition in the Wild: A Review
- Authors: Zobeir Raisi, Mohamed A. Naiel, Paul Fieguth, Steven Wardell, and John Zelek
- Abstract summary: State-of-the-art scene text detection and/or recognition methods have exploited advances in deep learning architectures.
The paper presents a review of recent advances in scene text detection and recognition.
It also identifies several existing challenges for detecting or recognizing text in wild images.
- Score: 7.43788469020627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detection and recognition of text in natural images are two main problems in
the field of computer vision that have a wide variety of applications in
the analysis of sports videos, autonomous driving, and industrial automation, to
name a few. Both face common challenges that stem from how text is represented
and how it is affected by environmental conditions. Current state-of-the-art
scene text detection and/or recognition methods have exploited recent advances
in deep learning architectures and reported superior accuracy on benchmark
datasets when tackling multi-resolution and multi-oriented text. However,
several challenges affecting text in wild images remain, causing existing
methods to underperform because their models cannot generalize to unseen data
and labeled data are insufficient. Thus, unlike previous surveys in this field, the
objectives of this survey are as follows: first, offering the reader not only a
review of recent advances in scene text detection and recognition, but also
the results of extensive experiments using a unified evaluation framework that
assesses pre-trained models of the selected methods on challenging cases and
applies the same evaluation criteria to all of these techniques. Second,
identifying several existing challenges for detecting or recognizing text in
wild images, namely in-plane rotation, multi-oriented and multi-resolution
text, perspective distortion, illumination reflection, partial occlusion,
complex fonts, and special characters. Finally, the paper also presents insight
into potential research directions in this field to address some of the
mentioned challenges that scene text detection and recognition techniques
still encounter.
Related papers
- MOoSE: Multi-Orientation Sharing Experts for Open-set Scene Text Recognition [3.6227230205444902]
Open-set text recognition aims to address both novel characters and previously seen ones.
We first propose a Multi-Oriented Open-Set Text Recognition task (MOOSTR) to model the challenges of both novel characters and writing direction variety.
We then propose a Multi-Orientation Sharing Experts (MOoSE) framework as a strong baseline solution.
arXiv Detail & Related papers (2024-07-26T09:20:29Z)
- Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing [4.057550183467041]
The field of visual text processing has experienced a surge in research, driven by the advent of fundamental generative models.
We present a comprehensive, multi-perspective analysis of recent advancements in this field.
arXiv Detail & Related papers (2024-02-05T15:13:20Z)
- Assaying on the Robustness of Zero-Shot Machine-Generated Text Detectors [57.7003399760813]
We explore advanced Large Language Models (LLMs) and their specialized variants, contributing to this field in several ways.
We uncover a significant correlation between topics and detection performance.
These investigations shed light on the adaptability and robustness of these detection methods across diverse topics.
arXiv Detail & Related papers (2023-12-20T10:53:53Z)
- Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy [52.765898203824975]
We introduce a semantic-aware watermarking algorithm that considers the characteristics of conditional text generation and the input context.
Experimental results demonstrate that our proposed method yields substantial improvements across various text generation models.
arXiv Detail & Related papers (2023-07-25T20:24:22Z)
- MAGE: Machine-generated Text Detection in the Wild [82.70561073277801]
Large language models (LLMs) have achieved human-level text generation, emphasizing the need for effective AI-generated text detection.
We build a comprehensive testbed by gathering texts from diverse human writings and texts generated by different LLMs.
Despite challenges, the top-performing detector can identify 86.54% of out-of-domain texts generated by a new LLM, indicating the feasibility for application scenarios.
arXiv Detail & Related papers (2023-05-22T17:13:29Z)
- On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors, including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
- Deep Learning Approaches on Image Captioning: A Review [0.5852077003870417]
Image captioning aims to generate natural language descriptions for visual content in the form of still images.
Deep learning and vision-language pre-training techniques have revolutionized the field, leading to more sophisticated methods and improved performance.
We address the challenges faced in this field by emphasizing issues such as object hallucination, missing context, illumination conditions, contextual understanding, and referring expressions.
We identify several potential future directions for research in this area, which include tackling the information misalignment problem between image and text modalities, mitigating dataset bias, incorporating vision-language pre-training methods to enhance caption generation, and developing improved evaluation tools to accurately
arXiv Detail & Related papers (2022-01-31T00:39:37Z)
- Text-Aware Single Image Specular Highlight Removal [14.624958411229862]
Existing methods typically remove specular highlights from medical images and specific-object images; however, they cannot handle images with text.
In this paper, we first raise and study the text-aware single image specular highlight removal problem.
The core goal is to improve the accuracy of text detection and recognition by removing the highlight from text images.
arXiv Detail & Related papers (2021-08-16T03:51:53Z)
- MOST: A Multi-Oriented Scene Text Detector with Localization Refinement [67.35280008722255]
We propose a new algorithm for scene text detection, which puts forward a set of strategies to significantly improve the quality of text localization.
Specifically, a Text Feature Alignment Module (TFAM) is proposed to dynamically adjust the receptive fields of features.
A Position-Aware Non-Maximum Suppression (PA-NMS) module is devised to exclude unreliable detections.
arXiv Detail & Related papers (2021-04-02T14:34:41Z)
- Text Recognition in Real Scenarios with a Few Labeled Samples [55.07859517380136]
Scene text recognition (STR) is still a hot research topic in the computer vision field.
This paper proposes a few-shot adversarial sequence domain adaptation (FASDA) approach to build sequence adaptation.
Our approach can maximize the character-level confusion between the source domain and the target domain.
arXiv Detail & Related papers (2020-06-22T13:03:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.