Unconstrained Text Detection in Manga
- URL: http://arxiv.org/abs/2010.03997v1
- Date: Wed, 7 Oct 2020 13:28:13 GMT
- Title: Unconstrained Text Detection in Manga
- Authors: Julián Del Gobbo, Rosana Matuk Herrera
- Abstract summary: This work aims to identify text characters at a pixel level in a comic genre with highly sophisticated text styles: Japanese manga.
Most of the literature in text detection uses bounding box metrics, which are unsuitable for pixel-level evaluation.
Using these resources, we designed and evaluated a deep network model, outperforming current methods for text detection in manga in most metrics.
- Score: 3.04585143845864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The detection and recognition of unconstrained text is an open problem in
research. Text in comic books has unusual styles that raise many challenges for
text detection. This work aims to identify text characters at a pixel level in
a comic genre with highly sophisticated text styles: Japanese manga. To
overcome the lack of a manga dataset with individual character level
annotations, we create our own. Most of the literature in text detection uses
bounding box metrics, which are unsuitable for pixel-level evaluation. Thus, we
implemented special metrics to evaluate performance. Using these resources, we
designed and evaluated a deep network model, outperforming current methods for
text detection in manga in most metrics.
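The abstract contrasts bounding box metrics with pixel-level evaluation but does not spell out the metrics the authors implemented. As a minimal sketch, assuming plain pixel-wise precision, recall and F1 on binary masks (not necessarily the authors' metrics), the example below shows why a box that fully covers a thin stroke can still score poorly at the pixel level. The function names `pixel_prf` and `bounding_box_mask` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = text pixel, 0 = background


def pixel_prf(pred: Mask, truth: Mask) -> Tuple[float, float, float]:
    """Pixel-wise precision, recall and F1 between two binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # text pixel correctly predicted
            elif p and not t:
                fp += 1          # background predicted as text
            elif t and not p:
                fn += 1          # text pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def bounding_box_mask(mask: Mask) -> Mask:
    """Fill the tightest axis-aligned box around all text pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return [[0] * len(row) for row in mask]
    top, bottom = min(rows), max(rows)
    left, right = min(cols), max(cols)
    return [
        [1 if top <= r <= bottom and left <= c <= right else 0
         for c in range(len(row))]
        for r, row in enumerate(mask)
    ]


# Hypothetical ground truth: a thin diagonal stroke in a 4x4 image.
truth = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

# A box-style prediction covers the stroke completely...
box_pred = bounding_box_mask(truth)

# ...but most of the pixels it marks are background.
precision, recall, f1 = pixel_prf(box_pred, truth)
print(precision, recall, f1)
```

In this toy case the box contains every text pixel, so recall is 1.0, but only 4 of its 16 pixels are text, so pixel precision is 0.25 and F1 is 0.4. A box-overlap metric would rate the same prediction as a perfect match, which illustrates the mismatch the abstract points to.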
Related papers
- KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark [1.5409800688911346]
We introduce the first Khmer scene-text dataset, featuring 1,544 expert-annotated images.
This diverse dataset includes flat text, raised text, poorly illuminated text, distant polygon and partially obscured text.
arXiv Detail & Related papers (2024-10-23T21:04:24Z)
- The Manga Whisperer: Automatically Generating Transcriptions for Comics [55.544015596503726]
We present a unified model, Magi, that is able to detect panels, text boxes and character boxes.
We propose a novel approach that is able to sort the detected text boxes in their reading order and generate a dialogue transcript.
arXiv Detail & Related papers (2024-01-18T18:59:09Z)
- Enhancing Scene Text Detectors with Realistic Text Image Synthesis Using Diffusion Models [63.99110667987318]
We present DiffText, a pipeline that seamlessly blends foreground text with the background's intrinsic features.
With fewer text instances, our produced text images consistently surpass other synthetic data in aiding text detectors.
arXiv Detail & Related papers (2023-11-28T06:51:28Z)
- SpaText: Spatio-Textual Representation for Controllable Image Generation [61.89548017729586]
SpaText is a new method for text-to-image generation using open-vocabulary scene control.
In addition to a global text prompt that describes the entire scene, the user provides a segmentation map.
We show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-conditional-based.
arXiv Detail & Related papers (2022-11-25T18:59:10Z)
- Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition [63.6608759501803]
We propose to recognize artistic text at three levels.
Firstly, corner points are applied to guide the extraction of local features inside characters, considering the robustness of corner structures to appearance and shape.
Secondly, we design a character contrastive loss to model the character-level feature, improving the feature representation for character classification.
Thirdly, we utilize Transformer to learn the global feature on image-level and model the global relationship of the corner points.
arXiv Detail & Related papers (2022-07-31T14:11:05Z)
- Detection of Furigana Text in Images [1.77898701462905]
Furigana are pronunciation notes used in Japanese writing.
Being able to detect these can help improve optical character recognition (OCR) performance.
This project focuses on detecting furigana in Japanese books and comics.
arXiv Detail & Related papers (2022-07-08T15:27:19Z)
- Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution [31.88960656995447]
We propose a Stroke-Aware Scene Text Image Super-Resolution method containing a Stroke-Focused Module (SFM) to concentrate on stroke-level internal structures of characters in text images.
Specifically, we attempt to design rules for decomposing English characters and digits at stroke-level, then pre-train a text recognizer to provide stroke-level attention maps as positional clues.
The proposed method can indeed generate more distinguishable images on TextZoom and on the manually constructed Chinese character dataset Degraded-IC13.
arXiv Detail & Related papers (2021-12-13T15:26:10Z)
- Scene Text Detection with Scribble Lines [59.698806258671105]
We propose to annotate texts by scribble lines instead of polygons for text detection.
It is a general labeling method for texts with various shapes and requires low labeling costs.
Experiments show that the proposed method bridges the performance gap between the weak labeling method and the original polygon-based labeling methods.
arXiv Detail & Related papers (2020-12-09T13:14:53Z)
- Unconstrained Text Detection in Manga: a New Dataset and Baseline [3.04585143845864]
This work aims to binarize text in a comic genre with highly sophisticated text styles: Japanese manga.
To overcome the lack of a manga dataset with text annotations at a pixel level, we create our own.
Using these resources, we designed and evaluated a deep network model, outperforming current methods for text binarization in manga in most metrics.
arXiv Detail & Related papers (2020-09-09T00:16:51Z)
- AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting [98.08853679310603]
This work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter).
AE TextSpotter learns both visual and linguistic features to significantly reduce ambiguity in text detection.
To our knowledge, this is the first work to improve text detection by using a language model.
arXiv Detail & Related papers (2020-08-03T08:40:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.