Context-Free TextSpotter for Real-Time and Mobile End-to-End Text
Detection and Recognition
- URL: http://arxiv.org/abs/2106.05611v1
- Date: Thu, 10 Jun 2021 09:32:52 GMT
- Title: Context-Free TextSpotter for Real-Time and Mobile End-to-End Text
Detection and Recognition
- Authors: Ryota Yoshihashi, Tomohiro Tanaka, Kenji Doi, Takumi Fujino, and
Naoaki Yamashita
- Abstract summary: We propose a text-spotting method that consists of simple convolutions and a few post-processes, named Context-Free TextSpotter.
Experiments using standard benchmarks show that Context-Free TextSpotter achieves real-time text spotting on a GPU with only three million parameters, which is the smallest and fastest among existing deep text spotters.
Our text spotter can run on a smartphone with affordable latency, which is valuable for building stand-alone OCR applications.
- Score: 8.480710920894547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the deployment of scene-text spotting systems on mobile platforms,
lightweight models with low computation are preferable. In concept, end-to-end
(E2E) text spotting is suitable for such purposes because it performs text
detection and recognition in a single model. However, current state-of-the-art
E2E methods rely on heavy feature extractors, recurrent sequence modellings,
and complex shape aligners to pursue accuracy, which means their computations
are still heavy. We explore the opposite direction: How far can we go without
bells and whistles in E2E text spotting? To this end, we propose a
text-spotting method that consists of simple convolutions and a few
post-processes, named Context-Free TextSpotter. Experiments using standard
benchmarks show that Context-Free TextSpotter achieves real-time text spotting
on a GPU with only three million parameters, which is the smallest and fastest
among existing deep text spotters, with an acceptable transcription quality
degradation compared to heavier ones. Further, we demonstrate that our text
spotter can run on a smartphone with affordable latency, which is valuable for
building stand-alone OCR applications.
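
To give a sense of scale for the "three million parameters" figure, the following minimal sketch counts the weights in a small stack of plain convolutions. The layer widths and the `conv2d_params` helper are hypothetical and chosen only for illustration; they are not the architecture described in the paper.

```python
def conv2d_params(in_ch, out_ch, kernel, bias=True):
    """Parameter count of a standard 2D convolution:
    out_ch * in_ch * kernel * kernel weights, plus out_ch biases if enabled."""
    return out_ch * in_ch * kernel * kernel + (out_ch if bias else 0)


# Hypothetical (in_channels, out_channels, kernel_size) triples.
# These widths are assumptions for illustration, not taken from the paper.
layers = [
    (3, 32, 3),
    (32, 64, 3),
    (64, 128, 3),
    (128, 256, 3),
    (256, 256, 3),
]

BUDGET = 3_000_000  # the parameter count quoted in the abstract

total = sum(conv2d_params(i, o, k) for i, o, k in layers)
print(f"total parameters: {total}")
print(f"within budget: {total <= BUDGET}")
```

Because the count grows with the product of input and output channels, channel width is what mainly decides whether a convolution-only model stays within a budget of this size.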