Combining Morphological and Histogram based Text Line Segmentation in
the OCR Context
- URL: http://arxiv.org/abs/2103.08922v1
- Date: Tue, 16 Mar 2021 09:06:25 GMT
- Title: Combining Morphological and Histogram based Text Line Segmentation in
the OCR Context
- Authors: Pit Schneider
- Abstract summary: The algorithmic approach proposed by this paper is designed for text line segmentation, one of the pre-stages of optical character recognition systems.
The method was developed to be applied to a historic data collection that commonly features quality issues.
Because its promising segmentation results come at low computational cost, the algorithm was incorporated into the OCR pipeline of the National Library of Luxembourg.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text line segmentation is one of the pre-stages of modern optical character
recognition systems. The algorithmic approach proposed by this paper has been
designed for this exact purpose. Its main characteristic is the combination of
two different techniques, morphological image operations and horizontal
histogram projections. The method was developed to be applied on a historic
data collection that commonly features quality issues, such as degraded paper,
blurred text, or curved text lines. For that reason, the segmenter in question
could be of particular interest for cultural institutions, such as libraries,
archives, museums, ..., that want access to robust line bounding boxes for a
given historic document. Because of the promising segmentation results that are
joined by low computational cost, the algorithm was incorporated into the OCR
pipeline of the National Library of Luxembourg, in the context of the
initiative of reprocessing their historic newspaper collection. The general
contribution of this paper is to outline the approach and to evaluate the gains
in terms of accuracy and speed, comparing it to the segmentation algorithm
bundled with the used open source OCR software.
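The abstract names two building blocks, morphological image operations and horizontal histogram projections, without giving details here. The following is only a minimal illustrative sketch of how such a combination can work in principle, not the paper's algorithm: the function names, the horizontal smearing kernel, and the `half_width` and `min_ink` parameters are all assumptions made for this example. A binary image (1 = ink) is dilated horizontally, the ink is counted per row, and runs of rows containing ink become line bounding boxes.

```python
def dilate_horizontal(img, half_width):
    """Binary dilation with a 1 x (2*half_width + 1) horizontal kernel (assumed shape)."""
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if img[y][x]:
                lo = max(0, x - half_width)
                hi = min(width, x + half_width + 1)
                for k in range(lo, hi):
                    out[y][k] = 1
    return out


def horizontal_projection(img):
    """Number of ink pixels in each row (the horizontal histogram)."""
    return [sum(row) for row in img]


def find_line_bands(profile, min_ink):
    """Return (top, bottom) row ranges whose projection stays >= min_ink."""
    bands = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y
        elif count < min_ink and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))
    return bands


def segment_lines(img, half_width=2, min_ink=1):
    """Return line boxes as (left, top, right, bottom), inclusive pixel coordinates."""
    smeared = dilate_horizontal(img, half_width)
    profile = horizontal_projection(smeared)
    boxes = []
    for top, bottom in find_line_bands(profile, min_ink):
        # Horizontal extent is measured on the original, undilated image.
        cols = [x for y in range(top, bottom + 1)
                for x, v in enumerate(img[y]) if v]
        if cols:
            boxes.append((min(cols), top, max(cols), bottom))
    return boxes


# Toy page: two text lines separated by a blank row.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(segment_lines(page))
```

A plain projection like this assumes roughly horizontal lines; the curved or degraded text mentioned in the abstract would need more than this sketch provides.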
Related papers
- SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection [10.08588082910962]
Text line detection is a key task in historical document analysis.
We propose a general framework for historical document text detection (SegHist).
Integrating the SegHist framework with the commonly used method DB++, we develop DB-SegHist.
arXiv Detail & Related papers (2024-06-17T11:00:04Z)
- The CLRS-Text Algorithmic Reasoning Language Benchmark [48.45201665463275]
CLRS-Text is a textual version of the CLRS benchmark.
CLRS-Text is capable of procedurally generating trace data for thirty diverse, challenging algorithmic tasks.
We fine-tune and evaluate various LMs as generalist executors on this benchmark.
arXiv Detail & Related papers (2024-06-06T16:29:25Z)
- From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions [63.11097464396147]
We introduce a novel benchmark YTSeg focusing on spoken content that is inherently more unstructured and both topically and structurally diverse.
We also introduce an efficient hierarchical segmentation model MiniSeg, that outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-27T15:59:37Z)
- Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images [0.0]
We consider a challenging text segmentation task: dividing newspaper marriage announcement lists into units of one announcement each.
In many cases the information is not structured into sentences, and adjacent segments are not topically distinct from each other.
We present a novel deep learning-based model for segmenting such text and show that it significantly outperforms an existing state-of-the-art method on our task.
arXiv Detail & Related papers (2023-12-20T05:17:06Z)
- Optimization of Image Processing Algorithms for Character Recognition in Cultural Typewritten Documents [0.8158530638728501]
This paper evaluates the impact of image processing methods and parameter tuning in Optical Character Recognition (OCR).
The approach uses a multi-objective problem formulation to minimize Levenshtein edit distance and maximize the number of words correctly identified with a non-dominated sorting genetic algorithm (NSGA-II).
Our findings suggest that employing image pre-processing algorithms in OCR might be more suitable for typologies where the text recognition task without pre-processing does not produce good results.
arXiv Detail & Related papers (2023-11-27T11:44:46Z)
- CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation [56.58365347854647]
We introduce a novel cost-based approach to adapt vision-language foundation models, notably CLIP.
Our method potently adapts CLIP for segmenting seen and unseen classes by fine-tuning its encoders.
arXiv Detail & Related papers (2023-03-21T12:28:21Z)
- One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition [10.473427493876422]
Low resource Handwritten Text Recognition is a hard problem due to the scarce annotated data and the very limited linguistic information.
In this paper we address this problem through a data generation technique based on Bayesian Program Learning.
Contrary to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet.
arXiv Detail & Related papers (2021-05-11T18:53:01Z)
- Rethinking Text Line Recognition Models [57.47147190119394]
We consider two decoder families (Connectionist Temporal Classification and Transformer) and three encoder modules (Bidirectional LSTMs, Self-Attention, and GRCLs).
We compare their accuracy and performance on widely used public datasets of scene and handwritten text.
Unlike the more common Transformer-based models, this architecture can handle inputs of arbitrary length.
arXiv Detail & Related papers (2021-04-15T21:43:13Z)
- SCATTER: Selective Context Attentional Scene Text Recognizer [16.311256552979835]
Scene Text Recognition (STR) is the task of recognizing text against complex image backgrounds.
Current state-of-the-art (SOTA) methods still struggle to recognize text written in arbitrary shapes.
We introduce a novel architecture for STR, named Selective Context ATtentional Text Recognizer (SCATTER).
arXiv Detail & Related papers (2020-03-25T09:20:28Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
- TextScanner: Reading Characters in Order for Robust Scene Text Recognition [60.04267660533966]
TextScanner is an alternative approach for scene text recognition.
It generates pixel-wise, multi-channel segmentation maps for character class, position and order.
It also adopts RNN for context modeling and performs paralleled prediction for character position and class.
arXiv Detail & Related papers (2019-12-28T07:52:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.