Combining OCR Models for Reading Early Modern Printed Books
- URL: http://arxiv.org/abs/2305.07131v1
- Date: Thu, 11 May 2023 20:43:50 GMT
- Title: Combining OCR Models for Reading Early Modern Printed Books
- Authors: Mathias Seuret, Janne van der Loop, Nikolaus Weichselbaumer, Martin
Mayr, Janina Molnar, Tatjana Hass, Florian Kordon, Anguelos Nicolau, Vincent
Christlein
- Abstract summary: We study the usage of fine-grained font recognition on OCR for books printed from the 15th to the 18th century.
We show that OCR performance is strongly impacted by font style and that selecting fine-tuned models with font group recognition has a very positive impact on the results.
- Score: 2.839401411131008
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the usage of fine-grained font recognition on
OCR for books printed from the 15th to the 18th century. We used a newly
created dataset for OCR of early printed books for which fonts are labeled with
bounding boxes. We know not only the font group used for each character, but
the locations of font changes as well. In books of this period, we frequently
find font group changes mid-line or even mid-word that indicate changes in
language. We consider 8 different font groups present in our corpus and
investigate 13 different subsets: the whole dataset and text lines with a
single font, multiple fonts, Roman fonts, Gothic fonts, and each of the
considered fonts, respectively. We show that OCR performance is strongly
impacted by font style and that selecting fine-tuned models with font group
recognition has a very positive impact on the results. Moreover, we developed a
system using local font group recognition in order to combine the output of
multiple font recognition models, and show that while slower, this approach
performs better not only on text lines composed of multiple fonts but on the
ones containing a single font only as well.
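The abstract gives no implementation details for the combination step, so the following is only a minimal sketch of one way local font group recognition could drive the merging of several models' outputs. It assumes that a font classifier has already split the text line into horizontal spans, each labeled with a font group, and that each fine-tuned OCR model returns characters with a horizontal position. The names `Char`, `combine_by_local_font`, and the labels `"gothic"` and `"roman"` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Char:
    """One recognized character and its horizontal position (hypothetical format)."""
    text: str
    x_center: float  # pixels from the left edge of the line image


def combine_by_local_font(font_spans, model_outputs):
    """Merge OCR outputs by keeping, for each span, the matching model's characters.

    font_spans: list of (x_start, x_end, font_group), non-overlapping.
    model_outputs: dict mapping font_group -> list[Char] produced by the
        model fine-tuned on that font group (assumed input format).
    """
    merged = []
    for x_start, x_end, group in font_spans:
        # Only trust the model whose font group matches the local prediction.
        chars = model_outputs.get(group, [])
        merged.extend(c for c in chars if x_start <= c.x_center < x_end)
    # Restore reading order in case the spans were not given left to right.
    merged.sort(key=lambda c: c.x_center)
    return "".join(c.text for c in merged)


# Toy line: a Gothic word followed by a Roman word (invented data).
spans = [(0, 100, "gothic"), (100, 200, "roman")]
outputs = {
    "gothic": [Char("D", 10), Char("e", 30), Char("r", 50), Char(" ", 70),
               Char("t", 110), Char("c", 130), Char("r", 150)],
    "roman": [Char("O", 10), Char("c", 30), Char("t", 50), Char(" ", 70),
              Char("r", 110), Char("e", 130), Char("x", 150)],
}
print(combine_by_local_font(spans, outputs))  # prints: Der rex
```

Running every model on every line and then selecting per span is what makes this kind of approach slower than picking a single model per line, which matches the trade-off the abstract mentions.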
Related papers
- VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and
Quantization [52.870638830417]
We propose a VQGAN-based framework (i.e., VQ-Font) to enhance glyph fidelity through token prior refinement and structure-aware enhancement.
Specifically, we pre-train a VQGAN to encapsulate font token prior within a codebook. Subsequently, VQ-Font refines the synthesized glyphs with the codebook to eliminate the domain gap between synthesized and real-world strokes.
arXiv Detail & Related papers (2023-08-27T06:32:20Z)
- CF-Font: Content Fusion for Few-shot Font Generation [63.79915037830131]
We propose a content fusion module (CFM) to project the content feature into a linear space defined by the content features of basis fonts.
Our method also allows optimizing the style representation vector of reference images.
We have evaluated our method on a dataset of 300 fonts with 6.5k characters each.
arXiv Detail & Related papers (2023-03-24T14:18:40Z)
- Diff-Font: Diffusion Model for Robust One-Shot Font Generation [110.45944936952309]
We propose a novel one-shot font generation method based on a diffusion model, named Diff-Font.
The proposed model aims to generate the entire font library by giving only one sample as the reference.
The well-trained Diff-Font is not only robust to font gap and font variation, but also achieves promising performance on difficult character generation.
arXiv Detail & Related papers (2022-12-12T13:51:50Z)
- Font Representation Learning via Paired-glyph Matching [15.358456947574913]
We propose a novel font representation learning scheme to embed font styles into the latent space.
For the discriminative representation of a font from others, we propose a paired-glyph matching-based font representation learning model.
We show our font representation learning scheme achieves better generalization performance than the existing font representation learning techniques.
arXiv Detail & Related papers (2022-11-20T12:27:27Z)
- Few-Shot Font Generation by Learning Fine-Grained Local Styles [90.39288370855115]
Few-shot font generation (FFG) aims to generate a new font with a few examples.
We propose a new font generation approach by learning 1) the fine-grained local styles from references, and 2) the spatial correspondence between the content and reference glyphs.
arXiv Detail & Related papers (2022-05-20T05:07:05Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
- A Multi-Implicit Neural Representation for Fonts [79.6123184198301]
Font-specific discontinuities like edges and corners are difficult to represent using neural networks.
We introduce multi-implicits to represent fonts as a permutation-invariant set of learned implicit functions, without losing features.
arXiv Detail & Related papers (2021-06-12T21:40:11Z)
- AdaptiFont: Increasing Individuals' Reading Speed with a Generative Font
Model and Bayesian Optimization [3.480626767752489]
AdaptiFont is a human-in-the-loop system aimed at interactively increasing readability of text displayed on a monitor.
We generate new TrueType fonts through active learning, render texts with the new font, and measure individual users' reading speed.
The results of a user study show that this adaptive font generation system finds regions in the font space corresponding to high reading speeds, that these fonts significantly increase participants' reading speed, and that the found fonts are significantly different across individual readers.
arXiv Detail & Related papers (2021-04-21T19:56:28Z)
- FONTNET: On-Device Font Understanding and Prediction Pipeline [1.5749416770494706]
We propose two engines: Font Detection Engine and Font Prediction Engine.
First, we develop a novel CNN architecture for identifying the font style of text in images.
Second, we design a novel algorithm for predicting similar fonts for a given query font.
Third, we optimize and deploy the entire engine on-device, which ensures privacy and improves latency in real-time applications such as instant messaging.
arXiv Detail & Related papers (2021-03-30T08:11:24Z)
- Few-shot Compositional Font Generation with Dual Memory [16.967987801167514]
We propose a novel font generation framework, named Dual Memory-augmented Font Generation Network (DM-Font).
We employ memory components and global-context awareness in the generator to take advantage of the compositionality.
In experiments on Korean-handwriting fonts and Thai-printing fonts, we observe that our method generates samples of significantly better quality with faithful stylization.
arXiv Detail & Related papers (2020-05-21T08:13:40Z)
- Character-independent font identification [11.86456063377268]
We propose a method of determining if any two characters are from the same font or not.
We use a Convolutional Neural Network (CNN) trained with various font image pairs.
We then evaluate the model on a different set of fonts that are unseen by the network.
arXiv Detail & Related papers (2020-01-24T05:59:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.