Enhancing OCR Performance through Post-OCR Models: Adopting Glyph
Embedding for Improved Correction
- URL: http://arxiv.org/abs/2308.15262v1
- Date: Tue, 29 Aug 2023 12:41:50 GMT
- Title: Enhancing OCR Performance through Post-OCR Models: Adopting Glyph
Embedding for Improved Correction
- Authors: Yung-Hsin Chen and Yuli Zhou
- Abstract summary: The novelty of our approach lies in embedding the OCR output using CharBERT and our unique embedding technique, capturing the visual characteristics of characters.
Our findings show that post-OCR correction effectively addresses deficiencies in inferior OCR models, and glyph embedding enables the model to achieve superior results.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study investigates the potential of post-OCR models to overcome
limitations in OCR models and explores the impact of incorporating glyph
embedding on post-OCR correction performance. In this study, we have developed
our own post-OCR correction model. The novelty of our approach lies in
embedding the OCR output using CharBERT and our unique embedding technique,
capturing the visual characteristics of characters. Our findings show that
post-OCR correction effectively addresses deficiencies in inferior OCR models,
and glyph embedding enables the model to achieve superior results, including
the ability to correct individual words.
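To make the glyph-embedding idea concrete, here is a minimal, hypothetical sketch (not the authors' released code): each character of the OCR output is rendered to a small bitmap, and a projection of that bitmap is concatenated with a learned character embedding, which here stands in for the CharBERT representation described in the abstract. All names and dimensions (GlyphAwareEmbedding, render_glyph, GLYPH_SIZE, EMBED_DIM) are illustrative assumptions; the fused features would be fed to a downstream correction model.

  # Minimal sketch of glyph-aware character embedding (assumptions noted above).
  import string

  import torch
  import torch.nn as nn
  from PIL import Image, ImageDraw, ImageFont

  GLYPH_SIZE = 16   # assumed: render each character as a 16x16 bitmap
  EMBED_DIM = 128   # assumed embedding width for both branches

  def render_glyph(char: str, size: int = GLYPH_SIZE) -> torch.Tensor:
      """Render a single character to a flattened grayscale bitmap in [0, 1]."""
      img = Image.new("L", (size, size), color=0)
      draw = ImageDraw.Draw(img)
      draw.text((2, 2), char, fill=255, font=ImageFont.load_default())
      return torch.tensor(list(img.getdata()), dtype=torch.float32) / 255.0

  class GlyphAwareEmbedding(nn.Module):
      """Concatenate a learned character embedding with a projected glyph bitmap."""
      def __init__(self, vocab: str = string.printable):
          super().__init__()
          self.char_to_id = {c: i for i, c in enumerate(vocab)}
          # Plain character embedding used here as a stand-in for CharBERT output.
          self.char_embed = nn.Embedding(len(vocab), EMBED_DIM)
          self.glyph_proj = nn.Linear(GLYPH_SIZE * GLYPH_SIZE, EMBED_DIM)

      def forward(self, text: str) -> torch.Tensor:
          ids = torch.tensor([self.char_to_id.get(c, 0) for c in text])
          glyphs = torch.stack([render_glyph(c) for c in text])
          # Glyph branch works on pixel renderings, so visually similar
          # characters (e.g. '0' and 'O') receive similar glyph features.
          return torch.cat([self.char_embed(ids), self.glyph_proj(glyphs)], dim=-1)

  # Example: embed a noisy OCR output before passing it to a correction model.
  embedder = GlyphAwareEmbedding()
  features = embedder("0CR 0utput")
  print(features.shape)  # torch.Size([10, 256])

The concatenation is one simple way to fuse the two signals; other fusion strategies (summation, gating) would fit the same interface.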
Related papers
- Confidence-Aware Document OCR Error Detection [1.003485566379789]
We analyze the correlation between confidence scores and error rates across different OCR systems.
We develop ConfBERT, a BERT-based model that incorporates OCR confidence scores into token embeddings.
arXiv Detail & Related papers (2024-09-06T08:35:28Z)
- CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models [0.0]
This paper introduces Context Leveraging OCR Correction (CLOCR-C).
It uses the infilling and context-adaptive abilities of transformer-based language models (LMs) to improve OCR quality.
The study aims to determine whether LMs can perform post-OCR correction and improve downstream NLP tasks, and to assess the value of providing socio-cultural context as part of the correction process.
arXiv Detail & Related papers (2024-08-30T17:26:05Z)
- Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter [57.64003871384959]
This work presents a new approach to fast context-biasing with CTC-based Word Spotter.
The proposed method matches CTC log-probabilities against a compact context graph to detect potential context-biasing candidates.
The results demonstrate a significant acceleration of the context-biasing recognition with a simultaneous improvement in F-score and WER.
arXiv Detail & Related papers (2024-06-11T09:37:52Z)
- Data Generation for Post-OCR correction of Cyrillic handwriting [41.94295877935867]
This paper focuses on the development and application of a synthetic handwriting generation engine based on Bézier curves.
The engine generates highly realistic handwritten text in arbitrary quantities, which we use to create a substantial dataset.
We apply a Handwritten Text Recognition (HTR) model to this dataset to identify OCR errors, forming the basis for training our post-OCR correction (POC) model.
arXiv Detail & Related papers (2023-11-27T15:01:26Z)
- Cross-modal Active Complementary Learning with Self-refining Correspondence [54.61307946222386]
We propose a Cross-modal Robust Complementary Learning framework (CRCL) to improve the robustness of existing methods.
Its Active Complementary Loss (ACL) exploits active and complementary learning losses to reduce the risk of providing erroneous supervision.
Its Self-refining Correspondence Correction (SCC) utilizes multiple self-refining processes with momentum correction to enlarge the receptive field for correcting correspondences.
arXiv Detail & Related papers (2023-10-26T15:15:11Z)
- User-Centric Evaluation of OCR Systems for Kwak'wala [92.73847703011353]
We show that utilizing OCR reduces the time spent on manual transcription of culturally valuable documents by over 50%.
Our results demonstrate the potential benefits that OCR tools can have on downstream language documentation and revitalization efforts.
arXiv Detail & Related papers (2023-02-26T21:41:15Z)
- iOCR: Informed Optical Character Recognition for Election Ballot Tallies [13.343515845758398]
iOCR was developed with a spell correction algorithm to fix errors introduced by conventional OCR for vote tabulation.
The results found that the iOCR system outperforms conventional OCR techniques.
arXiv Detail & Related papers (2022-08-01T13:50:13Z)
- An Evaluation of OCR on Egocentric Data [30.637021477342035]
In this paper, we evaluate state-of-the-art OCR methods on Egocentric data.
We demonstrate that existing OCR methods struggle with rotated text, which is frequently observed on objects being handled.
We introduce a simple rotate-and-merge procedure, applicable to pre-trained OCR models, that halves the normalized edit distance error.
arXiv Detail & Related papers (2022-06-11T10:37:20Z)
- Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition [71.96870151495536]
We propose a novel adversarial reprogramming (AR) approach for low-resource spoken command recognition (SCR).
The AR procedure aims to modify the acoustic signals (from the target domain) to repurpose a pretrained SCR model.
We evaluate the proposed AR-SCR system on three low-resource SCR datasets, including Arabic, Lithuanian, and dysarthric Mandarin speech.
arXiv Detail & Related papers (2021-10-08T05:07:35Z)
- A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
These datasets contain a non-negligible amount of "noise", where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
- Characteristic Regularisation for Super-Resolving Face Images [81.84939112201377]
Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery, which differs from genuine LR images.
Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data.
This overstretches the model with two tasks: making the visual characteristics consistent and enhancing the image resolution.
We formulate a method that joins the advantages of conventional SR and UDA models.
arXiv Detail & Related papers (2019-12-30T16:27:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.