Enhancing OCR Performance through Post-OCR Models: Adopting Glyph
Embedding for Improved Correction
- URL: http://arxiv.org/abs/2308.15262v1
- Date: Tue, 29 Aug 2023 12:41:50 GMT
- Title: Enhancing OCR Performance through Post-OCR Models: Adopting Glyph
Embedding for Improved Correction
- Authors: Yung-Hsin Chen and Yuli Zhou
- Abstract summary: The novelty of our approach lies in embedding the OCR output using CharBERT and our unique embedding technique, capturing the visual characteristics of characters.
Our findings show that post-OCR correction effectively addresses deficiencies in inferior OCR models, and glyph embedding enables the model to achieve superior results.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study investigates the potential of post-OCR models to overcome
limitations in OCR models and explores the impact of incorporating glyph
embedding on post-OCR correction performance. In this study, we have developed
our own post-OCR correction model. The novelty of our approach lies in
embedding the OCR output using CharBERT and our unique embedding technique,
capturing the visual characteristics of characters. Our findings show that
post-OCR correction effectively addresses deficiencies in inferior OCR models,
and glyph embedding enables the model to achieve superior results, including
the ability to correct individual words.
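The fusion step described above can be sketched as follows. This is a minimal illustration, assuming each character's glyph is a small bitmap that gets flattened, projected, and concatenated with a contextual character embedding; the bitmaps, dimensions, and projection here are invented toy values, not the paper's actual CharBERT pipeline.

```python
import numpy as np

# Toy 3x3 bitmaps standing in for rendered character glyphs; the paper
# derives glyph features from real character renderings, so these shapes
# and values are purely illustrative.
GLYPHS = {
    "O": np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]], dtype=float),
    "0": np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]], dtype=float),
    "l": np.array([[0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]], dtype=float),
}

rng = np.random.default_rng(0)
W_GLYPH = rng.normal(size=(9, 4))  # projects a flattened glyph to 4 dims

def glyph_embedding(ch):
    """Flatten the character's bitmap and project it into embedding space."""
    return GLYPHS[ch].reshape(-1) @ W_GLYPH

def fused_embedding(ch, contextual_emb):
    """Concatenate a contextual character embedding (CharBERT in the paper)
    with the glyph embedding so the model sees visual similarity."""
    return np.concatenate([contextual_emb, glyph_embedding(ch)])

contextual = rng.normal(size=8)  # stand-in for a CharBERT output vector
fused = fused_embedding("O", contextual)
```

Note that visually confusable characters such as "O" and "0" receive identical glyph features here, which is exactly the signal a corrector can exploit.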
Related papers
- Enabling Scalable Oversight via Self-Evolving Critic [59.861013614500024]
SCRIT (Self-evolving CRITic) is a framework that enables genuine self-evolution of critique abilities.
It self-improves by training on synthetic data, generated by a contrastive-based self-critic.
It achieves up to a 10.3% improvement on critique-correction and error identification benchmarks.
arXiv Detail & Related papers (2025-01-10T05:51:52Z)
- RoundTripOCR: A Data Generation Technique for Enhancing Post-OCR Error Correction in Low-Resource Devanagari Languages [41.09752906121257]
We propose an approach for synthetic data generation for Devanagari languages, RoundTripOCR.
We release post-OCR text correction datasets for Hindi, Marathi, Bodo, Nepali, Konkani and Sanskrit.
We also present a novel approach for OCR error correction by leveraging techniques from machine translation.
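The round-trip idea can be sketched as generating (noisy, clean) training pairs from clean text. The confusion table below is hypothetical; RoundTripOCR itself obtains genuine errors by rendering clean text to images and running a real OCR engine over them.

```python
import random

# Hypothetical Devanagari character confusions; invented for illustration,
# not taken from the paper.
CONFUSIONS = {"\u092c": "\u0935", "\u0918": "\u0927", "\u092e": "\u092d"}

def simulate_ocr_noise(clean, p=0.5, seed=0):
    """Corrupt a clean string with confusion-table substitutions."""
    rng = random.Random(seed)
    return "".join(
        CONFUSIONS[ch] if ch in CONFUSIONS and rng.random() < p else ch
        for ch in clean
    )

def make_pairs(sentences):
    """Each (noisy, clean) pair trains a correction model noisy -> clean."""
    return [(simulate_ocr_noise(s), s) for s in sentences]

pairs = make_pairs(["\u092c\u0918\u0930", "\u092e\u0928"])
```

Because the clean side is known by construction, no manual annotation is needed, which is the point for low-resource scripts.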
arXiv Detail & Related papers (2024-12-14T19:59:41Z)
- Confidence-Aware Document OCR Error Detection [1.003485566379789]
We analyze the correlation between confidence scores and error rates across different OCR systems.
We develop ConfBERT, a BERT-based model that incorporates OCR confidence scores into token embeddings.
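Incorporating a scalar confidence into a token embedding can be sketched as below, assuming the score is projected and added to the embedding; ConfBERT's exact fusion may differ, and the table and projection here are toy stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 16
TOKEN_TABLE = rng.normal(size=(100, DIM))  # stand-in token embedding table
W_CONF = rng.normal(size=DIM)              # learned confidence projection

def embed_with_confidence(token_ids, confidences):
    """Add a projection of each OCR confidence score onto its token
    embedding, so downstream layers can weight unreliable tokens."""
    embs = TOKEN_TABLE[np.asarray(token_ids)]
    conf = np.asarray(confidences)[:, None]  # shape (n, 1) for broadcasting
    return embs + conf * W_CONF

out = embed_with_confidence([3, 7], [0.9, 0.2])
```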
arXiv Detail & Related papers (2024-09-06T08:35:28Z)
- CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models [0.0]
This paper introduces Context Leveraging OCR Correction (CLOCR-C).
It uses the infilling and context-adaptive abilities of transformer-based language models (LMs) to improve OCR quality.
The study aims to determine whether LMs can perform post-OCR correction, whether correction improves downstream NLP tasks, and whether providing socio-cultural context as part of the correction process adds value.
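Supplying socio-cultural context to an LM corrector can be sketched as prompt construction. The template wording below is illustrative only; it is not CLOCR-C's actual prompt.

```python
def build_correction_prompt(ocr_text, context=""):
    """Assemble a prompt asking an LM to correct OCR output, optionally
    prepending socio-cultural context about the source document."""
    parts = ["Correct the OCR errors in the following passage."]
    if context:
        parts.append("Context: " + context)
    parts.append("Passage: " + ocr_text)
    parts.append("Corrected passage:")
    return "\n".join(parts)

prompt = build_correction_prompt(
    "Tbe qnick brown fox", context="a 19th-century British newspaper")
```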
arXiv Detail & Related papers (2024-08-30T17:26:05Z)
- Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter [57.64003871384959]
This work presents a new approach to fast context-biasing with CTC-based Word Spotter.
The proposed method matches CTC log-probabilities against a compact context graph to detect potential context-biasing candidates.
The results demonstrate a significant acceleration of the context-biasing recognition with a simultaneous improvement in F-score and WER.
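The matching step can be sketched as scoring a candidate word against per-frame CTC log-probabilities via dynamic programming. This is a simplification: CTC blanks, repeated symbols, and the paper's compact context-graph representation are omitted, and the log-probabilities below are illustrative.

```python
import numpy as np

def word_score(log_probs, word_ids):
    """Best monotonic alignment score of a word's symbols against
    per-frame log-probabilities (blanks and repeats ignored)."""
    T = log_probs.shape[0]
    L = len(word_ids)
    dp = np.full((T, L), -np.inf)
    dp[0, 0] = log_probs[0, word_ids[0]]
    for t in range(1, T):
        for j in range(L):
            stay = dp[t - 1, j]                          # emit same symbol
            advance = dp[t - 1, j - 1] if j > 0 else -np.inf  # next symbol
            dp[t, j] = log_probs[t, word_ids[j]] + max(stay, advance)
    return dp[-1, -1]

# Three frames over a 3-symbol vocabulary (illustrative log-prob values).
lp = np.array([[0.0, -5.0, -5.0],
               [-5.0, 0.0, -5.0],
               [-5.0, 0.0, -5.0]])
in_order = word_score(lp, [0, 1])   # symbols appear in frame order
reversed_ = word_score(lp, [1, 0])  # mismatched order scores worse
```

Words whose symbols line up with the acoustic frames score highest, which is how candidates are spotted before biasing the main decode.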
arXiv Detail & Related papers (2024-06-11T09:37:52Z)
- Cross-modal Active Complementary Learning with Self-refining Correspondence [54.61307946222386]
We propose a Cross-modal Robust Complementary Learning framework (CRCL) to improve the robustness of existing methods.
ACL exploits active and complementary learning losses to reduce the risk of providing erroneous supervision.
SCC utilizes multiple self-refining processes with momentum correction to enlarge the receptive field for correcting correspondences.
arXiv Detail & Related papers (2023-10-26T15:15:11Z)
- iOCR: Informed Optical Character Recognition for Election Ballot Tallies [13.343515845758398]
iOCR was developed with a spell correction algorithm to fix errors introduced by conventional OCR for vote tabulation.
The results found that the iOCR system outperforms conventional OCR techniques.
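A dictionary-constrained spell corrector of this kind can be sketched as snapping each OCR token to the nearest entry in a known candidate list (for ballots, the names being voted on). The edit-distance approach below is a generic stand-in; iOCR's actual algorithm may differ.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def correct(word, candidates):
    """Snap an OCR token to the closest entry in a closed candidate list."""
    return min(candidates, key=lambda c: edit_distance(word, c))

fixed = correct("Srnith", ["Smith", "Jones", "Garcia"])
```

Restricting output to a closed vocabulary is what lets this beat unconstrained OCR on tabulation tasks.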
arXiv Detail & Related papers (2022-08-01T13:50:13Z)
- An Evaluation of OCR on Egocentric Data [30.637021477342035]
In this paper, we evaluate state-of-the-art OCR methods on Egocentric data.
We demonstrate that existing OCR methods struggle with rotated text, which is frequently observed on objects being handled.
We introduce a simple rotate-and-merge procedure which can be applied to pre-trained OCR models that halves the normalized edit distance error.
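The procedure can be sketched as running OCR on all four 90-degree rotations and keeping the most confident reading. The `stub_ocr` function below is a hypothetical stand-in for a real engine, and the paper's method merges per-rotation detections rather than simply picking one, so treat this as a simplified illustration.

```python
import numpy as np

def stub_ocr(image):
    """Hypothetical OCR stand-in returning (text, confidence). It 'reads'
    best when pixel mass sits in the top half, mimicking upright text."""
    half = image.shape[0] // 2
    top, total = image[:half].sum(), image.sum()
    return ("HELLO", top / total)

def rotate_and_merge(image, ocr=stub_ocr):
    """Run OCR on all four 90-degree rotations and keep the most
    confident result."""
    return max((ocr(np.rot90(image, k)) for k in range(4)),
               key=lambda result: result[1])

img = np.zeros((4, 4))
img[3, :] = 1.0            # "text" lying along the bottom edge
text, conf = rotate_and_merge(img)
```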
arXiv Detail & Related papers (2022-06-11T10:37:20Z)
- Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition [71.96870151495536]
We propose a novel adversarial reprogramming (AR) approach for low-resource spoken command recognition (SCR).
The AR procedure aims to modify the acoustic signals (from the target domain) to repurpose a pretrained SCR model.
We evaluate the proposed AR-SCR system on three low-resource SCR datasets, including Arabic, Lithuanian, and dysarthric Mandarin speech.
arXiv Detail & Related papers (2021-10-08T05:07:35Z)
- A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
There is a non-negligible amount of "noise" where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
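Prediction-consistency filtering can be sketched as keeping only the pairs whose annotated target the model itself reproduces. This is a simplified take: the paper's criterion is more nuanced than exact string match, and the toy "model" below is hypothetical.

```python
def denoise(pairs, model):
    """Keep only (source, target) pairs whose target the model's own
    prediction agrees with -- pairs the current model contradicts are
    treated as annotation noise and dropped."""
    return [(src, tgt) for src, tgt in pairs if model(src) == tgt]

# Hypothetical toy "model" that just lowercases; stands in for a trained
# GEC system whose predictions vote on which annotations to trust.
toy_model = str.lower
noisy_data = [("He Go", "he go"), ("He Go", "He goes")]
clean_data = denoise(noisy_data, toy_model)
```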
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
- Characteristic Regularisation for Super-Resolving Face Images [81.84939112201377]
Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery.
Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data.
This renders the model overstretched with two tasks: making the visual characteristics consistent and enhancing the image resolution.
We formulate a method that joins the advantages of conventional SR and UDA models.
arXiv Detail & Related papers (2019-12-30T16:27:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.