Related papers: Confidence-Aware Document OCR Error Detection

Confidence-Aware Document OCR Error Detection

URL: http://arxiv.org/abs/2409.04117v1
Date: Fri, 6 Sep 2024 08:35:28 GMT
Title: Confidence-Aware Document OCR Error Detection
Authors: Arthur Hemmer, Mickaël Coustaty, Nicola Bartolo, Jean-Marc Ogier,
Abstract summary: We analyze the correlation between confidence scores and error rates across different OCR systems. We develop ConfBERT, a BERT-based model that incorporates OCR confidence scores into token embeddings.
Score: 1.003485566379789
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Optical Character Recognition (OCR) continues to face accuracy challenges that impact subsequent applications. To address these errors, we explore the utility of OCR confidence scores for enhancing post-OCR error detection. Our study involves analyzing the correlation between confidence scores and error rates across different OCR systems. We develop ConfBERT, a BERT-based model that incorporates OCR confidence scores into token embeddings and offers an optional pre-training phase for noise adjustment. Our experimental results demonstrate that integrating OCR confidence scores can enhance error detection capabilities. This work underscores the importance of OCR confidence scores in improving detection accuracy and reveals substantial disparities in performance between commercial and open-source OCR technologies.

Related papers

AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning [61.28113271728859]
RAG has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs)<n>Standard RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions.<n>In this work, we reinterpret RAG as Retrieval-Augmented Reasoning and identify a central but underexplored problem: textitReasoning Misalignment.
arXiv Detail & Related papers (2025-04-21T04:56:47Z)
Evaluating ASR Confidence Scores for Automated Error Detection in User-Assisted Correction Interfaces [5.266869303483375]
This study evaluates the reliability of confidence scores for error detection through a comprehensive analysis of end-to-end ASR models. Results show that while confidence scores correlate with transcription accuracy, their error detection performance is limited. These findings highlight the limitations of confidence scores and the need for more sophisticated approaches to improve user interaction and explainability of ASR results.
arXiv Detail & Related papers (2025-03-19T11:33:40Z)
To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts [10.748768620243982]
Large Language Models (LLMs) are often augmented with external contexts, such as those used in retrieval-augmented generation (RAG) We show that when provided with both correct and incorrect contexts, both open-source and proprietary models tend to overly rely on external information. We propose two approaches: Self-Guided Confidence Reasoning (SCR) and Rule-Based Confidence Reasoning (RCR)
arXiv Detail & Related papers (2024-10-18T17:59:47Z)
A Coin Has Two Sides: A Novel Detector-Corrector Framework for Chinese Spelling Correction [79.52464132360618]
Chinese Spelling Correction (CSC) stands as a foundational Natural Language Processing (NLP) task. We introduce a novel approach based on error detector-corrector framework. Our detector is designed to yield two error detection results, each characterized by high precision and recall.
arXiv Detail & Related papers (2024-09-06T09:26:45Z)
CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models [0.0]
This paper introduces Context Leveraging OCR Correction (CLOCR-C) It uses the infilling and context-adaptive abilities of transformer-based language models (LMs) to improve OCR quality. The study aims to determine if LMs can perform post-OCR correction, improve downstream NLP tasks, and the value of providing socio-cultural context as part of the correction process.
arXiv Detail & Related papers (2024-08-30T17:26:05Z)
Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition [52.624909026294105]
We propose a non-autoregressive speech error correction method. A Confidence Module measures the uncertainty of each word of the N-best ASR hypotheses. The proposed system reduces the error rate by 21% compared with the ASR model.
arXiv Detail & Related papers (2024-06-29T17:56:28Z)
Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation [96.78845113346809]
Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks. This paper proposes SynCheck, a lightweight monitor that leverages fine-grained decoding dynamics to detect unfaithful sentences. We also introduce FOD, a faithfulness-oriented decoding algorithm guided by beam search for long-form retrieval-augmented generation.
arXiv Detail & Related papers (2024-06-19T16:42:57Z)
Cross-modal Active Complementary Learning with Self-refining Correspondence [54.61307946222386]
We propose a Cross-modal Robust Complementary Learning framework (CRCL) to improve the robustness of existing methods. ACL exploits active and complementary learning losses to reduce the risk of providing erroneous supervision. SCC utilizes multiple self-refining processes with momentum correction to enlarge the receptive field for correcting correspondences.
arXiv Detail & Related papers (2023-10-26T15:15:11Z)
Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification. We propose a risk-consistent approach to tackle this problem and show that the estimation error bound the optimal convergence rate. We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z)
Enhancing OCR Performance through Post-OCR Models: Adopting Glyph Embedding for Improved Correction [0.0]
The novelty of our approach lies in embedding the OCR output using CharBERT and our unique embedding technique, capturing the visual characteristics of characters. Our findings show that post-OCR correction effectively addresses deficiencies in inferior OCR models, and glyph embedding enables the model to achieve superior results.
arXiv Detail & Related papers (2023-08-29T12:41:50Z)
User-Centric Evaluation of OCR Systems for Kwak'wala [92.73847703011353]
We show that utilizing OCR reduces the time spent in the manual transcription of culturally valuable documents by over 50%. Our results demonstrate the potential benefits that OCR tools can have on downstream language documentation and revitalization efforts.
arXiv Detail & Related papers (2023-02-26T21:41:15Z)
iOCR: Informed Optical Character Recognition for Election Ballot Tallies [13.343515845758398]
iOCR was developed with a spell correction algorithm to fix errors introduced by conventional OCR for vote tabulation. The results found that the iOCR system outperforms conventional OCR techniques.
arXiv Detail & Related papers (2022-08-01T13:50:13Z)
Lights, Camera, Action! A Framework to Improve NLP Accuracy over OCR documents [2.6201102730518606]
We demonstrate an effective framework for mitigating OCR errors for any downstream NLP task. We first address the data scarcity problem for model training by constructing a document synthesis pipeline. For the benefit of the community, we have made the document synthesis pipeline available as an open-source project.
arXiv Detail & Related papers (2021-08-06T00:32:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.