HYCEDIS: HYbrid Confidence Engine for Deep Document Intelligence System
- URL: http://arxiv.org/abs/2206.02628v1
- Date: Wed, 1 Jun 2022 09:57:34 GMT
- Title: HYCEDIS: HYbrid Confidence Engine for Deep Document Intelligence System
- Authors: Bao-Sinh Nguyen, Quang-Bach Tran, Tuan-Anh Nguyen Dang, Duc Nguyen,
Hung Le
- Abstract summary: We propose a complete and novel architecture to measure the confidence of current deep learning models on the document information extraction task.
Our architecture consists of a Multi-modal Conformal Predictor and a Variational Cluster-oriented Anomaly Detector.
We evaluate our architecture on real-world datasets, not only outperforming competing confidence estimators by a large margin but also demonstrating generalization to out-of-distribution data.
- Score: 16.542137414609602
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Measuring the confidence of AI models is critical for safely deploying AI in
real-world industrial systems. One important application of confidence
measurement is information extraction from scanned documents. However, no
existing solution provides reliable confidence scores for current
state-of-the-art deep-learning-based information extractors. In this paper, we
propose a complete and novel architecture to measure the confidence of deep
learning models on the document information extraction task. Our architecture
consists of a Multi-modal Conformal Predictor and a Variational
Cluster-oriented Anomaly Detector, trained to faithfully estimate confidence
in the host model's outputs without requiring any modification of the host
model. We evaluate our architecture on real-world datasets, not only
outperforming competing confidence estimators by a large margin but also
demonstrating generalization to out-of-distribution data.
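The Multi-modal Conformal Predictor is built on conformal prediction, which converts a nonconformity score into a calibrated confidence value. Below is a minimal sketch of the standard split-conformal p-value computation, not the paper's multi-modal variant; the calibration scores are illustrative.

```python
import numpy as np

def conformal_confidence(cal_scores, test_score):
    """Split conformal prediction: turn a nonconformity score into a p-value.

    cal_scores: nonconformity scores from a held-out calibration set
    test_score: nonconformity score of a new prediction
    Returns a value in (0, 1]; higher means the prediction conforms better.
    """
    cal_scores = np.asarray(cal_scores)
    # Count calibration points at least as nonconforming as the test point.
    return (np.sum(cal_scores >= test_score) + 1) / (len(cal_scores) + 1)

# Illustrative usage: nonconformity = 1 - softmax probability of an extracted field.
calibration = [0.05, 0.10, 0.20, 0.15, 0.30, 0.08]
print(conformal_confidence(calibration, 0.12))  # ~0.57
```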
Related papers
- A Context-Aware Dual-Metric Framework for Confidence Estimation in Large Language Models [6.62851757612838]
Current confidence estimation methods for large language models (LLMs) neglect the relevance between responses and contextual information.
We propose CRUX, which integrates context faithfulness and consistency for confidence estimation via two novel metrics.
Experiments across three benchmark datasets demonstrate CRUX's effectiveness, achieving higher AUROC than existing baselines.
arXiv Detail & Related papers (2025-08-01T12:58:34Z)
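AUROC, the metric CRUX reports, measures how often a correct answer receives higher confidence than an incorrect one. A minimal sketch of this evaluation with illustrative labels and scores, assuming scikit-learn is available:

```python
from sklearn.metrics import roc_auc_score

# 1 = the model's answer was correct; confidences come from the estimator.
# All values are illustrative, not taken from the paper.
correct    = [1, 1, 0, 1, 0, 0, 1, 0]
confidence = [0.9, 0.8, 0.6, 0.7, 0.4, 0.3, 0.95, 0.55]

# AUROC: probability that a random correct answer outranks a random wrong one.
print(roc_auc_score(correct, confidence))  # 1.0: confidences separate perfectly here
```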
- Towards Trustworthy AI: Secure Deepfake Detection using CNNs and Zero-Knowledge Proofs [0.0]
TrustDefender is a framework that detects deepfake imagery in real-time extended reality (XR) streams.
Our work establishes a foundation for reliable AI in immersive and privacy-sensitive applications.
arXiv Detail & Related papers (2025-07-22T20:47:46Z)
- Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG [51.120170062795566]
We propose Divide-Then-Align (DTA) to endow RAG systems with the ability to respond with "I don't know" when a query falls outside their knowledge boundary.
DTA balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of retrieval-augmented systems.
arXiv Detail & Related papers (2025-05-27T08:21:21Z)
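A fixed confidence threshold captures the surface behavior DTA targets, although the paper learns the knowledge boundary through alignment rather than thresholding. A hypothetical sketch; `generate` and `estimate_confidence` are assumed callables, not the paper's API:

```python
def answer_or_abstain(query, generate, estimate_confidence, threshold=0.5):
    """Answer the query, or abstain with "I don't know" when confidence is low."""
    answer = generate(query)
    if estimate_confidence(query, answer) < threshold:
        return "I don't know"
    return answer

# Toy usage with stub callables standing in for a RAG system.
print(answer_or_abstain("capital of France?",
                        generate=lambda q: "Paris",
                        estimate_confidence=lambda q, a: 0.9))  # Paris
```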
- Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification [3.141006099594433]
We propose a novel methodology, termed attention head masking (AHM), for multi-modal OOD tasks in document classification systems.
Our empirical results demonstrate that the proposed AHM method outperforms all state-of-the-art approaches.
To address the scarcity of high-quality publicly available document datasets, we introduce FinanceDocs, a new document AI dataset.
arXiv Detail & Related papers (2024-08-20T23:30:00Z)
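As a schematic of the head-masking idea behind AHM: suppress selected attention heads and observe how the model's behavior shifts. Only the masking step is sketched here; the paper's OOD scoring built on top of it is not reproduced.

```python
import numpy as np

def mask_attention_heads(head_outputs, masked_heads):
    """Zero out selected heads' outputs before the output projection.

    head_outputs: array of shape (num_heads, seq_len, head_dim)
    masked_heads: indices of the heads to suppress
    """
    out = head_outputs.copy()
    out[list(masked_heads)] = 0.0
    return out

# Toy check: 4 heads, sequence length 2, head dim 3; mask heads 1 and 3.
h = np.random.rand(4, 2, 3)
print(mask_attention_heads(h, [1, 3]).sum(axis=(1, 2)))  # entries 1 and 3 are 0
```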
- Confidence-Aware Sub-Structure Beam Search (CABS): Mitigating Hallucination in Structured Data Generation with Large Language Models [6.099774114286838]
Confidence estimation methods for Large Language Models (LLMs) primarily focus on confidence at the level of individual tokens or of the entire output sequence.
We propose Confidence-Aware sub-structure Beam Search (CABS), a novel decoding method operating at the sub-structure level in structured data generation.
Results show that CABS outperforms traditional token-level beam search for structured data generation, improving recall at 90% precision by 16.7% on average on the problem of product attribute generation.
arXiv Detail & Related papers (2024-05-30T18:21:05Z)
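A simplified sketch of decoding at the sub-structure level in the spirit of CABS: candidate values for each slot are expanded and pruned by a confidence score. The additive toy scorer is an assumption for illustration, not the paper's confidence model.

```python
import heapq

def confidence_beam_search(candidates_per_slot, score, beam_width=3):
    """Beam search over sub-structures such as attribute-value pairs.

    candidates_per_slot: one candidate list per slot of the structure
    score(partial): confidence of a partial structure (higher is better)
    """
    beams = [()]
    for candidates in candidates_per_slot:
        expanded = [beam + (c,) for beam in beams for c in candidates]
        beams = heapq.nlargest(beam_width, expanded, key=score)
    return beams[0]  # highest-confidence complete structure

# Toy product-attribute example with an additive confidence model.
slots = [["red", "blue"], ["cotton", "wool"]]
conf = {"red": 0.9, "blue": 0.4, "cotton": 0.7, "wool": 0.8}
print(confidence_beam_search(slots, lambda s: sum(conf[c] for c in s)))  # ('red', 'wool')
```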
- Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models [14.5291643644017]
We introduce the concept of Confidence-Probability Alignment.
We probe the alignment between models' internal and expressed confidence.
Among the models analyzed, OpenAI's GPT-4 showed the strongest confidence-probability alignment.
arXiv Detail & Related papers (2024-05-25T15:42:04Z)
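Confidence-probability alignment can be quantified as the rank correlation between the probability the model assigns internally and the confidence it states when asked. A minimal sketch with illustrative values, assuming SciPy; the paper's probing protocol is more elaborate.

```python
from scipy.stats import spearmanr

# Internal: probability the model assigned to its answer tokens.
# Expressed: the confidence the model verbalizes for the same answers.
internal  = [0.92, 0.85, 0.40, 0.70, 0.55]
expressed = [0.90, 0.80, 0.50, 0.75, 0.60]

rho, _ = spearmanr(internal, expressed)
print(rho)  # 1.0 here: internal and expressed confidences rank identically
```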
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
Elaborating on the robustness metric, a model is judged robust only if its performance is consistently accurate across every example in a clique.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
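A plain reading of the clique-level criterion above: a model earns credit on a clique only when it answers every member correctly. A minimal sketch; `is_correct` is an assumed callable, not the benchmark's interface.

```python
def clique_robustness(cliques, is_correct):
    """Fraction of cliques on which the model is correct for every member."""
    robust = sum(all(is_correct(ex) for ex in clique) for clique in cliques)
    return robust / len(cliques)

# Two paraphrase cliques; the model slips on one member of the second.
preds = {"q1a": True, "q1b": True, "q2a": True, "q2b": False}
print(clique_robustness([["q1a", "q1b"], ["q2a", "q2b"]], preds.get))  # 0.5
```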
- A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition [74.79785063365289]
Existing models for named entity recognition (NER) are mainly based on large-scale labeled datasets.
We propose a Confidence-based Partial Label Learning (CPLL) method to integrate the prior confidence (given by annotators) and posterior confidences (learned by models) for crowd-annotated NER.
arXiv Detail & Related papers (2023-05-21T15:31:23Z)
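A schematic of the integration idea behind CPLL: blend the annotator-given prior confidence with the model-learned posterior. The fixed interpolation weight is purely illustrative; the paper learns this integration instead.

```python
def combined_confidence(prior, posterior, alpha=0.5):
    """Interpolate annotator prior confidence with the model posterior."""
    return alpha * prior + (1 - alpha) * posterior

# A token the annotators label PER with confidence 0.6; model posterior 0.9.
print(combined_confidence(0.6, 0.9))  # 0.75
```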
- Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients.
However, unreliable clients could upload low-quality models to the aggregation server, leading to a degradation or even a collapse of training.
We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
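A simple defensive-aggregation sketch in the spirit of the proposed mechanism: the server validates each client's update and averages only those that pass. The validation scores and threshold are assumptions, not the paper's design.

```python
import numpy as np

def aggregate_reliable(updates, val_scores, min_score=0.5):
    """Average client updates, dropping clients that fail a validation check.

    updates: flattened parameter vectors uploaded by clients
    val_scores: server-side validation accuracy of each client's update
    """
    kept = [u for u, s in zip(updates, val_scores) if s >= min_score]
    if not kept:
        raise ValueError("no client passed the reliability check")
    return np.mean(kept, axis=0)

# Three clients; the third update is filtered out as unreliable.
updates = [np.array([1.0, 2.0]), np.array([1.2, 1.8]), np.array([9.0, -9.0])]
print(aggregate_reliable(updates, val_scores=[0.8, 0.7, 0.1]))  # [1.1 1.9]
```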
- An evaluation of word-level confidence estimation for end-to-end automatic speech recognition [70.61280174637913]
We investigate confidence estimation for end-to-end automatic speech recognition (ASR).
We provide an extensive benchmark of popular confidence methods on four well-known speech datasets.
Our results suggest a strong baseline can be obtained by scaling the logits by a learnt temperature.
arXiv Detail & Related papers (2021-01-14T09:51:59Z)
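The recommended baseline is easy to state: divide the logits by a learnt temperature before the softmax and use the maximum probability as the word confidence. A minimal sketch; the temperature is normally fit on held-out data and is fixed here only for illustration.

```python
import numpy as np

def temperature_scaled_confidence(logits, T):
    """Max softmax probability after scaling logits by temperature T.

    T > 1 softens overconfident distributions; T is learnt in practice.
    """
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()                        # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return p.max()

logits = [4.0, 1.0, 0.5]
print(temperature_scaled_confidence(logits, T=1.0))  # ~0.93, overconfident
print(temperature_scaled_confidence(logits, T=2.0))  # ~0.72, softened
```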
- Confidence Estimation via Auxiliary Models [47.08749569008467]
We introduce a novel target criterion for model confidence, namely the true class probability (TCP).
We show that TCP offers better properties for confidence estimation than the standard maximum class probability (MCP).
arXiv Detail & Related papers (2020-12-11T17:21:12Z)
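The MCP/TCP distinction is one line each: MCP is the largest softmax probability, while TCP is the probability assigned to the ground-truth class, which an auxiliary model must learn to predict since the true class is unknown at test time. Illustrative values:

```python
import numpy as np

probs = np.array([0.55, 0.30, 0.15])  # softmax output, illustrative
true_class = 1                        # index of the ground-truth class

mcp = probs.max()        # maximum class probability: the usual confidence score
tcp = probs[true_class]  # true class probability: the paper's target criterion

print(mcp, tcp)  # 0.55 vs 0.30: MCP is overconfident on this misclassified sample
```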
- Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, limited ability to explain decisions, and bias in the training data are some of the most prominent limitations of current AI systems.
We propose a tutorial on Trustworthy AI that addresses six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z)