An Efficient Deep Learning-Based Approach to Automating Invoice Document Validation
- URL: http://arxiv.org/abs/2503.12267v1
- Date: Sat, 15 Mar 2025 21:33:00 GMT
- Title: An Efficient Deep Learning-Based Approach to Automating Invoice Document Validation
- Authors: Aziz Amari, Mariem Makni, Wissal Fnaich, Akram Lahmar, Fedi Koubaa, Oumayma Charrad, Mohamed Ali Zormati, Rabaa Youssef Douss,
- Abstract summary: We propose to automate the validation of machine written invoices using document layout analysis and object detection techniques.<n>We introduce a novel dataset consisting of manually annotated real-world invoices and a multi-criteria validation process.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In large organizations, the number of financial transactions can grow rapidly, driving the need for fast and accurate multi-criteria invoice validation. Manual processing remains error-prone and time-consuming, while current automated solutions are limited by their inability to support a variety of constraints, such as documents that are partially handwritten or photographed with a mobile phone. In this paper, we propose to automate the validation of machine written invoices using document layout analysis and object detection techniques based on recent deep learning (DL) models. We introduce a novel dataset consisting of manually annotated real-world invoices and a multi-criteria validation process. We fine-tune and benchmark the most relevant DL models on our dataset. Experimental results show the effectiveness of the proposed pipeline and selected DL models in terms of achieving fast and accurate validation of invoices.
Related papers
- PaperAudit-Bench: Benchmarking Error Detection in Research Papers for Critical Automated Peer Review [54.141490756509306]
We introduce PaperAudit-Bench, which consists of two components: PaperAudit-Dataset, an error dataset, and PaperAudit-Review, an automated review framework.<n>Experiments on PaperAudit-Bench reveal large variability in error detectability across models and detection depths.<n>We show that the dataset supports training lightweight LLM detectors via SFT and RL, enabling effective error detection at reduced computational cost.
arXiv Detail & Related papers (2026-01-07T04:26:12Z) - Synthetic Data-Driven Prompt Tuning for Financial QA over Tables and Documents [21.737958911422805]
We introduce a self-improving prompt framework driven by data-augmented optimization.<n>We generate synthetic financial tables and document excerpts, verify their correctness and robustness then update the prompt based on the results.<n>We achieve higher performance in both accuracy and robustness than standard prompt methods.
arXiv Detail & Related papers (2025-11-09T09:07:31Z) - FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models [79.41859481668618]
Large Language Models (LLMs) have significantly advanced the fact-checking studies.
Existing automated fact-checking evaluation methods rely on static datasets and classification metrics.
We introduce FACT-AUDIT, an agent-driven framework that adaptively and dynamically assesses LLMs' fact-checking capabilities.
arXiv Detail & Related papers (2025-02-25T07:44:22Z) - M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework [75.95430061891828]
We introduce M-LongDoc, a benchmark of 851 samples, and an automated framework to evaluate the performance of large multimodal models.
We propose a retrieval-aware tuning approach for efficient and effective multimodal document reading.
arXiv Detail & Related papers (2024-11-09T13:30:38Z) - Document Classification using File Names [7.130525292849283]
Rapid document classification is critical in several time-sensitive applications like digital forensics and large-scale media classification.<n>Traditional approaches that rely on heavy-duty deep learning models fall short due to high inference times over vast input datasets and computational resources associated with analyzing whole documents.<n>We present a method using lightweight supervised learning models, combined with a TF-IDF feature extraction-based tokenization method, to accurately and efficiently classify documents based solely on file names.
arXiv Detail & Related papers (2024-10-02T01:42:19Z) - MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking [0.283600654802951]
We present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal datasets.
We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths.
Our approach outperforms the SOTA approach by 4.6% in the claim verification task on the MOCHEG dataset.
arXiv Detail & Related papers (2024-07-18T01:33:20Z) - OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs [27.89053798151106]
OpenFactCheck is a framework for building customized automatic fact-checking systems.
It allows users to easily customize an automatic fact-checker and verify the factual correctness of documents and claims.
CheckerEVAL is a solution for gauging the reliability of automatic fact-checkers' verification results using human-annotated datasets.
arXiv Detail & Related papers (2024-05-09T07:15:19Z) - TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios [51.66718740300016]
TableLLM is a robust large language model (LLM) with 8 billion parameters.
TableLLM is purpose-built for proficiently handling data manipulation tasks.
We have released the model checkpoint, source code, benchmarks, and a web application for user interaction.
arXiv Detail & Related papers (2024-03-28T11:21:12Z) - Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers [121.53749383203792]
We present a holistic end-to-end solution for annotating the factuality of large language models (LLMs)-generated responses.
We construct an open-domain document-level factuality benchmark in three-level granularity: claim, sentence and document.
Preliminary experiments show that FacTool, FactScore and Perplexity are struggling to identify false claims.
arXiv Detail & Related papers (2023-11-15T14:41:57Z) - Multimodal Document Analytics for Banking Process Automation [4.541582055558865]
The paper contributes original empirical evidence on the effectiveness and efficiency of multi-model models for document processing in the banking business.
It offers practical guidance on how to unlock this potential in day-to-day operations.
arXiv Detail & Related papers (2023-07-21T18:29:04Z) - Radically Lower Data-Labeling Costs for Visually Rich Document
Extraction Models [13.16696804867477]
We propose Selective Labeling to simplify the labeling task.
We show through experiments that selective labeling can reduce the cost of acquiring labeled data by $10times$ with a negligible loss in accuracy.
arXiv Detail & Related papers (2022-10-28T20:10:16Z) - GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidences in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z) - Generating Fact Checking Explanations [52.879658637466605]
A crucial piece of the puzzle that is still missing is to understand how to automate the most elaborate part of the process.
This paper provides the first study of how these explanations can be generated automatically based on available claim context.
Our results indicate that optimising both objectives at the same time, rather than training them separately, improves the performance of a fact checking system.
arXiv Detail & Related papers (2020-04-13T05:23:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.