Related papers: AgenticIE: An Adaptive Agent for Information Extraction from Complex Regulatory Documents

IDP Accelerator: Agentic Document Intelligence from Extraction to Compliance Validation [3.539467892338473]
We present IDP (Intelligent Document Processing) Accelerator, a framework enabling agentic AI for end-to-end document intelligence. The interactive demonstration enables users to upload document packets, visualize classification results, and explore extracted data.
arXiv Detail & Related papers (2026-02-26T20:20:38Z)
The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution [63.61358761489141]
Large Language Model (LLM)-based agents are widely used in real-world applications such as customer service, web navigation, and software engineering. We propose a novel framework for general agentic attribution, designed to identify the internal factors driving agent actions regardless of the task outcome. We validate our framework across a diverse suite of agentic scenarios, including standard tool use and subtle reliability risks like memory-induced bias.
arXiv Detail & Related papers (2026-01-21T15:22:21Z)
MonkeyOCR v1.5 Technical Report: Unlocking Robust Document Parsing for Complex Patterns [80.05126590825121]
MonkeyOCR v1.5 is a unified vision-language framework that enhances both layout understanding and content recognition. To address complex table structures, we propose a visual consistency-based reinforcement learning scheme. Two specialized modules, Image-Decoupled Table Parsing and Type-Guided Table Merging, are introduced to enable reliable parsing of tables.
arXiv Detail & Related papers (2025-11-13T15:12:17Z)
Hybrid OCR-LLM Framework for Enterprise-Scale Document Information Extraction Under Copy-heavy Task [11.672798725644121]
This work strategically combines OCR engines with Large Language Models (LLMs) to optimize the accuracy-efficiency trade-off inherent in repetitive document extraction tasks. We implement and evaluate 25 configurations across three extraction paradigms (direct, replacement, and table-based) on identity documents spanning four formats.
arXiv Detail & Related papers (2025-10-11T09:40:34Z)
MMESGBench: Pioneering Multimodal Understanding and Complex Reasoning Benchmark for ESG Tasks [56.350173737493215]
Environmental, Social, and Governance (ESG) reports are essential for evaluating sustainability practices, ensuring regulatory compliance, and promoting financial transparency. MMESGBench is a first-of-its-kind benchmark dataset to evaluate multimodal understanding and complex reasoning across structurally diverse and multi-source ESG documents. MMESGBench comprises 933 validated QA pairs derived from 45 ESG documents, spanning seven distinct document types and three major ESG source categories.
arXiv Detail & Related papers (2025-07-25T03:58:07Z)
From Chaos to Automation: Enabling the Use of Unstructured Data for Robotic Process Automation [0.6144680854063939]
The UNstructured Document REtrieval SyStem (UNDRESS) is a system that uses fuzzy regular expressions, techniques for natural language processing, and large language models to enable RPA platforms to effectively retrieve information from unstructured documents. The results demonstrate the effectiveness of UNDRESS in enhancing RPA capabilities for unstructured data, providing a significant advancement in the field.
arXiv Detail & Related papers (2025-07-15T14:32:49Z)
eSapiens: A Real-World NLP Framework for Multimodal Document Understanding and Enterprise Knowledge Processing [6.450269621190948]
We introduce eSapiens, a unified question-answering system designed for enterprise settings. eSapiens bridges structured databases and unstructured corpora via a dual-module architecture. We evaluate eSapiens on the RAGTruth benchmark, analyzing performance across key dimensions such as completeness, hallucination, and context utilization.
arXiv Detail & Related papers (2025-06-20T06:07:20Z)
Structuring the Unstructured: A Multi-Agent System for Extracting and Querying Financial KPIs and Guidance [54.25184684077833]
We propose an efficient and scalable method for extracting quantitative insights from unstructured financial documents. Our proposed system consists of two specialized agents: the Extraction Agent and the Text-to-Agent
arXiv Detail & Related papers (2025-05-25T15:45:46Z)
An agentic system with reinforcement-learned subsystem improvements for parsing form-like documents [0.0]
We propose an agentic AI system that leverages Large Language Model (LLM) agents and a reinforcement learning driver agent to automate consistent, self-improving extraction. Our work highlights the limitations of monolithic LLM-based extraction and introduces a modular, multi-agent framework with task-specific prompts. This self-corrective adaptive system handles diverse documents, file formats, layouts, and LLMs, aiming to automate accurate information extraction without the need for human intervention.
arXiv Detail & Related papers (2025-05-16T09:46:10Z)
QID: Efficient Query-Informed ViTs in Data-Scarce Regimes for OCR-free Visual Document Understanding [53.69841526266547]
Fine-tuning a pre-trained Vision-Language Model with new datasets often falls short in optimizing the vision encoder. We introduce QID, a novel, streamlined, architecture-preserving approach that integrates query embeddings into the vision encoder.
arXiv Detail & Related papers (2025-04-03T18:47:16Z)
SOPBench: Evaluating Language Agents at Following Standard Operating Procedures and Constraints [59.645885492637845]
SOPBench is an evaluation pipeline that transforms each service-specific SOP code program into a directed graph of executable functions and requires agents to call these functions based on natural language SOP descriptions. We evaluate 18 leading models, and results show the task is challenging even for top-tier models.
arXiv Detail & Related papers (2025-03-11T17:53:02Z)
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models [58.45517851437422]
Visually-situated text parsing (VsTP) has recently seen notable advancements, driven by the growing demand for automated document understanding. Existing solutions often rely on task-specific architectures and objectives for individual tasks. In this paper, we introduce OmniParser V2, a universal model that unifies typical VsTP tasks, including text spotting, key information extraction, table recognition, and layout analysis.
arXiv Detail & Related papers (2025-02-22T09:32:01Z)
GENIE: Generative Note Information Extraction model for structuring EHR data [14.057531175321113]
We introduce GENIE, a Generative Note Information Extraction system. GENIE processes entire paragraphs in a single pass, extracting entities, assertion statuses, locations, modifiers, values, and purposes with high accuracy. Using a robust data preparation pipeline and fine-tuned small-scale LLMs, GENIE achieves competitive performance across multiple information extraction tasks.
arXiv Detail & Related papers (2025-01-30T15:42:24Z)
Document Structure in Long Document Transformers [64.76981299465885]
Long documents often exhibit structure with hierarchically organized elements of different functions, such as section headers and paragraphs. Despite the omnipresence of document structure, its role in natural language processing (NLP) remains opaque. Do long-document Transformer models acquire an internal representation of document structure during pre-training? How can structural information be communicated to a model after pre-training, and how does it influence downstream performance?
arXiv Detail & Related papers (2024-01-31T08:28:06Z)
Validation of Rigorous Requirements Specifications and Document Automation with the ITLingo RSL Language [0.0]
The ITLingo initiative has introduced a requirements specification language named RSL to enhance the rigor and consistency of technical documentation. This paper reviews existing research and tools in the fields of requirements validation and document automation. We propose to extend RSL with validation of specifications based on customized checks and on linguistic rules dynamically defined in RSL itself.
arXiv Detail & Related papers (2023-12-17T21:39:26Z)
Instruct and Extract: Instruction Tuning for On-Demand Information Extraction [86.29491354355356]
On-Demand Information Extraction aims to fulfill the personalized demands of real-world users. We present a benchmark named InstructIE, which includes both automatically generated training data and a human-annotated test set. Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE.
arXiv Detail & Related papers (2023-10-24T17:54:25Z)
Unified Structure Generation for Universal Information Extraction [58.89057387608414]
UIE can universally model different IE tasks, adaptively generate targeted structures, and collaboratively learn general IE abilities from different knowledge sources. Experiments show that UIE achieved state-of-the-art performance on 4 IE tasks and 13 datasets, in all supervised, low-resource, and few-shot settings.
arXiv Detail & Related papers (2022-03-23T08:49:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.