Related papers: An agentic system with reinforcement-learned subsystem improvements for parsing form-like documents

URL: http://arxiv.org/abs/2505.13504v1
Date: Fri, 16 May 2025 09:46:10 GMT
Title: An agentic system with reinforcement-learned subsystem improvements for parsing form-like documents
Authors: Ayesha Amjad, Saurav Sthapit, Tahir Qasim Syed
Abstract summary: We propose an agentic AI system that leverages Large Language Model (LLM) agents and a reinforcement learning driver agent to automate consistent, self-improving extraction. Our work highlights the limitations of monolithic LLM-based extraction and introduces a modular, multi-agent framework with task-specific prompts. This self-corrective adaptive system handles diverse documents, file formats, layouts, and LLMs, aiming to automate accurate information extraction without the need for human intervention.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Extracting alphanumeric data from form-like documents such as invoices, purchase orders, bills, and financial documents is often performed via vision (OCR) and learning algorithms or monolithic pipelines with limited potential for systemic improvements. We propose an agentic AI system that leverages Large Language Model (LLM) agents and a reinforcement learning (RL) driver agent to automate consistent, self-improving extraction under LLM inference uncertainty. Our work highlights the limitations of monolithic LLM-based extraction and introduces a modular, multi-agent framework with task-specific prompts and an RL policy of rewards and penalties to guide a meta-prompting agent to learn from past errors and improve prompt-based actor agents. This self-corrective adaptive system handles diverse documents, file formats, layouts, and LLMs, aiming to automate accurate information extraction without the need for human intervention. Results reported on two benchmark datasets, SROIE and CORD, are promising for the agentic AI framework.

Related papers

Agent0: Leveraging LLM Agents to Discover Multi-value Features from Text for Enhanced Recommendations [0.0]
Large language models (LLMs) and their associated agent-based frameworks have significantly advanced automated information extraction. This paper presents Agent0, an agent-based system designed to automate information extraction and feature construction from raw, unstructured text.
arXiv Detail & Related papers (2025-07-25T06:45:10Z)
SI-Agent: An Agentic Framework for Feedback-Driven Generation and Tuning of Human-Readable System Instructions for Large Language Models [0.0]
System Instructions (SIs) are pivotal for guiding Large Language Models (LLMs). Existing automated methods frequently generate non-human-readable "soft prompts," sacrificing interpretability. This paper introduces SI-Agent, a novel agentic framework designed to automatically generate and iteratively refine human-readable SIs.
arXiv Detail & Related papers (2025-07-03T23:44:50Z)
Agent-UniRAG: A Trainable Open-Source LLM Agent Framework for Unified Retrieval-Augmented Generation Systems [4.683612295430957]
This paper presents a novel approach for unified retrieval-augmented generation (RAG) systems using the recently emerging large language model (LLM) agent concept. We propose a trainable agent framework called Agent-UniRAG for unified retrieval-augmented LLM systems. The main idea is to design an LLM agent framework to solve RAG tasks step-by-step based on the complexity of the inputs.
arXiv Detail & Related papers (2025-05-28T16:46:31Z)
MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability [106.35604230971396]
Recent advancements in Agent techniques enable Large Language Models (LLMs) to autonomously utilize tools for retrieval, planning, and reasoning. To further enhance the universal search capability of agents, we propose a novel pre-training framework, MaskSearch. In the pre-training stage, we introduce the Retrieval Augmented Mask Prediction (RAMP) task, where the model learns to leverage search tools to fill masked spans. After that, the model is trained on downstream tasks to achieve further improvement.
arXiv Detail & Related papers (2025-05-26T17:58:50Z)
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios [51.46347732659174]
Large Language Models (LLMs) have demonstrated advanced capabilities in real-world agentic applications. AgentIF is the first benchmark for systematically evaluating LLM instruction following ability in agentic scenarios.
arXiv Detail & Related papers (2025-05-22T17:31:10Z)
Agent-Enhanced Large Language Models for Researching Political Institutions [0.0]
This paper demonstrates how Large Language Models (LLMs) can serve as dynamic agents capable of streamlining tasks. Central to this approach is agentic retrieval-augmented generation (Agentic RAG). To demonstrate the potential of this approach, we introduce CongressRA, an LLM agent designed to support scholars studying the U.S. Congress.
arXiv Detail & Related papers (2025-03-14T22:04:40Z)
Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
LLM-AutoDiff: Auto-Differentiate Any LLM Workflow [58.56731133392544]
We introduce LLM-AutoDiff: a novel framework for Automatic Prompt Engineering (APE). LLM-AutoDiff treats each textual input as a trainable parameter and uses a frozen backward engine to generate feedback akin to textual gradients. It consistently outperforms existing textual gradient baselines in both accuracy and training cost.
arXiv Detail & Related papers (2025-01-28T03:18:48Z)
LatteReview: A Multi-Agent Framework for Systematic Review Automation Using Large Language Models [0.0]
LatteReview is a Python-based framework that leverages large language models (LLMs) and multi-agent systems to automate key elements of the systematic review process. The framework supports features such as Retrieval-Augmented Generation (RAG) for incorporating external context, multimodal reviews, Pydantic-based validation for structured inputs and outputs, and asynchronous programming for handling large-scale datasets.
arXiv Detail & Related papers (2025-01-05T17:53:00Z)
AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline. Recent works have started exploiting large language models (LLM) to lessen such burden. This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications [7.286515881369693]
This paper systematically investigates the capacity of Large Language Models (LLMs) to repair declarative specifications in Alloy. We designed 12 different repair settings, encompassing single-agent and dual-agent paradigms, utilizing various LLMs. Our study reveals that the dual-agent setup with auto-prompting outperforms the other settings, albeit with a marginal increase in the number of iterations and token usage.
arXiv Detail & Related papers (2024-04-17T03:46:38Z)
Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations [53.76682562935373]
We introduce an efficient framework called InteRecAgent, which employs LLMs as the brain and recommender models as tools. InteRecAgent achieves satisfying performance as a conversational recommender system, outperforming general-purpose LLMs.
arXiv Detail & Related papers (2023-08-31T07:36:44Z)
Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes [54.13559879916708]
EVAPORATE is a prototype system powered by large language models (LLMs). Code synthesis is cheap, but far less accurate than directly processing each document with the LLM. We propose an extended code implementation, EVAPORATE-CODE+, which achieves better quality than direct extraction.
arXiv Detail & Related papers (2023-04-19T06:00:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.