Related papers: POLARIS: Typed Planning and Governed Execution for Agentic AI in Back-Office Automation

POLARIS: Typed Planning and Governed Execution for Agentic AI in Back-Office Automation

URL: http://arxiv.org/abs/2601.11816v1
Date: Fri, 16 Jan 2026 22:38:21 GMT
Title: POLARIS: Typed Planning and Governed Execution for Agentic AI in Back-Office Automation
Authors: Zahra Moslemi, Keerthi Koneru, Yen-Ting Lee, Sheethal Kumar, Ramesh Radhakrishnan,
Abstract summary: POLARIS is a governed orchestration framework that treats automation as typed plan synthesis and validated execution over LLM agents.<n> Empirically, POLARIS achieves a micro F1 of 0.81 on the SROIE dataset and, on a controlled synthetic suite, achieves 0.95 to 1.00 precision for anomaly routing with preserved audit trails.
Score: 0.28055179094637683
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Enterprise back office workflows require agentic systems that are auditable, policy-aligned, and operationally predictable, capabilities that generic multi-agent setups often fail to deliver. We present POLARIS (Policy-Aware LLM Agentic Reasoning for Integrated Systems), a governed orchestration framework that treats automation as typed plan synthesis and validated execution over LLM agents. A planner proposes structurally diverse, type checked directed acyclic graphs (DAGs), a rubric guided reasoning module selects a single compliant plan, and execution is guarded by validator gated checks, a bounded repair loop, and compiled policy guardrails that block or route side effects before they occur. Applied to document centric finance tasks, POLARIS produces decision grade artifacts and full execution traces while reducing human intervention. Empirically, POLARIS achieves a micro F1 of 0.81 on the SROIE dataset and, on a controlled synthetic suite, achieves 0.95 to 1.00 precision for anomaly routing with preserved audit trails. These evaluations constitute an initial benchmark for governed Agentic AI. POLARIS provides a methodological and benchmark reference for policy-aligned Agentic AI. Keywords Agentic AI, Enterprise Automation, Back-Office Tasks, Benchmarks, Governance, Typed Planning, Evaluation

Related papers

ALRM: Agentic LLM for Robotic Manipulation [3.7473235317736058]
Large Language Models (LLMs) have recently empowered agentic frameworks to exhibit advanced reasoning and planning capabilities.<n>Large Language Models (LLMs) have recently empowered agentic frameworks to exhibit advanced reasoning and planning capabilities.
arXiv Detail & Related papers (2026-01-27T11:54:14Z)
AgentGuardian: Learning Access Control Policies to Govern AI Agent Behavior [20.817336331051752]
AgentGuardian governs and protects AI agent operations by enforcing context-aware access-control policies.<n>It effectively detects malicious or misleading inputs while preserving normal agent functionality.
arXiv Detail & Related papers (2026-01-15T14:33:36Z)
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem [90.17610617854247]
We introduce the Agentic Learning Ecosystem (ALE), a foundational infrastructure that optimize the production pipeline for agentic model.<n>ALE consists of three components: ROLL, a post-training framework for weight optimization; ROCK, a sandbox environment manager for trajectory generation; and iFlow CLI, an agent framework for efficient context engineering.<n>We release ROME, an open-source agent grounded by ALE and trained on over one million trajectories.
arXiv Detail & Related papers (2025-12-31T14:03:39Z)
Alita-G: Self-Evolving Generative Agent for Agent Generation [54.49365835457433]
We present ALITA-G, a framework that transforms a general-purpose agent into a domain expert.<n>In this framework, a generalist agent executes a curated suite of target-domain tasks.<n>It attains strong gains while reducing computation costs.
arXiv Detail & Related papers (2025-10-27T17:59:14Z)
Are Agents Just Automata? On the Formal Equivalence Between Agentic AI and the Chomsky Hierarchy [4.245979127318219]
This paper establishes a formal equivalence between the architectural classes of modern agentic AI systems and the abstract machines of the hierarchy.<n>We demonstrate that simple reflex agents are equivalent to Finite Automata, hierarchical task-decomposition agents are equivalent to Pushdown Automata, and agents employing readable/writable memory for reflection are equivalent to TMs.
arXiv Detail & Related papers (2025-10-27T16:22:02Z)
What Is Your Agent's GPA? A Framework for Evaluating Agent Goal-Plan-Action Alignment [3.5583478152586756]
Agent GPA is an evaluation paradigm based on an agent's operational loop of setting goals, devising plans, and executing actions.<n>The framework includes five evaluation metrics: Goal Fulfillment, Logical Consistency, Execution Efficiency, Plan Quality, and Plan Adherence.
arXiv Detail & Related papers (2025-10-09T22:40:19Z)
Policy-as-Prompt: Turning AI Governance Rules into Guardrails for AI Agents [0.19336815376402716]
We introduce a regulatory machine learning framework that converts unstructured design artifacts (like PRDs, TDDs, and code) into verifiable runtime guardrails.<n>Our Policy as Prompt method reads these documents and risk controls to build a source-linked policy tree.<n>System is built to enforce least privilege and data minimization.
arXiv Detail & Related papers (2025-09-28T17:36:52Z)
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z)
HADA: Human-AI Agent Decision Alignment Architecture [0.0]
HADA is a protocol- and framework reference architecture that keeps both large language model (LLM) agents and legacy algorithms aligned with organizational targets and values.<n>Technical and non-technical actors can query, steer, audit, or contest every decision across strategic, tactical, and real-time horizons.
arXiv Detail & Related papers (2025-06-01T14:04:52Z)
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align Large Language models.<n>Controlled Decoding provides a mechanism for aligning a model at inference time without retraining.<n>We propose a mixture of agent-based decoding strategies leveraging the existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose AOP, a novel framework for agent-oriented planning in multi-agent systems.<n>In this study, we identify three critical design principles of agent-oriented planning, including solvability, completeness, and non-redundancy.<n> Extensive experiments demonstrate the advancement of AOP in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems.
arXiv Detail & Related papers (2024-10-03T04:07:51Z)
AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA. It does not rely on large-scale annotated data and synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.