Transduction is All You Need for Structured Data Workflows
- URL: http://arxiv.org/abs/2508.15610v2
- Date: Mon, 29 Sep 2025 13:42:12 GMT
- Title: Transduction is All You Need for Structured Data Workflows
- Authors: Alfio Gliozzo, Naweed Khan, Christodoulos Constantinides, Nandana Mihindukulasooriya, Nahuel Defosse, Gaetano Rossiello, Junkyu Lee,
- Abstract summary: This paper introduces Agentics, a functional agentic AI framework for building structured data workflow pipelines. Designed for both research and practical applications, Agentics offers a new data-centric paradigm in which agents are embedded within data types. We present a range of structured data workflow tasks and empirical evidence demonstrating the effectiveness of this approach.
- Score: 8.178153196011028
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces Agentics, a functional agentic AI framework for building LLM-based structured data workflow pipelines. Designed for both research and practical applications, Agentics offers a new data-centric paradigm in which agents are embedded within data types, enabling logical transduction between structured states. This design shifts the focus toward principled data modeling, providing a declarative language where data types are directly exposed to large language models and composed through transductions triggered by type connections. We present a range of structured data workflow tasks and empirical evidence demonstrating the effectiveness of this approach, including data wrangling, text-to-SQL semantic parsing, and domain-specific multiple-choice question answering. The open source Agentics is available at https://github.com/IBM/Agentics.
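The abstract's central idea is that a data type itself drives the LLM call: the target type's fields define what must be produced, and a "transduction" fills a typed target state from a typed source state. The sketch below illustrates that pattern only; it is not the Agentics API. All names here (`transduce`, `llm_fill`, `Ticket`, `Triage`) are hypothetical, and the LLM call is replaced by a deterministic stub so the example runs standalone.

```python
# Conceptual sketch of logical transduction between typed states.
# NOTE: this is NOT the Agentics API; all names are hypothetical.
from dataclasses import dataclass, fields


@dataclass
class Ticket:          # source state
    subject: str
    body: str


@dataclass
class Triage:          # target state: the type the "agent" must populate
    category: str
    priority: str


def llm_fill(prompt: str, field_names: list[str]) -> dict:
    """Stand-in for an LLM call that returns one value per target field.
    A real implementation would prompt a model with the target schema."""
    # Deterministic placeholder logic so the sketch is runnable:
    return {"category": "billing" if "invoice" in prompt else "general",
            "priority": "high" if "urgent" in prompt else "normal"}


def transduce(source, target_type):
    """Map a typed source state onto a typed target state.
    The target type's field list defines what must be produced."""
    prompt = " ".join(str(getattr(source, f.name)) for f in fields(source))
    values = llm_fill(prompt, [f.name for f in fields(target_type)])
    return target_type(**values)


t = transduce(Ticket("invoice problem", "urgent: double charge"), Triage)
print(t)  # Triage(category='billing', priority='high')
```

The point of the pattern is that composing pipelines reduces to connecting types: any state of type `Triage` can feed a further transduction whose target type declares the next stage's output.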
Related papers
- Agentics 2.0: Logical Transduction Algebra for Agentic Data Workflows [3.0955233217110045]
We present Agentics 2.0, a lightweight, Python-native framework for building high-quality, structured, explainable, and type-safe agentic data workflows. At the core of Agentics 2.0, a logical algebra formalizes a large language model inference call as a typed semantic transformation. The proposed framework provides semantic reliability through strong typing, semantic observability, and evidence tracing.
arXiv Detail & Related papers (2026-03-04T16:30:01Z)
- DataJoint 2.0: A Computational Substrate for Agentic Scientific Workflows [0.0]
DataJoint creates a substrate for SciOps where agents can participate in scientific transformations without risking data corruption. Tables represent workflow steps, rows represent artifacts, and foreign keys prescribe execution order. The result is a single formal system in which data structure, computational dependencies, and integrity constraints are all queryable, enforceable, and machine-readable.
arXiv Detail & Related papers (2026-02-18T16:35:47Z)
- OmniStruct: Universal Text-to-Structure Generation across Diverse Schemas [57.49565459553627]
We introduce OmniStruct, a benchmark for assessing Large Language Models' capabilities on text-to-structure tasks. We collect high-quality training data via synthetic task generation to facilitate the development of efficient text-to-structure models. Our experiments demonstrate the possibility of fine-tuning much smaller models on synthetic data into universal structured generation models.
arXiv Detail & Related papers (2025-11-23T08:18:12Z)
- Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents [85.02904078131682]
We introduce the agent data protocol (ADP), a lightweight representation language that serves as an "interlingua" between agent datasets. ADP is expressive enough to capture a large variety of tasks, including API/tool use, browsing, coding, software engineering, and general agentic tasks. All code and data are released publicly, in the hope that ADP can help lower the barrier to standardized, scalable, and reproducible agent training.
arXiv Detail & Related papers (2025-10-28T17:53:13Z)
- LLM/Agent-as-Data-Analyst: A Survey [54.08761322298559]
Large language models (LLMs) and agent techniques have brought a fundamental shift in the functionality and development paradigm of data analysis tasks. LLMs enable complex data understanding, natural-language and semantic analysis functions, and autonomous pipeline orchestration.
arXiv Detail & Related papers (2025-09-28T17:31:38Z)
- LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology [3.470217255779291]
We introduce an evaluation methodology, reference architecture, and open-source implementation that leverages interactive Large Language Model (LLM) agents for runtime data analysis. Our approach uses a lightweight, metadata-driven design that translates natural language into structured provenance queries. Evaluations across LLaMA, GPT, Gemini, and Claude, covering diverse query classes and a real-world chemistry workflow, show that modular design, prompt tuning, and Retrieval-Augmented Generation (RAG) enable accurate and insightful agent responses.
arXiv Detail & Related papers (2025-09-17T13:51:29Z)
- AgenticData: An Agentic Data Analytics System for Heterogeneous Data [12.67277567222908]
AgenticData is an agentic data analytics system that allows users to pose natural language (NL) questions while autonomously analyzing data sources across multiple domains. We propose a multi-agent collaboration strategy by utilizing a data profiling agent for discovering relevant data, a semantic cross-validation agent for iterative optimization based on feedback, and a smart memory agent for maintaining short-term context.
arXiv Detail & Related papers (2025-08-07T03:33:59Z)
- Agent0: Leveraging LLM Agents to Discover Multi-value Features from Text for Enhanced Recommendations [0.0]
Large language models (LLMs) and their associated agent-based frameworks have significantly advanced automated information extraction. This paper presents Agent0, an agent-based system designed to automate information extraction and feature construction from raw, unstructured text.
arXiv Detail & Related papers (2025-07-25T06:45:10Z)
- WebShaper: Agentically Data Synthesizing via Information-Seeking Formalization [68.46693401421923]
WebShaper systematically formalizes information-seeking (IS) tasks through set theory. WebShaper achieves state-of-the-art performance among open-sourced IS agents on the GAIA and WebWalkerQA benchmarks.
arXiv Detail & Related papers (2025-07-20T17:53:37Z)
- Deep Research Agents: A Systematic Examination And Roadmap [79.04813794804377]
Deep Research (DR) agents are designed to tackle complex, multi-turn informational research tasks. In this paper, we conduct a detailed analysis of the foundational technologies and architectural components that constitute DR agents.
arXiv Detail & Related papers (2025-06-22T16:52:48Z)
- Large Language Models are Good Relational Learners [55.40941576497973]
We introduce Rel-LLM, a novel architecture that utilizes a graph neural network (GNN)-based encoder to generate structured relational prompts for large language models (LLMs). Unlike traditional text-based serialization approaches, our method preserves the inherent relational structure of databases while enabling LLMs to process and reason over complex entity relationships.
arXiv Detail & Related papers (2025-06-06T04:07:55Z)
- RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs [3.41612427812159]
In digital content creation tools, users express their needs through natural language queries that must be mapped to API calls. Existing approaches to synthetic data generation fail to replicate real-world data distributions. We present a novel router-based architecture that generates high-quality synthetic training data.
arXiv Detail & Related papers (2025-05-15T16:53:45Z)
- Orchestrating Agents and Data for Enterprise: A Blueprint Architecture for Compound AI [11.859180018313147]
We propose a 'blueprint architecture' for compound AI systems that orchestrates agents and data for enterprise applications. Existing proprietary models and APIs in the enterprise are mapped to 'agents', defined in an 'agent registry'. Agents can utilize proprietary data through a 'data registry' that similarly registers enterprise data of various modalities.
arXiv Detail & Related papers (2025-04-10T22:19:41Z)
- AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction [10.65417796726349]
Relation extraction (RE) in complex scenarios faces challenges such as diverse relation types and ambiguous relations between entities within a single sentence.
We propose an agent-based RE framework, namely AgentRE, which fully leverages the potential of large language models to achieve RE in complex scenarios.
arXiv Detail & Related papers (2024-09-03T12:53:05Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- StructGPT: A General Framework for Large Language Model to Reason over Structured Data [117.13986738340027]
We develop an Iterative Reading-then-Reasoning (IRR) approach for solving question answering tasks based on structured data.
Our approach can significantly boost the performance of ChatGPT and achieve comparable performance against the full-data supervised-tuning baselines.
arXiv Detail & Related papers (2023-05-16T17:45:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.