THOR: Transformer Heuristics for On-Demand Retrieval
- URL: http://arxiv.org/abs/2507.09592v3
- Date: Thu, 17 Jul 2025 05:47:22 GMT
- Title: THOR: Transformer Heuristics for On-Demand Retrieval
- Authors: Isaac Shi, Zeyuan Li, Fan Liu, Wenli Wang, Lewei He, Yang Yang, Tianyu Shi
- Abstract summary: We introduce the THOR (Transformer Heuristics for On-Demand Retrieval) Module, designed and implemented by eSapiens. The THOR Module empowers non-technical users to access live data with zero-SQL simplicity and enterprise-grade safety.
- Score: 10.667949307405983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the THOR (Transformer Heuristics for On-Demand Retrieval) Module, designed and implemented by eSapiens, a secure, scalable engine that transforms natural-language questions into verified, read-only SQL analytics for enterprise databases. The Text-to-SQL module follows a decoupled orchestration/execution architecture: a Supervisor Agent routes queries, Schema Retrieval dynamically injects table and column metadata, and a SQL Generation Agent emits single-statement SELECT queries protected by a read-only guardrail. An integrated Self-Correction & Rating loop captures empty results, execution errors, or low-quality outputs and triggers up to five LLM-driven regeneration attempts. Finally, a Result Interpretation Agent produces concise, human-readable insights and hands raw rows to the Insight & Intelligence engine for visualization or forecasting. Smoke tests across finance, sales, and operations scenarios demonstrate reliable ad-hoc querying and automated periodic reporting. By embedding schema awareness, fault-tolerant execution, and compliance guardrails, the THOR Module empowers non-technical users to access live data with zero-SQL simplicity and enterprise-grade safety.
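The read-only guardrail and the self-correction loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the eSapiens implementation: the names `is_read_only_select`, `answer`, `fake_generate`, and the keyword list are hypothetical, SQLite stands in for the enterprise database, and the rating of low-quality outputs is not modeled. Only the single-statement SELECT restriction, the handling of empty results and execution errors, and the cap of five attempts come from the abstract.

```python
import re
import sqlite3

MAX_ATTEMPTS = 5  # the abstract mentions up to five regeneration attempts

# Hypothetical deny-list; deliberately conservative, so it may also reject
# harmless queries that merely mention one of these words.
FORBIDDEN = {"insert", "update", "delete", "drop", "alter",
             "create", "attach", "pragma", "replace"}


def is_read_only_select(sql: str) -> bool:
    """Accept only a single statement that starts with SELECT."""
    body = sql.strip().rstrip(";").strip()
    if not body or ";" in body:  # empty input or more than one statement
        return False
    lowered = body.lower()
    if not lowered.startswith("select"):
        return False
    tokens = re.findall(r"[a-z_]+", lowered)
    return not any(tok in FORBIDDEN for tok in tokens)


def answer(question, generate, conn):
    """Regenerate SQL until a guarded query returns rows, or give up."""
    feedback = None
    for _ in range(MAX_ATTEMPTS):
        sql = generate(question, feedback)
        if not is_read_only_select(sql):
            feedback = "rejected by read-only guardrail"
            continue
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            feedback = f"execution error: {exc}"
            continue
        if not rows:
            feedback = "query returned no rows"
            continue
        return sql, rows
    return None


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("EU", 10.0), ("US", 25.0)])
    # Stand-in for the LLM: two bad candidates, then a valid query.
    candidates = iter([
        "DELETE FROM sales",        # blocked by the guardrail
        "SELECT total FROM sales",  # fails at execution: no such column
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
    ])

    def fake_generate(question, feedback):
        return next(candidates)

    return answer("Total sales by region?", fake_generate, conn)
```

Calling `demo()` walks through two rejected candidates before the third query succeeds. In a real deployment the feedback string would be passed back to the model, and the database connection itself would normally be opened with read-only credentials as a second line of defense.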
Related papers
- ErrorLLM: Modeling SQL Errors for Text-to-SQL Refinement [57.98138819417949]
We propose ErrorLLM, a framework that explicitly models text-to-SQL querying. We show that ErrorLLM achieves the most significant improvements over backbone initial generation. ErrorLLM addresses both sides, achieving a high detection F1 score while maintaining refinement effectiveness.
arXiv Detail & Related papers (2026-03-04T05:27:20Z)
- From Queries to Insights: Agentic LLM Pipelines for Spatio-Temporal Text-to-SQL [8.496933324334167]
We present a naive text-to-Act baseline (Rellama-sqlcoder-8b) with orchestration by a Mistral-based Rellama-sqlcoder-8b. We evaluate on 35 natural-language queries over the NYC and Tokyo check-in datasets, covering spatial, temporal, and multi-dataset reasoning. The agent achieves substantially higher accuracy than the baseline (91.4% vs. 28.6%) and enhances usability through maps, plots, and structured natural-language summaries.
arXiv Detail & Related papers (2025-10-29T22:18:57Z)
- FABRIC: Framework for Agent-Based Realistic Intelligence Creation [3.940391073007047]
Large language models (LLMs) are increasingly deployed as agents, expected to decompose goals, invoke tools, and verify results in dynamic environments. We present a unified framework for synthesizing agentic data using only LLMs, without any human-in-the-loop supervision.
arXiv Detail & Related papers (2025-10-20T18:20:22Z)
- FACTS: Table Summarization via Offline Template Generation with Agentic Workflows [11.885086835801523]
FACTS produces offline templates, which can be rendered into natural language summaries and are reusable across multiple tables. It enables fast summarization through reusable offline templates, accurate outputs with executable SQL queries, and privacy compliance by sending only table schemas to LLMs.
arXiv Detail & Related papers (2025-10-15T10:24:49Z)
- MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training [31.290164208264745]
We present MTSQL-R1, an agentic training framework for multi-turn Text-to-SQL. We cast the task as a Markov Decision Process (MDP) in which an agent interacts with (i) a database for execution feedback and (ii) a persistent dialogue memory for verification. Experiments demonstrate that MTSQL-R1 consistently outperforms strong baselines, highlighting the importance of environment-driven verification and memory-guided refinement for conversational semantic parsing.
arXiv Detail & Related papers (2025-10-12T16:12:05Z)
- QueryGym: Step-by-Step Interaction with Relational Databases [30.757678338337055]
We introduce QueryGym, an interactive environment for building, testing, and evaluating LLM-based query planning agents. Existing frameworks often tie agents to specific query language dialects or obscure their reasoning. QueryGym requires agents to construct explicit sequences of relational algebra operations.
arXiv Detail & Related papers (2025-09-25T22:48:49Z)
- Enhancing Accuracy and Maintainability in Nuclear Plant Data Retrieval: A Function-Calling LLM Approach Over NL-to-SQL [0.0]
Retrieving operational data from nuclear power plants requires exceptional accuracy and transparency due to the criticality of the decisions it supports. Traditionally, natural language to SQL (NL-to-SQL) approaches have been explored for querying such data. We propose an alternative paradigm: leveraging function-calling large language models (LLMs) to address these challenges.
arXiv Detail & Related papers (2025-06-10T12:55:07Z)
- LLM-Symbolic Integration for Robust Temporal Tabular Reasoning [69.27153114778748]
We introduce TempTabQA-C, a synthetic dataset designed for systematic and controlled evaluations. This structured approach allows Large Language Models (LLMs) to generate and execute SQL queries, enhancing generalization and mitigating biases.
arXiv Detail & Related papers (2025-06-06T05:14:04Z)
- SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL [18.493226915913638]
We propose SHARE, an SLM-based Hierarchical Action corREction assistant for text-to-SQL correction. SHARE orchestrates three specialized Small Language Models (SLMs) in a sequential pipeline. Experimental results demonstrate that SHARE effectively enhances self-correction capabilities while proving robust across various LLMs.
arXiv Detail & Related papers (2025-05-31T04:51:12Z)
- Weaver: Interweaving SQL and LLM for Table Reasoning [63.09519234853953]
Weaver generates a flexible, step-by-step plan that combines SQL for structured data retrieval with LLMs for semantic processing. Weaver consistently outperforms state-of-the-art methods across four TableQA datasets, reducing both API calls and error rates.
arXiv Detail & Related papers (2025-05-25T03:27:37Z)
- IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis [60.32962597618861]
IDA-Bench is a novel benchmark evaluating large language models in multi-round interactive scenarios. Agent performance is judged by comparing its final numerical output to the human-derived baseline. Even state-of-the-art coding agents (like Claude-3.7-thinking) succeed on 50% of the tasks, highlighting limitations not evident in single-turn tests.
arXiv Detail & Related papers (2025-05-23T09:37:52Z)
- An LLM-Based Approach for Insight Generation in Data Analysis [9.077654650104055]
This paper introduces a novel approach using Large Language Models (LLMs) to automatically generate textual insights. Given a multi-table database as input, our method leverages LLMs to produce concise, text-based insights that reflect interesting patterns in the tables. The insights are evaluated for both correctness and subjective insightfulness using a hybrid model of human judgment and automated metrics.
arXiv Detail & Related papers (2025-02-20T17:09:59Z)
- Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning [10.731045939849125]
We focus on Text-to-SQL semantic parsing from the perspective of retrieval-augmented generation.
Motivated by challenges related to the size of commercial database schemata and the deployability of business intelligence solutions, we propose ASTReS, which dynamically retrieves input database information.
arXiv Detail & Related papers (2024-07-03T15:55:14Z)
- UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics.
We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
- TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring [11.78795632771211]
We introduce a novel benchmark designed to evaluate text-to-SQL reliability, defined as a model's ability to correctly handle any type of input question.
We evaluate existing methods using a novel penalty-based scoring metric with two modeling approaches.
arXiv Detail & Related papers (2024-03-23T16:12:52Z)
- Retrieval-augmented GPT-3.5-based Text-to-SQL Framework with Sample-aware Prompting and Dynamic Revision Chain [21.593701177605652]
We propose a Text-to-SQL framework involving sample-aware prompting and a dynamic revision chain.
Our approach incorporates sample demonstrations and fine-grained information related to the given question.
To generate executable and accurate SQL queries without human intervention, we design a dynamic revision chain which iteratively adapts fine-grained feedback.
arXiv Detail & Related papers (2023-07-11T07:16:22Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding.
Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL), and neural semantic parser (NSP).
arXiv Detail & Related papers (2022-09-28T21:00:30Z)