Related papers: THOR: Transformer Heuristics for On-Demand Retrieval

THOR: Transformer Heuristics for On-Demand Retrieval

URL: http://arxiv.org/abs/2507.09592v3
Date: Thu, 17 Jul 2025 05:47:22 GMT
Title: THOR: Transformer Heuristics for On-Demand Retrieval
Authors: Isaac Shi, Zeyuan Li, Fan Liu, Wenli Wang, Lewei He, Yang Yang, Tianyu Shi,
Abstract summary: We introduce the THOR (Transformer Heuristics for On-Demand Retrieval) Module, designed and implemented by eSapiens.<n>The THOR Module empowers non-language users to access live data with zero-language simplicity and enterprise-grade safety.
Score: 10.667949307405983
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce the THOR (Transformer Heuristics for On-Demand Retrieval) Module, designed and implemented by eSapiens, a secure, scalable engine that transforms natural-language questions into verified, read-only SQL analytics for enterprise databases. The Text-to-SQL module follows a decoupled orchestration/execution architecture: a Supervisor Agent routes queries, Schema Retrieval dynamically injects table and column metadata, and a SQL Generation Agent emits single-statement SELECT queries protected by a read-only guardrail. An integrated Self-Correction & Rating loop captures empty results, execution errors, or low-quality outputs and triggers up to five LLM-driven regeneration attempts. Finally, a Result Interpretation Agent produces concise, human-readable insights and hands raw rows to the Insight & Intelligence engine for visualization or forecasting. Smoke tests across finance, sales, and operations scenarios demonstrate reliable ad-hoc querying and automated periodic reporting. By embedding schema awareness, fault-tolerant execution, and compliance guardrails, the THOR Module empowers non-technical users to access live data with zero-SQL simplicity and enterprise-grade safety.

Related papers

ErrorLLM: Modeling SQL Errors for Text-to-SQL Refinement [57.98138819417949]
We propose ErrorLLM, a framework that explicitly models text-to- querying.<n>We show that ErrorLLM achieves the most significant improvements over backbone initial generation.<n>ErrorLLM addresses both sides by high detection F1 score while maintaining refinement effectiveness.
arXiv Detail & Related papers (2026-03-04T05:27:20Z)
From Queries to Insights: Agentic LLM Pipelines for Spatio-Temporal Text-to-SQL [8.496933324334167]
We present a naive text-to-Act baseline (Rellama-sqlcoder-8b) with orchestration by a Mistral-based Rellama-sqlcoder-8b.<n>We evaluate on 35 natural-language queries over the NYC and Tokyo check-in, covering spatial, temporal multi-dataset reasoning.<n>The agent achieves substantially higher accuracy than the dataset 91.4% vs. 28.6% and enhances usability through maps, and plots structured natural-language summaries.
arXiv Detail & Related papers (2025-10-29T22:18:57Z)
FABRIC: Framework for Agent-Based Realistic Intelligence Creation [3.940391073007047]
Large language models (LLMs) are increasingly deployed as agents, expected to decompose goals, invoke tools, and verify results in dynamic environments.<n>We present a unified framework for synthesizing agentic data using only LLMs, without any human-in-the-loop supervision.
arXiv Detail & Related papers (2025-10-20T18:20:22Z)
FACTS: Table Summarization via Offline Template Generation with Agentic Workflows [11.885086835801523]
FACTS produces offline templates, which can be rendered into natural language summaries and are reusable across multiple tables.<n>It enables fast summarization through reusable offline templates, accurate outputs with executablesql queries, and privacy compliance by sending only table schemas to LLMs.
arXiv Detail & Related papers (2025-10-15T10:24:49Z)
MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training [31.290164208264745]
We present MT-R1, an agentic training framework for multi-turn Text-to-the-guided.<n>We cast the task as a Markov Decision Process (MDP) in which an agent interacts with (i) a database for execution feedback and (ii) a persistent dialogue memory for verification.<n>Experiments demonstrate that MT-R1 consistently outperforms strong baselines, highlighting the importance of environment-driven verification and memory-guided refinement for conversational semantic parsing.
arXiv Detail & Related papers (2025-10-12T16:12:05Z)
QueryGym: Step-by-Step Interaction with Relational Databases [30.757678338337055]
We introduce QueryGym, an interactive environment for building, testing, and evaluating LLM-based query planning agents.<n>Existing frameworks often tie agents to specific query language dialects or obscure their reasoning.<n>QueryGym requires agents to construct explicit sequences of relational algebra operations.
arXiv Detail & Related papers (2025-09-25T22:48:49Z)
Enhancing Accuracy and Maintainability in Nuclear Plant Data Retrieval: A Function-Calling LLM Approach Over NL-to-SQL [0.0]
Retrieving operational data from nuclear power plants requires exceptional accuracy and transparency due to the criticality of the decisions it supports.<n>Traditionally, natural language to SQL (NL-to-) approaches have been explored for querying such data.<n>We propose an alternative paradigm: leveraging function-calling large language models (LLMs) to address these challenges.
arXiv Detail & Related papers (2025-06-10T12:55:07Z)
LLM-Symbolic Integration for Robust Temporal Tabular Reasoning [69.27153114778748]
We introduce TempTabQA-C, a synthetic dataset designed for systematic and controlled evaluations.<n>This structured approach allows Large Language Models (LLMs) to generate and executesql queries, enhancing generalization and mitigating biases.
arXiv Detail & Related papers (2025-06-06T05:14:04Z)
SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL [18.493226915913638]
We propose SHARE, an SLM-based Hierarchical Action corREction assistant for text-to-correction.<n> SHARE orchestrates three specialized Small Language Models (SLMs) in a sequential pipeline.<n> Experimental results demonstrate that SHARE effectively enhances self-correction capabilities while proving robust across various LLMs.
arXiv Detail & Related papers (2025-05-31T04:51:12Z)
Weaver: Interweaving SQL and LLM for Table Reasoning [63.09519234853953]
Weaver generates a flexible, step-by-step plan that combinessql for structured data retrieval with LLMs for semantic processing.<n>Weaver consistently outperforms state-of-the-art methods across four TableQA datasets, reducing both API calls and error rates.
arXiv Detail & Related papers (2025-05-25T03:27:37Z)
IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis [60.32962597618861]
IDA-Bench is a novel benchmark evaluating large language models in multi-round interactive scenarios.<n>Agent performance is judged by comparing its final numerical output to the human-derived baseline.<n>Even state-of-the-art coding agents (like Claude-3.7-thinking) succeed on 50% of the tasks, highlighting limitations not evident in single-turn tests.
arXiv Detail & Related papers (2025-05-23T09:37:52Z)
An LLM-Based Approach for Insight Generation in Data Analysis [9.077654650104055]
This paper introduces a novel approach using Large Language Models (LLMs) to automatically generate textual insights.<n>Given a multi-table database as input, our method leverages LLMs to produce concise, text-based insights that reflect interesting patterns in the tables.<n>The insights are evaluated for both correctness and subjective insightfulness using a hybrid model of human judgment and automated metrics.
arXiv Detail & Related papers (2025-02-20T17:09:59Z)
Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning [10.731045939849125]
We focus on Text-to- semantic parsing from the perspective of retrieval-augmented generation. Motivated by challenges related to the size of commercial database schemata and the deployability of business intelligence solutions, we propose $textASTReS$ that dynamically retrieves input database information.
arXiv Detail & Related papers (2024-07-03T15:55:14Z)
UQE: A Query Engine for Unstructured Databases [71.49289088592842]
We investigate the potential of Large Language Models to enable unstructured data analytics. We propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.
arXiv Detail & Related papers (2024-06-23T06:58:55Z)
TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring [11.78795632771211]
We introduce a novel benchmark designed to evaluate text-to- reliability as a model's ability to correctly handle any type of input question. We evaluate existing methods using a novel penalty-based scoring metric with two modeling approaches.
arXiv Detail & Related papers (2024-03-23T16:12:52Z)
Retrieval-augmented GPT-3.5-based Text-to-SQL Framework with Sample-aware Prompting and Dynamic Revision Chain [21.593701177605652]
We propose a Text-to-aware prompting framework, involving a sample and a dynamic revision chain. Our approach incorporates sample demonstrations and fine-grained information related to the given question. To generate executable and accuratesqls without human intervention, we design a dynamic revision chain which iteratively adapts fine-grained feedback.
arXiv Detail & Related papers (2023-07-11T07:16:22Z)
SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the framework for enhancing Text-to- filtering using large language models (LLMs) With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses. With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding [84.04706075621013]
We present a general-purpose, modular neural semantic parsing framework based on token-level fine-grained query understanding. Our framework consists of three modules: named entity recognizer (NER), neural entity linker (NEL) and neural entity linker (NSP)
arXiv Detail & Related papers (2022-09-28T21:00:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.