FinStat2SQL: A Text2SQL Pipeline for Financial Statement Analysis
- URL: http://arxiv.org/abs/2506.23273v1
- Date: Sun, 29 Jun 2025 14:55:21 GMT
- Title: FinStat2SQL: A Text2SQL Pipeline for Financial Statement Analysis
- Authors: Quang Hung Nguyen, Phuong Anh Trinh, Phan Quoc Hung Mai, Tuan Phong Trinh,
- Abstract summary: FinStat2SQL is a lightweight text2sql pipeline enabling natural language queries over financial statements. We build a domain-specific database and evaluate models on a synthetic QA dataset. A fine-tuned 7B model achieves 61.33% accuracy with sub-4-second response times on consumer hardware.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the advancements of large language models, text2sql still faces many challenges, particularly with complex and domain-specific queries. In finance, database designs and financial reporting layouts vary widely between financial entities and countries, making text2sql even more challenging. We present FinStat2SQL, a lightweight text2sql pipeline enabling natural language queries over financial statements. Tailored to local standards like VAS, it combines large and small language models in a multi-agent setup for entity extraction, SQL generation, and self-correction. We build a domain-specific database and evaluate models on a synthetic QA dataset. A fine-tuned 7B model achieves 61.33% accuracy with sub-4-second response times on consumer hardware, outperforming GPT-4o-mini. FinStat2SQL offers a scalable, cost-efficient solution for financial analysis, making AI-powered querying accessible to Vietnamese enterprises.
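The SQL generation and self-correction steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the table schema, the `run_with_self_correction` helper, and the `stub_generator` function (a stand-in for a language model) are all hypothetical names chosen for illustration. The idea shown is only the general loop: generate a query, execute it, and feed any database error back to the generator for another attempt.

```python
import sqlite3
from typing import Callable, Optional

# Hypothetical toy schema; the paper's actual database design is not shown here.
SCHEMA = """
CREATE TABLE financial_statement (
    company TEXT, year INTEGER, item TEXT, value REAL
);
"""


def run_with_self_correction(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
):
    """Generate SQL, execute it, and retry with the error message on failure."""
    error = None
    for _ in range(max_attempts):
        sql = generate_sql(question, error)
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)  # fed back to the generator on the next attempt
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts: {error}")


def stub_generator(question: str, error: Optional[str]) -> str:
    """Stand-in for a language model: the first answer uses a wrong column name."""
    if error is None:
        return "SELECT amount FROM financial_statement WHERE item = 'revenue'"
    return "SELECT value FROM financial_statement WHERE item = 'revenue'"


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO financial_statement VALUES ('ACME', 2023, 'revenue', 120.5)"
)
final_sql, rows = run_with_self_correction(
    conn, "What was ACME's revenue?", stub_generator
)
print(final_sql)
print(rows)
```

In a real pipeline the stub would be replaced by a model call that receives the question, the schema, and the previous error message; the retry limit keeps response time bounded.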
Related papers
- FinAI Data Assistant: LLM-based Financial Database Query Processing with the OpenAI Function Calling API [1.1985612872852671]
FinAI Data Assistant is a practical approach for natural-language querying over financial databases. The system routes user requests to a small library of vetted, parameterized queries. Result: Ticker-mapping accuracy is near-perfect for NASDAQ-100 and high for S&P 500 firms.
arXiv Detail & Related papers (2025-10-15T23:19:27Z)
- FINCH: Financial Intelligence using Natural language for Contextualized SQL Handling [1.8679829796354372]
We introduce a curated financial dataset (FINCH) comprising 292 tables and 75,725 natural language-SQL pairs. We benchmark reasoning models and language models of varying scales, providing a systematic analysis of their strengths and limitations. Finally, we propose a finance-oriented evaluation metric (FINCH Score) that captures nuances overlooked by existing measures.
arXiv Detail & Related papers (2025-10-02T10:55:11Z)
- MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation [89.73542209537148]
MultiFinBen is the first multilingual and multimodal benchmark tailored to the global financial domain. We introduce two novel tasks, including EnglishOCR and SpanishOCR, the first OCR-embedded financial QA tasks. We propose a dynamic, difficulty-aware selection mechanism and curate a compact, balanced benchmark.
arXiv Detail & Related papers (2025-06-16T22:01:49Z)
- Structuring the Unstructured: A Multi-Agent System for Extracting and Querying Financial KPIs and Guidance [54.25184684077833]
We propose an efficient and scalable method for extracting quantitative insights from unstructured financial documents. Our proposed system consists of two specialized agents: the Extraction Agent and the Text-to-SQL Agent.
arXiv Detail & Related papers (2025-05-25T15:45:46Z)
- BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain [4.671854744910768]
We propose a new large-scale Text-to-SQL dataset for the accounting and financial domain: BookSQL.
The dataset consists of 100k natural language query-SQL pairs and accounting databases of 1 million records.
arXiv Detail & Related papers (2024-06-12T04:22:27Z)
- CHESS: Contextual Harnessing for Efficient SQL Synthesis [1.9506402593665235]
We introduce CHESS, a framework for efficient and scalable text-to-SQL queries.
It comprises four specialized agents, each targeting one of the aforementioned challenges.
Our framework offers features that adapt to various deployment constraints.
arXiv Detail & Related papers (2024-05-27T01:54:16Z)
- TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring [11.78795632771211]
We introduce a novel benchmark designed to evaluate text-to-SQL reliability as a model's ability to correctly handle any type of input question.
We evaluate existing methods using a novel penalty-based scoring metric with two modeling approaches.
arXiv Detail & Related papers (2024-03-23T16:12:52Z)
- FinSQL: Model-Agnostic LLMs-based Text-to-SQL Framework for Financial Analysis [28.514754357658482]
There is no practical Text-to-SQL benchmark dataset for financial analysis.
We propose a model-agnostic, LLM-based Text-to-SQL framework for financial analysis.
arXiv Detail & Related papers (2024-01-19T05:48:07Z)
- Enhancing Text-to-SQL Translation for Financial System Design [5.248014305403357]
We consider Large Language Models (LLMs), which have achieved state of the art for various NLP tasks.
We propose two novel metrics that were designed to adequately measure the similarity between relational queries.
arXiv Detail & Related papers (2023-12-22T14:34:19Z)
- Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries [67.0083902913112]
We develop the Text2Analysis benchmark, incorporating advanced analysis tasks.
We also develop five innovative and effective annotation methods.
We evaluate five state-of-the-art models using three different metrics.
arXiv Detail & Related papers (2023-12-21T08:50:41Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- UNITE: A Unified Benchmark for Text-to-SQL Evaluation [72.72040379293718]
We introduce a UNIfied benchmark for Text-to-SQL systems.
It is composed of publicly available text-to-SQL datasets and 29K databases.
Compared to the widely used Spider benchmark, we introduce a threefold increase in SQL patterns.
arXiv Detail & Related papers (2023-05-25T17:19:52Z)
- SPSQL: Step-by-step Parsing Based Framework for Text-to-SQL Generation [13.196264569882777]
The current mainstream end-to-end Text2SQL model is not only difficult to build due to its complex structure and high requirements for training data, but also difficult to adjust due to its massive number of parameters.
This paper proposes a pipeline method, SPSQL, to achieve the desired result.
We construct the dataset based on the marketing business data of the State Grid Corporation of China.
arXiv Detail & Related papers (2023-05-10T10:01:36Z)
- A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [102.8606542189429]
The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidence provided by databases.
Deep neural networks have significantly advanced this task through neural generation models, which automatically learn a mapping function from an input NL question to an output SQL query.
arXiv Detail & Related papers (2022-08-29T14:24:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.