Structure First, Reason Next: Enhancing a Large Language Model using Knowledge Graph for Numerical Reasoning in Financial Documents
- URL: http://arxiv.org/abs/2601.07754v1
- Date: Mon, 12 Jan 2026 17:39:08 GMT
- Title: Structure First, Reason Next: Enhancing a Large Language Model using Knowledge Graph for Numerical Reasoning in Financial Documents
- Authors: Aryan Mishra, Akash Anil
- Abstract summary: Large Language Models (LLMs) have shown promising results in multiple Question-Answering (Q-A) systems. Structured data augmentations, such as Knowledge Graphs (KGs), have notably improved the predictions of LLMs. This paper proposes a framework to incorporate structured information using KGs along with LLM predictions for numerical reasoning tasks.
- Score: 0.21485350418225244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerical reasoning is an important task in the analysis of financial documents. It helps in understanding and performing numerical predictions with logical conclusions for the given query seeking answers from financial texts. Recently, Large Language Models (LLMs) have shown promising results in multiple Question-Answering (Q-A) systems with the capability of logical reasoning. As documents related to finance often consist of long and complex financial contexts, LLMs appear well-suited for building high-quality automated financial question-answering systems. However, LLMs often face challenges in accurately processing the various numbers within financial reports. Extracting numerical data from unstructured text and semi-structured tables, and reliably performing accurate calculations, remains a significant bottleneck for numerical reasoning in most state-of-the-art LLMs. Recent studies have shown that structured data augmentations, such as Knowledge Graphs (KGs), have notably improved the predictions of LLMs along with logical explanations. Thus, it is an important requirement to consider inherent structured information in financial reports while using LLMs for various financial analytics. This paper proposes a framework to incorporate structured information using KGs along with LLM predictions for numerical reasoning tasks. The KGs are extracted using a proposed schema inherently from the document under processing. We evaluated our proposed framework over the benchmark data FinQA, using an open-source LLM, namely Llama 3.1 8B Instruct. We observed that the proposed framework improved execution accuracy by approximately 12% relative to the vanilla LLM.
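To make the idea in the abstract concrete, the following is a minimal sketch of "structure first, reason next": a table is first turned into simple facts, and a short arithmetic program is then executed over those facts and scored by execution accuracy. It is not the authors' implementation. The `Fact` schema, the function names, the program format (modeled loosely on FinQA-style operations with `#k` step references), and the sample figures are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One graph edge: (row label) --[column label]--> numeric value. Hypothetical schema."""
    entity: str
    attribute: str
    value: float


def parse_number(text):
    """Parse a financial cell such as '$1,200' or '(35)' (parentheses mean negative)."""
    cleaned = text.replace("$", "").replace(",", "").replace("%", "").strip()
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    value = float(cleaned.strip("()"))
    return -value if negative else value


def table_to_facts(header, rows):
    """Structure first: convert a semi-structured table into a list of facts."""
    facts = []
    for row in rows:
        label, cells = row[0], row[1:]
        for column, cell in zip(header[1:], cells):
            facts.append(Fact(label, column, parse_number(cell)))
    return facts


def lookup(facts, entity, attribute):
    """Return the value stored for (entity, attribute), or raise KeyError."""
    for fact in facts:
        if fact.entity == entity and fact.attribute == attribute:
            return fact.value
    raise KeyError((entity, attribute))


OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}


def execute(program, facts):
    """Reason next: run (op, arg1, arg2) steps; '#k' refers to the result of step k."""
    results = []

    def resolve(arg):
        if isinstance(arg, str) and arg.startswith("#"):
            return results[int(arg[1:])]
        if isinstance(arg, tuple):
            return lookup(facts, *arg)
        return float(arg)

    for op, a, b in program:
        results.append(OPS[op](resolve(a), resolve(b)))
    return results[-1]


def execution_accuracy(predictions, golds, tol=1e-3):
    """Fraction of predicted answers that match the gold answers within a tolerance."""
    hits = sum(math.isclose(p, g, rel_tol=tol, abs_tol=tol)
               for p, g in zip(predictions, golds))
    return hits / len(golds)


# Illustrative data only; these figures do not come from the paper or from FinQA.
header = ["item", "2019", "2018"]
rows = [["revenue", "$1,200", "$1,000"], ["net loss", "(35)", "(50)"]]
facts = table_to_facts(header, rows)

# "What was the relative change in revenue from 2018 to 2019?"
program = [
    ("subtract", ("revenue", "2019"), ("revenue", "2018")),
    ("divide", "#0", ("revenue", "2018")),
]
answer = execute(program, facts)
```

In a full system the program would be produced by the LLM after it is shown the extracted facts; here it is written by hand so that only the lookup and arithmetic steps are exercised.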
Related papers
- FinSight: Towards Real-World Financial Deep Research [68.31086471310773]
FinSight is a novel framework for producing high-quality, multimodal financial reports. To ensure professional-grade visualization, we propose an Iterative Vision-Enhanced Mechanism. A two-stage Writing Framework expands concise Chain-of-Analysis segments into coherent, citation-aware, and multimodal reports.
arXiv Detail & Related papers (2025-10-19T14:05:35Z)
- FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs [40.216867348210265]
FinAuditing is the first taxonomy-aligned, structure-aware, multi-document benchmark for evaluating financial auditing tasks. Built from real US-compliant filings, FinAuditing defines three complementary subtasks: FinSM for semantic consistency, FinRE for relational consistency, and FinMR for numerical consistency. Extensive zero-shot experiments on 13 state-of-the-art LLMs reveal that current models perform inconsistently across semantic, relational, and mathematical dimensions.
arXiv Detail & Related papers (2025-10-10T00:41:55Z)
- FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering [57.43420753842626]
FinLFQA is a benchmark designed to evaluate the ability of Large Language Models to generate long-form answers to complex financial questions. We provide an automatic evaluation framework covering both answer quality and attribution quality.
arXiv Detail & Related papers (2025-10-07T20:06:15Z)
- FinAgentBench: A Benchmark Dataset for Agentic Retrieval in Financial Question Answering [57.18367828883773]
FinAgentBench is a benchmark for evaluating agentic retrieval with multi-step reasoning in finance. The benchmark consists of 26K expert-annotated examples on S&P-500 listed firms. We evaluate a suite of state-of-the-art models and demonstrate how targeted fine-tuning can significantly improve agentic retrieval performance.
arXiv Detail & Related papers (2025-08-07T22:15:22Z)
- FAITH: A Framework for Assessing Intrinsic Tabular Hallucinations in Finance [3.565466729914703]
Hallucination remains a critical challenge for deploying Large Language Models (LLMs) in finance. We develop a rigorous and scalable framework for evaluating intrinsic hallucinations in financial LLMs. Our work serves as a critical step toward building more trustworthy and reliable financial Generative AI systems.
arXiv Detail & Related papers (2025-08-07T09:37:14Z)
- RAG-IT: Retrieval-Augmented Instruction Tuning for Automated Financial Analysis [1.2891210250935148]
RAG-IT (Retrieval-Augmented Instruction Tuning) is a novel framework designed to automate the generation of earnings report analyses. Our approach integrates retrieval augmentation with instruction-based fine-tuning to enhance factual accuracy, contextual relevance, and domain adaptability.
arXiv Detail & Related papers (2024-12-11T08:09:42Z)
- Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
- Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering [54.486757407849915]
This study explores Large Language Models' mathematical reasoning on four financial question-answering datasets. We focus on sensitivity to table complexity and performance variations with an increasing number of arithmetic reasoning steps. We introduce a novel prompting technique tailored to semi-structured documents, matching or outperforming other baselines in performance.
arXiv Detail & Related papers (2024-02-17T05:10:18Z)
- Large Language Model Adaptation for Financial Sentiment Analysis [2.0499240875882]
Generalist language models tend to fall short in tasks specifically tailored for finance.
Two foundation models with less than 1.5B parameters have been adapted using a wide range of strategies.
We show that small LLMs have comparable performance to larger scale models, while being more efficient in terms of parameters and data.
arXiv Detail & Related papers (2024-01-26T11:04:01Z)
- Data-Centric Financial Large Language Models [27.464319154543173]
Large language models (LLMs) show promise for natural language tasks but struggle when applied directly to complex domains like finance.
We propose a data-centric approach to enable LLMs to better handle financial tasks.
arXiv Detail & Related papers (2023-10-07T04:53:31Z)
- PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance [63.51545277822702]
PIXIU is a comprehensive framework including the first financial large language model (LLM) based on fine-tuning LLaMA with instruction data.
We propose FinMA by fine-tuning LLaMA with the constructed dataset to be able to follow instructions for various financial tasks.
We conduct a detailed analysis of FinMA and several existing LLMs, uncovering their strengths and weaknesses in handling critical financial tasks.
arXiv Detail & Related papers (2023-06-08T14:20:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.