An Agent Framework for Real-Time Financial Information Searching with Large Language Models
- URL: http://arxiv.org/abs/2502.15684v1
- Date: Sat, 14 Dec 2024 07:26:39 GMT
- Title: An Agent Framework for Real-Time Financial Information Searching with Large Language Models
- Authors: Jinzheng Li, Jingshu Zhang, Hongguang Li, Yiqing Shen,
- Abstract summary: FinSearch is a novel agent-based search framework specifically designed for financial applications.<n>FinSearch comprises four components: (1) an LLM-based multi-step search pre-planner that decomposes user queries into structured sub-queries mapped to specific data sources through a graph representation; (2) a search executor with an LLM-based adaptive query rewriter that executes the searching of each sub-queries while dynamically refining the sub-queries in its subsequent node based on intermediate search results; and (3) a temporal weighting mechanism that prioritizes information relevance based on the time context from the user's query.
- Score: 8.260170301368758
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Financial decision-making requires processing vast amounts of real-time information while understanding their complex temporal relationships. While traditional search engines excel at providing real-time information access, they often struggle to comprehend sophisticated user intentions and contextual nuances. Conversely, Large Language Models (LLMs) demonstrate reasoning and interaction capabilities but may generate unreliable outputs without access to current data. While recent attempts have been made to combine LLMs with search capabilities, they suffer from (1) restricted access to specialized financial data, (2) static query structures that cannot adapt to dynamic market conditions, and (3) insufficient temporal awareness in result generation. To address these challenges, we present FinSearch, a novel agent-based search framework specifically designed for financial applications that interface with diverse financial data sources including market, stock, and news data. Innovatively, FinSearch comprises four components: (1) an LLM-based multi-step search pre-planner that decomposes user queries into structured sub-queries mapped to specific data sources through a graph representation; (2) a search executor with an LLM-based adaptive query rewriter that executes the searching of each sub-query while dynamically refining the sub-queries in its subsequent node based on intermediate search results; (3) a temporal weighting mechanism that prioritizes information relevance based on the deduced time context from the user's query; (4) an LLM-based response generator that synthesizes results into coherent, contextually appropriate outputs. To evaluate FinSearch, we construct FinSearchBench-24, a benchmark of 1,500 four-choice questions across the stock market, rate changes, monetary policy, and industry developments spanning from June to October 2024.
Related papers
- FinDER: Financial Dataset for Question Answering and Evaluating Retrieval-Augmented Generation [63.55583665003167]
We present FinDER, an expert-generated dataset tailored for Retrieval-Augmented Generation (RAG) in finance.
FinDER focuses on annotating search-relevant evidence by domain experts, offering 5,703 query-evidence-answer triplets.
By challenging models to retrieve relevant information from large corpora, FinDER offers a more realistic benchmark for evaluating RAG systems.
arXiv Detail & Related papers (2025-04-22T11:30:13Z) - FinSage: A Multi-aspect RAG System for Financial Filings Question Answering [7.581619443736712]
FinSage is a multi-modal pre-processing pipeline that unifies diverse data formats and generates metadata summaries.
Experiments demonstrate that FinSage achieves an impressive recall of 92.51% on 75 expert-curated questions.
FinSage has been successfully deployed as financial question-answering agent in online meetings, where it has already served more than 1,200 people.
arXiv Detail & Related papers (2025-04-20T04:58:14Z) - FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance [79.78247299859656]
FinTMMBench is the first comprehensive benchmark for evaluating temporal-aware multi-modal Retrieval-Augmented Generation systems in finance.<n>Built from heterologous data of NASDAQ 100 companies, FinTMMBench offers three significant advantages.
arXiv Detail & Related papers (2025-03-07T07:13:59Z) - FinBloom: Knowledge Grounding Large Language Model with Real-time Financial Data [1.8434042562191815]
We introduce Financial Agent, a knowledge-grounding approach for large language models to handle financial queries.<n>We train FinBloom 7B, a custom 7 billion parameter LLM, on 14 million financial news articles.<n>This agent generates relevant financial context, enabling efficient real-time data retrieval.
arXiv Detail & Related papers (2025-02-04T06:51:34Z) - Multi-Reranker: Maximizing performance of retrieval-augmented generation in the FinanceRAG challenge [5.279257531335345]
This paper details the development of a high-performance, finance-specific Retrieval-Augmented Generation (RAG) system for the ACM-ICAIF '24 FinanceRAG competition.
We optimized performance through ablation studies on query expansion and corpus refinement during the pre-retrieval phase.
Notably, we introduced an efficient method for managing long context sizes during the generation phase, significantly improving response quality without sacrificing performance.
arXiv Detail & Related papers (2024-11-23T09:56:21Z) - Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting [23.61061000692023]
This study proposes leveraging user interactions recorded in search logs to yield insights into users' implicit search intentions.<n>We propose ProRBP, a novel Progressive Retrieved Behavior-augmented Prompting framework for integrating search scenario-oriented knowledge with Large Language Models.
arXiv Detail & Related papers (2024-08-18T11:07:38Z) - SEC-QA: A Systematic Evaluation Corpus for Financial QA [12.279234447220155]
Existing datasets are often constrained by size, context, or relevance to practical applications.
We propose SEC-QA, a continuous dataset generation framework with two key features.
We introduce a QA system based on program-of-thought that improves the ability to perform complex information retrieval and quantitative reasoning pipelines.
arXiv Detail & Related papers (2024-06-20T15:12:41Z) - FinBen: A Holistic Financial Benchmark for Large Language Models [75.09474986283394]
FinBen is the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks.
FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three novel open-source evaluation datasets for text summarization, question answering, and stock trading.
arXiv Detail & Related papers (2024-02-20T02:16:16Z) - DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain
Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z) - Large Search Model: Redefining Search Stack in the Era of LLMs [63.503320030117145]
We introduce a novel conceptual framework called large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM)
All tasks are formulated as autoregressive text generation problems, allowing for the customization of tasks through the use of natural language prompts.
This proposed framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to enhance search result quality while simultaneously simplifying the existing cumbersome search stack.
arXiv Detail & Related papers (2023-10-23T05:52:09Z) - Investigating Table-to-Text Generation Capabilities of LLMs in
Real-World Information Seeking Scenarios [32.84523661055774]
Tabular data is prevalent across various industries, necessitating significant time and effort for users to understand and manipulate for their information-seeking purposes.
The adoption of large language models (LLMs) in real-world applications for table information seeking remains underexplored.
This paper investigates the table-to-text capabilities of different LLMs using four datasets within two real-world information seeking scenarios.
arXiv Detail & Related papers (2023-05-24T10:22:30Z) - Synergistic Interplay between Search and Large Language Models for
Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z) - FinQA: A Dataset of Numerical Reasoning over Financial Data [52.7249610894623]
We focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents.
We propose a new large-scale dataset, FinQA, with Question-Answering pairs over Financial reports, written by financial experts.
The results demonstrate that popular, large, pre-trained models fall far short of expert humans in acquiring finance knowledge.
arXiv Detail & Related papers (2021-09-01T00:08:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.