Related papers: FinSphere: A Conversational Stock Analysis Agent Equipped with Quantitative Tools based on Real-Time Database

FinSphere: A Conversational Stock Analysis Agent Equipped with Quantitative Tools based on Real-Time Database

URL: http://arxiv.org/abs/2501.12399v1
Date: Wed, 08 Jan 2025 07:50:50 GMT
Title: FinSphere: A Conversational Stock Analysis Agent Equipped with Quantitative Tools based on Real-Time Database
Authors: Shijie Han, Changhai Zhou, Yiqing Shen, Tianning Sun, Yuhua Zhou, Xiaoxia Wang, Zhixiao Yang, Jingshu Zhang, Hongguang Li,
Abstract summary: FinSphere is a conversational stock analysis agent.<n>An integrated framework combines real-time data feeds, quantitative tools, and an instruction-tuned LLM.
Score: 7.268553732731626
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current financial Large Language Models (LLMs) struggle with two critical limitations: a lack of depth in stock analysis, which impedes their ability to generate professional-grade insights, and the absence of objective evaluation metrics to assess the quality of stock analysis reports. To address these challenges, this paper introduces FinSphere, a conversational stock analysis agent, along with three major contributions: (1) Stocksis, a dataset curated by industry experts to enhance LLMs' stock analysis capabilities, (2) AnalyScore, a systematic evaluation framework for assessing stock analysis quality, and (3) FinSphere, an AI agent that can generate high-quality stock analysis reports in response to user queries. Experiments demonstrate that FinSphere achieves superior performance compared to both general and domain-specific LLMs, as well as existing agent-based systems, even when they are enhanced with real-time data access and few-shot guidance. The integrated framework, which combines real-time data feeds, quantitative tools, and an instruction-tuned LLM, yields substantial improvements in both analytical quality and practical applicability for real-world stock analysis.

Related papers

Integrating Large Language Models in Financial Investments and Market Analysis: A Survey [39.58317527488534]
Large Language Models (LLMs) have been employed in financial decision making, enhancing analytical capabilities for investment strategies.<n>This study provides a structured review of recent LLMs research on applications in stock selection, risk assessment, sentiment analysis, trading, and financial forecasting.
arXiv Detail & Related papers (2025-06-29T05:25:31Z)
IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis [60.32962597618861]
IDA-Bench is a novel benchmark evaluating large language models in multi-round interactive scenarios.<n>Agent performance is judged by comparing its final numerical output to the human-derived baseline.<n>Even state-of-the-art coding agents (like Claude-3.7-thinking) succeed on 50% of the tasks, highlighting limitations not evident in single-turn tests.
arXiv Detail & Related papers (2025-05-23T09:37:52Z)
Towards Competent AI for Fundamental Analysis in Finance: A Benchmark Dataset and Evaluation [3.077814260904367]
We propose FinAR-Bench, a benchmark dataset focusing on financial statement analysis.<n>We break this task into three measurable steps: extracting key information, calculating financial indicators, and applying logical reasoning.<n>Our findings offer a clear understanding of LLMs current strengths and limitations in fundamental analysis.
arXiv Detail & Related papers (2025-05-22T07:06:20Z)
Advanced Deep Learning Techniques for Analyzing Earnings Call Transcripts: Methodologies and Applications [0.0]
The objective is to investigate how Natural Language Processing can be leveraged to extract sentiment from large-scale financial transcripts. We examine the strengths and limitations of each model in the context of financial sentiment analysis. Through rigorous experimentation, we evaluate their performance using key metrics, including accuracy, precision, recall, and F1-score.
arXiv Detail & Related papers (2025-02-27T00:28:43Z)
Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework [61.38174427966444]
Large Language Models (LLMs) are being used more and more extensively for automated evaluation in various scenarios. Previous studies have attempted to fine-tune open-source LLMs to replicate the evaluation explanations and judgments of powerful proprietary models. We propose a novel evaluation framework, ARJudge, that adaptively formulates evaluation criteria and synthesizes both text-based and code-driven analyses.
arXiv Detail & Related papers (2025-02-26T06:31:45Z)
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation [79.09622602860703]
We introduce InsightBench, a benchmark dataset with three key features. It consists of 100 datasets representing diverse business use cases such as finance and incident management. Unlike existing benchmarks focusing on answering single queries, InsightBench evaluates agents based on their ability to perform end-to-end data analytics.
arXiv Detail & Related papers (2024-07-08T22:06:09Z)
AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework [48.3060010653088]
We release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data. We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task.
arXiv Detail & Related papers (2024-03-19T09:45:33Z)
FinBen: A Holistic Financial Benchmark for Large Language Models [75.09474986283394]
FinBen is the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks. FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three novel open-source evaluation datasets for text summarization, question answering, and stock trading.
arXiv Detail & Related papers (2024-02-20T02:16:16Z)
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents [74.16170899755281]
We introduce AgentBoard, a pioneering comprehensive benchmark and accompanied open-source evaluation framework tailored to analytical evaluation of LLM agents.<n>AgentBoard offers a fine-grained progress rate metric that captures incremental advancements as well as a comprehensive evaluation toolkit.<n>This not only sheds light on the capabilities and limitations of LLM agents but also propels the interpretability of their performance to the forefront.
arXiv Detail & Related papers (2024-01-24T01:51:00Z)
FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models [26.99936434072108]
textttFinDABench is a benchmark designed to evaluate the financial data analysis capabilities of Large Language Models. textttFinDABench aims to provide a measure for in-depth analysis of LLM abilities.
arXiv Detail & Related papers (2024-01-01T15:26:23Z)
Enhancing Financial Sentiment Analysis via Retrieval Augmented Large Language Models [11.154814189699735]
Large Language Models (LLMs) pre-trained on extensive corpora have demonstrated superior performance across various NLP tasks. We introduce a retrieval-augmented LLMs framework for financial sentiment analysis. Our approach achieves 15% to 48% performance gain in accuracy and F1 score.
arXiv Detail & Related papers (2023-10-06T05:40:23Z)
Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks. We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.