Related papers: AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework

URL: http://arxiv.org/abs/2403.12582v1
Date: Tue, 19 Mar 2024 09:45:33 GMT
Title: AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework
Authors: Xiang Li, Zhenyu Li, Chen Shi, Yong Xu, Qing Du, Mingkui Tan, Jun Huang, Wei Lin
Abstract summary: We release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data. We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task.
Score: 48.3060010653088
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The task of financial analysis primarily encompasses two key areas: stock trend prediction and the corresponding financial question answering. Currently, machine learning and deep learning algorithms (ML&DL) have been widely applied for stock trend predictions, leading to significant progress. However, these methods fail to provide reasons for predictions, lacking interpretability and reasoning processes. Also, they cannot integrate textual information such as financial news or reports. Meanwhile, large language models (LLMs) have remarkable textual understanding and generation ability. But due to the scarcity of financial training datasets and limited integration with real-time knowledge, LLMs still suffer from hallucinations and are unable to keep up with the latest information. To tackle these challenges, we first release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data. It has a positive impact on training LLMs for completing financial analysis. We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task, which integrates retrieval-augmented generation (RAG) techniques. Extensive experiments are conducted to demonstrate the effectiveness of our framework on financial analysis.

Related papers

FinKario: Event-Enhanced Automated Construction of Financial Knowledge Graph [15.846545572924834]
Large language models (LLMs) can enhance investors' decision-making capabilities and strengthen financial analysis. We introduce the Event-Enhanced Automated Construction of Financial Knowledge Graph (FinKario). FinKario automatically integrates real-time company fundamentals and market events through prompt-driven extraction. We propose a Two-Stage, Graph-Based retrieval strategy (FinKario-RAG) to optimize the retrieval of evolving, large-scale financial knowledge.
arXiv Detail & Related papers (2025-08-01T13:27:35Z)
Bridging Language Models and Financial Analysis [49.361943182322385]
The rapid advancements in Large Language Models (LLMs) have unlocked transformative possibilities in natural language processing. Financial data is often embedded in intricate relationships across textual content, numerical tables, and visual charts. Despite the fast pace of innovation in LLM research, there remains a significant gap in their practical adoption within the finance industry.
arXiv Detail & Related papers (2025-03-14T01:35:20Z)
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance [32.516564836540745]
Large language models (LLMs) have shown strong general reasoning capabilities, but their effectiveness in financial reasoning remains underexplored. We evaluate 24 state-of-the-art general and reasoning-focused LLMs across four complex financial reasoning tasks. We propose two domain-adapted models, Fino1-8B and FinoB, trained with chain-of-thought (CoT) fine-tuning and reinforcement learning.
arXiv Detail & Related papers (2025-02-12T05:13:04Z)
Auto-Generating Earnings Report Analysis via a Financial-Augmented LLM [1.3597551064547502]
This paper presents a novel challenge: developing an LLM specifically for automating the generation of earnings reports analysis. Our methodology involves an in-depth analysis of existing earnings reports followed by a unique approach to fine-tune an LLM for this purpose. With extensive financial documents, we construct financial instruction data, enabling the refined adaptation of our LLM to financial contexts.
arXiv Detail & Related papers (2024-12-11T08:09:42Z)
Large Language Models for Financial Aid in Financial Time-series Forecasting [0.4218593777811082]
Time series forecasting in financial aid is difficult due to limited historical datasets and high dimensional financial information. We use state-of-the-art time series models including pre-trained LLMs (GPT-2 as the backbone), transformers, and linear models to demonstrate their ability to outperform traditional approaches.
arXiv Detail & Related papers (2024-10-24T12:41:47Z)
Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Model (LLM) embeddings. Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z)
FinBen: A Holistic Financial Benchmark for Large Language Models [75.09474986283394]
FinBen is the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks. FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three novel open-source evaluation datasets for text summarization, question answering, and stock trading.
arXiv Detail & Related papers (2024-02-20T02:16:16Z)
FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models [26.99936434072108]
FinDABench is a benchmark designed to evaluate the financial data analysis capabilities of Large Language Models. FinDABench aims to provide a measure for in-depth analysis of LLM abilities.
arXiv Detail & Related papers (2024-01-01T15:26:23Z)
FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models [32.7825479037623]
FinPT is a novel approach for financial risk prediction that conducts Profile Tuning on large pretrained foundation models. FinBench is a set of high-quality datasets on financial risks such as default, fraud, and churn.
arXiv Detail & Related papers (2023-07-22T09:27:05Z)
FinGPT: Democratizing Internet-scale Data for Financial Large Language Models [35.83244096535722]
Large language models (LLMs) have demonstrated remarkable proficiency in understanding and generating human-like texts. Financial Generative Pre-trained Transformer (FinGPT) automates the collection and curation of real-time financial data from 34 diverse sources on the Internet. FinGPT aims to democratize FinLLMs, stimulate innovation, and unlock new opportunities in open finance.
arXiv Detail & Related papers (2023-07-19T22:43:57Z)
Instruct-FinGPT: Financial Sentiment Analysis by Instruction Tuning of General-Purpose Large Language Models [18.212210748797332]
We introduce a simple yet effective instruction tuning approach to address these issues. In the experiment, our approach outperforms state-of-the-art supervised sentiment analysis models.
arXiv Detail & Related papers (2023-06-22T03:56:38Z)
PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance [63.51545277822702]
PIXIU is a comprehensive framework including the first financial large language model (LLM) based on fine-tuning LLaMA with instruction data. We propose FinMA by fine-tuning LLaMA with the constructed dataset to be able to follow instructions for various financial tasks. We conduct a detailed analysis of FinMA and several existing LLMs, uncovering their strengths and weaknesses in handling critical financial tasks.
arXiv Detail & Related papers (2023-06-08T14:20:29Z)
Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines. We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z)
FinQA: A Dataset of Numerical Reasoning over Financial Data [52.7249610894623]
We focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents. We propose a new large-scale dataset, FinQA, with Question-Answering pairs over Financial reports, written by financial experts. The results demonstrate that popular, large, pre-trained models fall far short of expert humans in acquiring finance knowledge.
arXiv Detail & Related papers (2021-09-01T00:08:14Z)
