FinQA: A Dataset of Numerical Reasoning over Financial Data
- URL: http://arxiv.org/abs/2109.00122v1
- Date: Wed, 1 Sep 2021 00:08:14 GMT
- Title: FinQA: A Dataset of Numerical Reasoning over Financial Data
- Authors: Zhiyu Chen, Wenhu Chen, Charese Smiley, Sameena Shah, Iana Borova,
Dylan Langdon, Reema Moussa, Matt Beane, Ting-Hao Huang, Bryan Routledge,
William Yang Wang
- Abstract summary: We focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents.
We propose a new large-scale dataset, FinQA, with Question-Answering pairs over Financial reports, written by financial experts.
The results demonstrate that popular, large, pre-trained models fall far short of expert humans in acquiring finance knowledge.
- Score: 52.7249610894623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The sheer volume of financial statements makes it difficult for humans to
access and analyze a business's financials. Robust numerical reasoning likewise
faces unique challenges in this domain. In this work, we focus on answering
deep questions over financial data, aiming to automate the analysis of a large
corpus of financial documents. In contrast to existing tasks on general domain,
the finance domain includes complex numerical reasoning and understanding of
heterogeneous representations. To facilitate analytical progress, we propose a
new large-scale dataset, FinQA, with Question-Answering pairs over Financial
reports, written by financial experts. We also annotate the gold reasoning
programs to ensure full explainability. We further introduce baselines and
conduct comprehensive experiments in our dataset. The results demonstrate that
popular, large, pre-trained models fall far short of expert humans in acquiring
finance knowledge and in complex multi-step numerical reasoning on that
knowledge. Our dataset -- the first of its kind -- should therefore enable
significant, new community research into complex application domains. The
dataset and code are publicly available\url{https://github.com/czyssrs/FinQA}.
Related papers
- SEC-QA: A Systematic Evaluation Corpus for Financial QA [12.279234447220155]
Existing datasets are often constrained by size, context, or relevance to practical applications.
We propose SEC-QA, a continuous dataset generation framework with two key features.
We introduce a QA system based on program-of-thought that improves the ability to perform complex information retrieval and quantitative reasoning pipelines.
arXiv Detail & Related papers (2024-06-20T15:12:41Z) - A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges [60.546677053091685]
Large language models (LLMs) have unlocked novel opportunities for machine learning applications in the financial domain.
We explore the application of LLMs on various financial tasks, focusing on their potential to transform traditional practices and drive innovation.
We highlight this survey for categorizing the existing literature into key application areas, including linguistic tasks, sentiment analysis, financial time series, financial reasoning, agent-based modeling, and other applications.
arXiv Detail & Related papers (2024-06-15T16:11:35Z) - AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework [48.3060010653088]
We release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data.
We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task.
arXiv Detail & Related papers (2024-03-19T09:45:33Z) - FinBen: A Holistic Financial Benchmark for Large Language Models [75.09474986283394]
FinBen is the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks.
FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three novel open-source evaluation datasets for text summarization, question answering, and stock trading.
arXiv Detail & Related papers (2024-02-20T02:16:16Z) - FinLLMs: A Framework for Financial Reasoning Dataset Generation with
Large Language Models [12.367548338910744]
FinLLMs is a method for generating financial question-answering data based on common financial formulas using Large Language Models.
Our experiments demonstrate that synthetic data generated by FinLLMs effectively enhances the performance of several large-scale numerical reasoning models in the financial domain.
arXiv Detail & Related papers (2024-01-19T15:09:39Z) - BizBench: A Quantitative Reasoning Benchmark for Business and Finance [7.4673182865000225]
BizBench is a benchmark for evaluating models' ability to reason about realistic financial problems.
We include three financially-themed code-generation tasks from newly collected and augmented QA data.
These tasks evaluate a model's financial background knowledge, ability to parse financial documents, and capacity to solve problems with code.
arXiv Detail & Related papers (2023-11-11T16:16:11Z) - PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark
for Finance [63.51545277822702]
PIXIU is a comprehensive framework including the first financial large language model (LLMs) based on fine-tuning LLaMA with instruction data.
We propose FinMA by fine-tuning LLaMA with the constructed dataset to be able to follow instructions for various financial tasks.
We conduct a detailed analysis of FinMA and several existing LLMs, uncovering their strengths and weaknesses in handling critical financial tasks.
arXiv Detail & Related papers (2023-06-08T14:20:29Z) - REFinD: Relation Extraction Financial Dataset [7.207699035400335]
We propose REFinD, the first large-scale annotated dataset of relations, with $sim$29K instances and 22 relations amongst 8 types of entity pairs, generated entirely over financial documents.
We observed that various state-of-the-art deep learning models struggle with numeric inference, relational and directional ambiguity.
arXiv Detail & Related papers (2023-05-22T22:40:11Z) - ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational
Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.