DocFinQA: A Long-Context Financial Reasoning Dataset
- URL: http://arxiv.org/abs/2401.06915v2
- Date: Thu, 29 Feb 2024 19:55:14 GMT
- Title: DocFinQA: A Long-Context Financial Reasoning Dataset
- Authors: Varshini Reddy, Rik Koncel-Kedziorski, Viet Dac Lai, Michael Krumdick,
Charles Lovering, Chris Tanner
- Abstract summary: We introduce a long-document financial QA task.
We extend the average context length from under 700 words in FinQA to 123k words in DocFinQA.
We conduct extensive experiments over retrieval-based QA pipelines and long-context language models.
- Score: 17.752081303855263
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: For large language models (LLMs) to be effective in the financial domain --
where each decision can have a significant impact -- it is necessary to
investigate realistic tasks and data. Financial professionals often interact
with documents that are hundreds of pages long, but most financial research
datasets only deal with short excerpts from these documents. To address this,
we introduce a long-document financial QA task. We augment 7,437 questions from
the existing FinQA dataset with the full-document context, extending the
average context length from under 700 words in FinQA to 123k words in DocFinQA.
We conduct extensive experiments over retrieval-based QA pipelines and
long-context language models. DocFinQA proves a significant challenge for even
state-of-the-art systems. We also provide a case study on the longest documents
in DocFinQA and find that models particularly struggle on these documents.
Addressing these challenges may have a wide-reaching impact across applications
where specificity and long-range contexts are critical, like gene sequences and
legal document contract analysis.
Related papers
- SEC-QA: A Systematic Evaluation Corpus for Financial QA [12.279234447220155]
Existing datasets are often constrained by size, context, or relevance to practical applications.
We propose SEC-QA, a continuous dataset generation framework with two key features.
We introduce a QA system based on program-of-thought that improves the ability to perform complex information retrieval and quantitative reasoning pipelines.
arXiv Detail & Related papers (2024-06-20T15:12:41Z)
- DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models [63.466265039007816]
We present DocGenome, a structured document benchmark constructed by annotating 500K scientific documents from 153 disciplines in the arXiv open-access community.
We conduct extensive experiments to demonstrate the advantages of DocGenome and objectively evaluate the performance of large models on our benchmark.
arXiv Detail & Related papers (2024-06-17T15:13:52Z)
- Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model [22.07414287186125]
Quest is a query-centric data method aggregating semantically relevant yet diverse documents.
It uses a generative model to predict potential queries for each document, grouping documents with similar queries and keywords.
Experiments demonstrate Quest's superior performance on long-context tasks, achieving remarkable results with context lengths of up to 1M tokens.
arXiv Detail & Related papers (2024-05-30T08:50:55Z)
- LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents [4.924255992661131]
We introduce LongFin, a multimodal document AI model capable of encoding up to 4K tokens.
We also propose the LongForms dataset that encapsulates several industrial challenges in financial documents.
arXiv Detail & Related papers (2024-01-26T18:23:45Z)
- PDFTriage: Question Answering over Long, Structured Documents [60.96667912964659]
Representing structured documents as plain text is incongruous with the user's mental model of these documents with rich structure.
We propose PDFTriage that enables models to retrieve the context based on either structure or content.
Our benchmark dataset consists of 900+ human-generated questions over 80 structured documents.
arXiv Detail & Related papers (2023-09-16T04:29:05Z)
- DAPR: A Benchmark on Document-Aware Passage Retrieval [57.45793782107218]
We propose and name this task Document-Aware Passage Retrieval (DAPR).
While analyzing the errors of the State-of-The-Art (SoTA) passage retrievers, we find the major errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z)
- Towards Complex Document Understanding By Discrete Reasoning [77.91722463958743]
Document Visual Question Answering (VQA) aims to understand visually-rich documents to answer questions in natural language.
We introduce a new Document VQA dataset, named TAT-DQA, which consists of 3,067 document pages and 16,558 question-answer pairs.
We develop a novel model named MHST that takes into account the information in multi-modalities, including text, layout and visual image, to intelligently address different types of questions.
arXiv Detail & Related papers (2022-07-25T01:43:19Z)
- FETILDA: An Effective Framework For Fin-tuned Embeddings For Long Financial Text Documents [14.269860621624394]
We propose and implement a deep learning framework that splits long documents into chunks and utilizes pre-trained LMs to process and aggregate the chunks into vector representations.
We evaluate our framework on a collection of 10-K public disclosure reports from US banks, and another dataset of reports submitted by US companies.
arXiv Detail & Related papers (2022-06-14T16:14:14Z)
- FinQA: A Dataset of Numerical Reasoning over Financial Data [52.7249610894623]
We focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents.
We propose a new large-scale dataset, FinQA, with Question-Answering pairs over Financial reports, written by financial experts.
The results demonstrate that popular, large, pre-trained models fall far short of expert humans in acquiring finance knowledge.
arXiv Detail & Related papers (2021-09-01T00:08:14Z)
- Explaining Relationships Between Scientific Documents [55.23390424044378]
We address the task of explaining relationships between two scientific documents using natural language text.
In this paper we establish a dataset of 622K examples from 154K documents.
arXiv Detail & Related papers (2020-02-02T03:54:47Z)