ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational
Finance Question Answering
- URL: http://arxiv.org/abs/2210.03849v1
- Date: Fri, 7 Oct 2022 23:48:50 GMT
- Authors: Zhiyu Chen, Shiyang Li, Charese Smiley, Zhiqiang Ma, Sameena Shah,
William Yang Wang
- Abstract summary: We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
- Score: 70.6359636116848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the recent advance in large pre-trained language models, researchers
have achieved record performances in NLP tasks that mostly focus on language
pattern matching. The community is experiencing a shift of the challenge from
how to model language to how to imitate complex, human-like reasoning
abilities. In this work, we investigate the application domain of finance
that involves real-world, complex numerical reasoning. We propose a new
large-scale dataset, ConvFinQA, aiming to study the chain of numerical
reasoning in conversational question answering. Our dataset poses a great
challenge in modeling long-range, complex numerical reasoning paths in
real-world conversations. We conduct comprehensive experiments and analyses
with both the neural symbolic methods and the prompting-based methods, to
provide insights into the reasoning mechanisms of these two divisions. We
believe our new dataset should serve as a valuable resource to push forward the
exploration of real-world, complex reasoning tasks as the next research focus.
Our dataset and code are publicly available at
https://github.com/czyssrs/ConvFinQA.
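In ConvFinQA (as in its predecessor FinQA), answers are expressed as small programs of arithmetic operations whose arguments may reference the results of earlier steps. The evaluator below is an illustrative sketch, not the authors' code; the operation names follow the FinQA-style DSL, while the step format and `#i` back-reference convention are modeled on it under assumption.

```python
# Minimal sketch of evaluating a ConvFinQA-style reasoning program.
# Each step is (op, arg1, arg2); an argument "#i" refers to the result
# of step i, anything else is parsed as a numeric literal.

OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}

def eval_program(steps):
    """Evaluate a list of (op, arg1, arg2) steps and return the final result."""
    results = []
    for op, a, b in steps:
        def resolve(x):
            # "#i" back-references an earlier step's result
            if isinstance(x, str) and x.startswith("#"):
                return results[int(x[1:])]
            return float(x)
        results.append(OPS[op](resolve(a), resolve(b)))
    return results[-1]

# A two-turn conversation: "what was the change in revenue?" followed by
# "what percentage change does that represent?" (numbers are hypothetical)
change = eval_program([("subtract", "206.0", "181.0")])      # 25.0
pct = eval_program([("subtract", "206.0", "181.0"),
                    ("divide", "#0", "181.0")])              # ~0.1381
```

The long-range dependency the abstract highlights shows up here as the `#0` reference: a follow-up question's program must reuse a quantity computed for an earlier turn rather than restate it.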
Related papers
- Optimizing Language Model's Reasoning Abilities with Weak Supervision [48.60598455782159]
We present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales.
A unique aspect of our dataset is the inclusion of 10,000 unannotated questions, enabling us to explore utilizing less supervised data to boost LLMs' inference capabilities.
arXiv Detail & Related papers (2024-05-07T07:39:15Z)
- MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free text narratives corresponding to real-world domains of reasoning.
arXiv Detail & Related papers (2023-10-24T17:59:20Z)
- Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review [2.4185510826808487]
Deep learning has revolutionized natural language processing (NLP) by enabling the development of models that can learn complex representations of language data.
Deep learning models for NLP typically use large amounts of data to train deep neural networks, allowing them to learn the patterns and relationships in language data.
Applying deep learning to text summarization refers to the use of deep neural networks to perform text summarization tasks.
arXiv Detail & Related papers (2023-10-13T21:24:37Z)
- What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models [22.0839948292609]
We introduce a novel dataset, C-VQA, specifically designed to test the counterfactual reasoning capabilities of modern language models.
This dataset is constructed by infusing original questions with counterfactual presuppositions of various types, such as numerical and counter-language queries.
Our evaluations of contemporary vision models using this dataset have revealed substantial performance drops, with some models showing up to a 40% decrease.
arXiv Detail & Related papers (2023-10-10T13:45:59Z)
- Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors [58.340159346749964]
We propose a new neural-symbolic method to support end-to-end learning using complex queries with provable reasoning capability.
We develop a new dataset containing ten new types of queries with features that have never been considered.
Our method significantly outperforms previous methods on the new dataset while also surpassing them on the existing dataset.
arXiv Detail & Related papers (2023-04-14T11:35:35Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- FinQA: A Dataset of Numerical Reasoning over Financial Data [52.7249610894623]
We focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents.
We propose a new large-scale dataset, FinQA, with question-answering pairs over financial reports, written by financial experts.
The results demonstrate that popular, large, pre-trained models fall far short of expert humans in acquiring finance knowledge.
arXiv Detail & Related papers (2021-09-01T00:08:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.