PACIFIC: Towards Proactive Conversational Question Answering over
Tabular and Textual Data in Finance
- URL: http://arxiv.org/abs/2210.08817v2
- Date: Sun, 19 Mar 2023 03:39:16 GMT
- Title: PACIFIC: Towards Proactive Conversational Question Answering over
Tabular and Textual Data in Finance
- Authors: Yang Deng, Wenqiang Lei, Wenxuan Zhang, Wai Lam, Tat-Seng Chua
- Abstract summary: We present a new dataset, named PACIFIC. Compared with existing CQA datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical reasoning, and (iii) hybrid context of tables and text.
A new task is defined accordingly to study Proactive Conversational Question Answering (PCQA), which combines clarification question generation and CQA.
UniPCQA performs multi-task learning over all sub-tasks in PCQA and incorporates a simple ensemble strategy that alleviates error propagation in multi-task learning by cross-validating top-$k$ sampled Seq2Seq outputs.
- Score: 96.06505049126345
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To facilitate conversational question answering (CQA) over hybrid contexts in
finance, we present a new dataset, named PACIFIC. Compared with existing CQA
datasets, PACIFIC exhibits three key features: (i) proactivity, (ii) numerical
reasoning, and (iii) hybrid context of tables and text. A new task is defined
accordingly to study Proactive Conversational Question Answering (PCQA), which
combines clarification question generation and CQA. In addition, we propose a
novel method, namely UniPCQA, which casts the hybrid inputs and outputs of PCQA
as a unified Seq2Seq problem, including reformulating the numerical reasoning
process as code generation. UniPCQA performs multi-task learning over all
sub-tasks in PCQA and incorporates a simple ensemble strategy that alleviates
error propagation in multi-task learning by cross-validating top-$k$ sampled
Seq2Seq outputs. We benchmark the PACIFIC
dataset with extensive baselines and provide comprehensive evaluations on each
sub-task of PCQA.
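To make the "numerical reasoning as code generation" idea and the top-$k$ ensemble concrete, below is a minimal Python sketch. It is illustrative only, not the authors' implementation: it assumes the fine-tuned Seq2Seq model has already decoded $k$ candidate derivation programs for one question over a financial table, executes each candidate, and keeps the answer that the most candidates agree on.

```python
# Illustrative sketch only (hypothetical helper names, not UniPCQA's code):
# execute top-k sampled derivation programs and vote over their results.
from collections import Counter

def execute_program(program: str):
    """Evaluate a generated arithmetic derivation, e.g. '(1200 - 1000) / 1000'."""
    try:
        # Restrict eval to plain arithmetic; a real system would use a safer parser.
        return round(eval(program, {"__builtins__": {}}), 4)
    except Exception:
        return None  # unexecutable candidates are simply discarded

def vote_over_candidates(programs):
    """Return the numeric answer produced by the largest number of candidates."""
    results = [r for r in map(execute_program, programs) if r is not None]
    if not results:
        return None
    answer, _count = Counter(results).most_common(1)[0]
    return answer

# Hypothetical top-k samples decoded for "What is the revenue growth rate?":
candidates = ["(1200 - 1000) / 1000", "(1200 - 1000) / 1000", "1200 / 1000 - 2"]
print(vote_over_candidates(candidates))  # -> 0.2
```

Voting over executed results rather than raw strings lets candidates that differ textually but compute the same value reinforce each other, which is one plausible reading of the cross-validation ensemble described in the abstract.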
Related papers
- MFORT-QA: Multi-hop Few-shot Open Rich Table Question Answering [3.1651118728570635]
In today's fast-paced industry, professionals face the challenge of summarizing a large number of documents and extracting vital information from them on a daily basis.
To address this challenge, Table Question Answering (QA) approaches have been developed to extract the relevant information.
Recent advancements in Large Language Models (LLMs) have opened up new possibilities for extracting information from tabular data using prompts.
arXiv Detail & Related papers (2024-03-28T03:14:18Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- Summarizing Community-based Question-Answer Pairs [5.680726650578754]
We propose a novel CQA summarization task that aims to create a concise summary from CQA pairs.
Our data and code are publicly available.
arXiv Detail & Related papers (2022-11-17T21:09:41Z)
- QASem Parsing: Text-to-text Modeling of QA-based Semantics [19.42681342441062]
We consider three QA-based semantic tasks, namely, QA-SRL, QANom and QADiscourse.
We release the first unified QASem parsing tool, practical for downstream applications.
arXiv Detail & Related papers (2022-05-23T15:56:07Z)
- TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance [71.76018597965378]
We build a new large-scale Question Answering dataset containing both Tabular And Textual data, named TAT-QA.
We propose a novel QA model termed TAGOP, which is capable of reasoning over both tables and text.
arXiv Detail & Related papers (2021-05-17T06:12:06Z)
- FeTaQA: Free-form Table Question Answering [33.018256483762386]
We introduce FeTaQA, a new dataset with 10K Wikipedia-based {table, question, free-form answer, supporting table cells} pairs.
FeTaQA yields a more challenging table question answering setting because it requires generating free-form text answers after retrieval, inference, and integration of multiple discontinuous facts from a structured knowledge source.
arXiv Detail & Related papers (2021-04-01T09:59:40Z)
- TSQA: Tabular Scenario Based Question Answering [14.92495213480887]
Scenario-based question answering (SQA) has attracted increasing research interest.
To support the study of this task, we construct GeoTSQA.
We extend state-of-the-art MRC methods with TTGen, a novel table-to-text generator.
arXiv Detail & Related papers (2021-01-14T02:00:33Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
- KQA Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering over Knowledge Base [67.87878113432723]
We introduce KQA Pro, a dataset for Complex KBQA including 120K diverse natural language questions.
For each question, we provide the corresponding KoPL program and SPARQL query, so that KQA Pro serves both KBQA and semantic parsing tasks.
arXiv Detail & Related papers (2020-07-08T03:28:04Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts (a generic form of this kind of objective is sketched below).
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
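As a rough illustration of what a conditional VAE for QA-pair generation optimizes, a generic conditional evidence lower bound over a context $c$, question $q$, answer $a$, and latent variable $z$ can be written as
$$\log p_\theta(q, a \mid c) \;\ge\; \mathbb{E}_{q_\phi(z \mid q, a, c)}\big[\log p_\theta(q, a \mid z, c)\big] - \mathrm{KL}\big(q_\phi(z \mid q, a, c)\,\|\,p_\psi(z \mid c)\big),$$
where the prior $p_\psi(z \mid c)$ is conditioned on the context alone. This is a schematic form only, not the paper's exact hierarchical factorization or its information-maximizing regularizer.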