Natural Answer Generation: From Factoid Answer to Full-length Answer
using Grammar Correction
- URL: http://arxiv.org/abs/2112.03849v1
- Date: Tue, 7 Dec 2021 17:39:21 GMT
- Title: Natural Answer Generation: From Factoid Answer to Full-length Answer
using Grammar Correction
- Authors: Manas Jain, Sriparna Saha, Pushpak Bhattacharyya, Gladvin Chinnadurai,
Manish Kumar Vatsa
- Abstract summary: This paper proposes a system that outputs a full-length answer given a question and the extracted factoid answer as the input.
A transformer-based Grammar Error Correction model, GECToR (2020), is used as a post-processing step for better fluency.
We compare our system with (i) Modified Pointer Generator (SOTA) and (ii) Fine-tuned DialoGPT for factoid questions.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Question Answering systems these days typically use template-based language
generation. Though adequate for a domain-specific task, these systems are too
restrictive and predefined for domain-independent systems. This paper proposes
a system that outputs a full-length answer given a question and the extracted
factoid answer (short spans such as named entities) as the input. Our system
uses constituency and dependency parse trees of questions. A transformer-based
Grammar Error Correction model, GECToR (2020), is used as a post-processing step
for better fluency. We compare our system with (i) Modified Pointer Generator
(SOTA) and (ii) Fine-tuned DialoGPT for factoid questions. We also test our
approach on existential (yes-no) questions, with better results. Our model
generates more accurate and fluent answers than the state-of-the-art (SOTA)
approaches. The evaluation is done on the NewsQA and SQuAD datasets, with
increments of 0.4 and 0.9 percentage points in ROUGE-1 score, respectively.
The inference time is also reduced by 85% compared to the SOTA. The improved
datasets used for our evaluation will be released as part of the research
contribution.
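For orientation only, here is a minimal Python sketch of the pipeline the abstract describes: a question and its extracted factoid answer go in, a full-length declarative answer comes out, with a grammar-correction pass at the end. The rule-based wh-word substitution and the `grammar_correct` placeholder are illustrative assumptions; they do not reproduce the paper's constituency/dependency parse-tree transformation or the GECToR model.

```python
import re

def factoid_to_full_answer(question: str, factoid: str) -> str:
    """Naive stand-in for the paper's parse-tree-based transformation:
    substitute the factoid answer for the leading wh-word of the question."""
    q = question.strip().rstrip("?")
    m = re.match(r"(?i)^(who|what|when|where|which)\s+(.*)$", q)
    if m:
        # e.g. "Who wrote Hamlet?" + "Shakespeare" -> "Shakespeare wrote Hamlet."
        return f"{factoid} {m.group(2)}."
    # Fallback: return the factoid as a bare sentence.
    return f"{factoid}."

def grammar_correct(sentence: str) -> str:
    """Placeholder for the GECToR post-processing step: here it only
    normalizes whitespace and capitalization, whereas the real model is a
    transformer-based sequence tagger."""
    s = re.sub(r"\s+", " ", sentence).strip()
    return s[0].upper() + s[1:] if s else s

print(grammar_correct(factoid_to_full_answer("Who wrote Hamlet?", "Shakespeare")))
```

This only handles subject wh-questions; the paper's parse-tree approach is what makes the transformation general enough for object questions and existential (yes-no) questions.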
Related papers
- RAG-ConfusionQA: A Benchmark for Evaluating LLMs on Confusing Questions [52.33835101586687]
Conversational AI agents use Retrieval Augmented Generation (RAG) to provide verifiable document-grounded responses to user inquiries.
This paper presents a novel synthetic data generation method to efficiently create a diverse set of context-grounded confusing questions from a given document corpus.
arXiv Detail & Related papers (2024-10-18T16:11:29Z) - Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z) - Automatic Speech Recognition System-Independent Word Error Rate Estimation [23.25173244408922]
Word error rate (WER) is a metric used to evaluate the quality of transcriptions produced by Automatic Speech Recognition (ASR) systems.
In this paper, a hypothesis generation method for ASR System-Independent WER estimation is proposed.
arXiv Detail & Related papers (2024-04-25T16:57:05Z) - VANiLLa : Verbalized Answers in Natural Language at Large Scale [2.9098477555578333]
This dataset consists of over 100k simple questions adapted from the CSQA and SimpleQuestionsWikidata datasets.
The answer sentences in this dataset are syntactically and semantically closer to the question than to the triple fact.
arXiv Detail & Related papers (2021-05-24T16:57:54Z) - Stacking Neural Network Models for Automatic Short Answer Scoring [0.0]
We propose a stacking model based on neural networks and XGBoost for the classification process, using sentence-embedding features.
The best model obtained an F1-score of 0.821, exceeding previous work on the same dataset.
arXiv Detail & Related papers (2020-10-21T16:00:09Z) - Sequence-to-Sequence Learning for Indonesian Automatic Question Generator [0.0]
We construct an Indonesian automatic question generator, adapting the architecture from some previous works.
The system achieved BLEU1, BLEU2, BLEU3, BLEU4, and ROUGE-L scores of 38.35, 20.96, 10.68, 5.78, and 43.4 for SQuAD, and 39.9, 20.78, 10.26, 6.31, and 44.13 for TyDiQA.
arXiv Detail & Related papers (2020-09-29T09:25:54Z) - The Paradigm Discovery Problem [121.79963594279893]
We formalize the paradigm discovery problem and develop metrics for judging systems.
We report empirical results on five diverse languages.
Our code and data are available for public use.
arXiv Detail & Related papers (2020-05-04T16:38:54Z) - KPQA: A Metric for Generative Question Answering Using Keyphrase Weights [64.54593491919248]
KPQA is a new metric for evaluating the correctness of generative question answering systems.
Our new metric assigns different weights to each token via keyphrase prediction.
We show that our proposed metric has a significantly higher correlation with human judgments than existing metrics.
arXiv Detail & Related papers (2020-05-01T03:24:36Z) - AMR Parsing via Graph-Sequence Iterative Inference [62.85003739964878]
We propose a new end-to-end model that treats AMR parsing as a series of dual decisions on the input sequence and the incrementally constructed graph.
We show that the answers to these two questions are mutually dependent.
We design a model based on iterative inference that helps achieve better answers in both perspectives, leading to greatly improved parsing accuracy.
arXiv Detail & Related papers (2020-04-12T09:15:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.