Activity report analysis with automatic single or multispan answer extraction
- URL: http://arxiv.org/abs/2209.09316v1
- Date: Fri, 9 Sep 2022 06:33:29 GMT
- Title: Activity report analysis with automatic single or multispan answer extraction
- Authors: Ravi Choudhary, Arvind Krishna Sridhar, Erik Visser
- Abstract summary: We create a new smart home environment dataset comprising questions paired with single-span or multi-span answers, depending on the question and the context queried.
Our experiments show that the proposed model outperforms state-of-the-art QA models on our dataset.
- Score: 0.21485350418225244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the era of IoT (Internet of Things) we are surrounded by a plethora of AI-enabled devices that can transcribe images, video, audio, and sensor signals into text descriptions. When such transcriptions are captured in activity reports for monitoring, life logging, and anomaly detection applications, a user would typically request a summary or ask targeted questions about the sections of the report they are interested in. Depending on the context and the type of question asked, a question answering (QA) system needs to automatically determine whether the answer covers single-span or multi-span text components. Currently available QA datasets either focus primarily on single-span responses (such as SQuAD[4]) or contain only a low proportion of examples with multi-span answers (such as DROP[3]). To investigate the automatic selection of single/multi-span answers in the use case described, we created a new smart home environment dataset comprising questions paired with single-span or multi-span answers, depending on the question and the context queried. In addition, we propose a RoBERTa[6]-based multiple span extraction question answering (MSEQA) model that returns the appropriate answer span(s) for a given question. Our experiments show that the proposed model outperforms state-of-the-art QA models on our dataset while providing comparable performance on published individual single/multi-span task datasets.
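The abstract does not spell out the MSEQA architecture, but a common formulation that handles both single- and multi-span extraction with one model is BIO-style token tagging over the question/context pair. The sketch below illustrates that approach with an off-the-shelf RoBERTa token-classification head; the checkpoint, label scheme, and example data are illustrative assumptions, not the authors' implementation, and the head would need fine-tuning on span-labeled QA data before its outputs are meaningful.

```python
# Minimal sketch: single/multi-span answer extraction via BIO token tagging.
# NOTE: roberta-base is untrained for this task; in practice the classification
# head is fine-tuned on span-labeled QA data first. The example is made up.
import torch
from transformers import RobertaTokenizerFast, RobertaForTokenClassification

LABELS = ["O", "B", "I"]  # outside / begins a span / inside a span

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForTokenClassification.from_pretrained(
    "roberta-base", num_labels=len(LABELS)
)

question = "Which doors were opened this morning?"
context = "The front door opened at 7:02 and the garage door opened at 7:45."

enc = tokenizer(question, context, return_tensors="pt",
                return_offsets_mapping=True)
offsets = enc.pop("offset_mapping")[0].tolist()
seq_ids = enc.sequence_ids(0)  # None/0 = special/question tokens, 1 = context

with torch.no_grad():
    tags = model(**enc).logits.argmax(-1)[0].tolist()

# Merge contiguous tagged runs over context tokens into character spans:
# one run yields a single-span answer, several runs a multi-span answer.
answers, current = [], None
for i, tag_id in enumerate(tags):
    if seq_ids[i] == 1 and LABELS[tag_id] != "O":
        start, end = offsets[i]
        current = (current[0], end) if current else (start, end)
    elif current:
        answers.append(context[current[0]:current[1]])
        current = None
if current:
    answers.append(context[current[0]:current[1]])

print(answers)  # e.g. ["front door", "garage door"] once fine-tuned
```

One property of this formulation is that no separate classifier is needed to decide up front whether a question calls for a single- or multi-span answer: the number of decoded runs determines it.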
Related papers
- Multi-LLM QA with Embodied Exploration [55.581423861790945]
We investigate the use of Multi-Embodied LLM Explorers (MELE) for question-answering in an unknown environment.
Multiple LLM-based agents independently explore and then answer queries about a household environment.
We analyze different aggregation methods to generate a single, final answer for each query.
arXiv Detail & Related papers (2024-06-16T12:46:40Z)
- A Dataset of Open-Domain Question Answering with Multiple-Span Answers [11.291635421662338]
Multi-span answer extraction, also known as the task of multi-span question answering (MSQA), is critical for real-world applications.
There is a notable lack of publicly available MSQA benchmarks in Chinese.
We present CLEAN, a comprehensive Chinese multi-span question answering dataset.
arXiv Detail & Related papers (2024-02-15T13:03:57Z)
- Long-form Question Answering: An Iterative Planning-Retrieval-Generation Approach [28.849548176802262]
Long-form question answering (LFQA) poses a challenge as it involves generating detailed answers in the form of paragraphs.
We propose an LFQA model with iterative Planning, Retrieval, and Generation.
We find that our model outperforms the state-of-the-art models on various textual and factual metrics for the LFQA task.
arXiv Detail & Related papers (2023-11-15T21:22:27Z)
- SEMQA: Semi-Extractive Multi-Source Question Answering [94.04430035121136]
We introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion.
We create the first dataset of this kind, QuoteSum, with human-written semi-extractive answers to natural and generated questions.
arXiv Detail & Related papers (2023-11-08T18:46:32Z)
- UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z)
- LIQUID: A Framework for List Question Answering Dataset Generation [17.86721740779611]
We propose LIQUID, an automated framework for generating list QA datasets from unlabeled corpora.
We first convert a passage from Wikipedia or PubMed into a summary and extract named entities from the summarized text as candidate answers.
We then create questions using an off-the-shelf question generator with the extracted entities and the original passage (a sketch of this pipeline appears after this list).
Using our synthetic data, we significantly improve the performance of the previous best list QA models by exact-match F1 scores of 5.0 on MultiSpanQA, 1.9 on Quoref, and 2.8 averaged across three BioASQ benchmarks.
arXiv Detail & Related papers (2023-02-03T12:42:45Z)
- MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization [55.60306377044225]
State-of-the-art summarization systems can generate highly fluent summaries.
These summaries, however, may contain factual inconsistencies and/or information not present in the source.
We introduce an alternative scheme based on standard information-theoretic measures in which the information present in the source and summary is directly compared (a worked example of such a comparison follows this list).
arXiv Detail & Related papers (2023-01-28T23:08:25Z)
- GooAQ: Open Question Answering with Diverse Answer Types [63.06454855313667]
We present GooAQ, a large-scale dataset with a variety of answer types.
This dataset contains over 5 million questions and 3 million answers collected from Google.
arXiv Detail & Related papers (2021-04-18T05:40:39Z)
- ParaQA: A Question Answering Dataset with Paraphrase Responses for Single-Turn Conversation [5.087932295628364]
ParaQA is a dataset with multiple paraphrased responses for single-turn conversation over knowledge graphs (KGs).
The dataset was created using a semi-automated framework for generating diverse paraphrasing of the answers using techniques such as back-translation.
arXiv Detail & Related papers (2021-03-13T18:53:07Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
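Picking up the forward reference in the LIQUID entry above: its summary describes a three-step pipeline (summarize a passage, extract named entities as candidate answers, generate questions). The sketch below approximates that flow with generic Hugging Face pipelines; the checkpoints and the grouping of same-type entities into a single list answer are assumptions for illustration, not the paper's exact setup.

```python
# Hedged sketch of a LIQUID-style list-QA data generation pipeline:
# summarize -> extract entities as candidate answers -> generate questions.
from collections import defaultdict
from transformers import pipeline

summarizer = pipeline("summarization")                # default distilbart checkpoint
ner = pipeline("ner", aggregation_strategy="simple")  # default CoNLL-03 tagger
# Assumed checkpoint: a community T5 model fine-tuned to turn
# "answer: ... context: ..." prompts into questions.
question_gen = pipeline(
    "text2text-generation",
    model="mrm8488/t5-base-finetuned-question-generation-ap",
)

passage = (
    "Marie Curie and Pierre Curie shared the 1903 Nobel Prize in Physics "
    "with Henri Becquerel for their research on radiation phenomena."
)

# Step 1: summarize the passage.
summary = summarizer(passage, max_length=40, min_length=10)[0]["summary_text"]

# Step 2: extract named entities from the summary as candidate answers,
# grouped by entity type so each group can form one list answer.
groups = defaultdict(list)
for ent in ner(summary):
    groups[ent["entity_group"]].append(ent["word"])

# Step 3: generate one question per entity group from the original passage.
for ent_type, answers in groups.items():
    prompt = f"answer: {', '.join(answers)}  context: {passage}"
    question = question_gen(prompt)[0]["generated_text"]
    print(f"{question} -> {answers}")
```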
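For the MQAG entry, the snippet says source and summary are compared with standard information-theoretic measures but does not name one. The example below uses KL divergence between the answer distributions a multiple-choice QA model might assign when conditioned on the source versus on the summary; the numbers are made up, and KL is only one plausible choice of measure.

```python
# Worked example (made-up numbers): if a multiple-choice QA model assigns a
# probability to each answer option once conditioned on the source and once
# on the summary, an information-theoretic distance between the two
# distributions signals inconsistency.
import math

def kl_divergence(p, q):
    """D_KL(p || q) in nats; assumes matching supports with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Answer distributions over the options of one generated question.
options   = ["7:02", "7:45", "noon", "unknown"]
p_source  = [0.80, 0.10, 0.05, 0.05]  # conditioned on the source
p_summary = [0.15, 0.70, 0.10, 0.05]  # conditioned on the summary

# A large divergence suggests the summary answers the question differently
# from the source, i.e. a likely factual inconsistency.
print(f"D_KL(source || summary) = {kl_divergence(p_source, p_summary):.3f} nats")
```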
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including this list) and is not responsible for any consequences of its use.