MeeQA: Natural Questions in Meeting Transcripts
- URL: http://arxiv.org/abs/2305.08502v1
- Date: Mon, 15 May 2023 10:02:47 GMT
- Title: MeeQA: Natural Questions in Meeting Transcripts
- Authors: Reut Apel, Tom Braude, Amir Kantor, Eyal Kolman
- Abstract summary: We present MeeQA, a dataset for natural-language question answering over meeting transcripts.
The dataset contains 48K question-answer pairs, extracted from 422 meeting transcripts.
- Score: 3.383670923637875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present MeeQA, a dataset for natural-language question answering over
meeting transcripts. It includes real questions asked during meetings by their
participants. The dataset contains 48K question-answer pairs, extracted from
422 meeting transcripts, spanning multiple domains. Questions in transcripts
pose a special challenge as they are not always clear, and considerable context
may be required in order to provide an answer. Further, many questions asked
during meetings are left unanswered. To improve baseline model performance on
this type of question, we also propose a novel loss function, \emph{Flat
Hierarchical Loss}, designed to enhance performance over questions with no
answer in the text. Our experiments demonstrate the advantage of using our
approach over standard QA models.
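The abstract names the \emph{Flat Hierarchical Loss} but does not define it here. Below is a minimal sketch of one plausible reading, assuming a two-level objective that pairs a binary answerability term with span-extraction terms masked on unanswerable questions; the function name, the `alpha` weight, and the masking scheme are illustrative assumptions, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def flat_hierarchical_loss(ans_logits, start_logits, end_logits,
                           has_answer, start_pos, end_pos, alpha=0.5):
    """Hypothetical two-level objective: a binary 'does the transcript
    contain an answer?' term plus span terms that apply only to
    answerable questions. The MeeQA paper's actual loss may differ."""
    # Level 1: answerability classification per question.
    cls_loss = F.binary_cross_entropy_with_logits(ans_logits,
                                                  has_answer.float())

    # Level 2: span extraction, masked so unanswerable questions
    # contribute nothing to the start/end terms.
    mask = has_answer.float()
    start_loss = F.cross_entropy(start_logits, start_pos, reduction="none")
    end_loss = F.cross_entropy(end_logits, end_pos, reduction="none")
    span_loss = ((start_loss + end_loss) * mask).sum() / mask.sum().clamp(min=1)

    return alpha * cls_loss + (1 - alpha) * span_loss
```

Zeroing the span terms on unanswerable questions is one standard way to keep a model from forcing a span when none exists, which matches the abstract's stated goal of improving performance on questions with no answer in the text.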
Related papers
- Which questions should I answer? Salience Prediction of Inquisitive Questions [118.097974193544]
We show that highly salient questions are empirically more likely to be answered in the same article.
We further validate our findings by showing that answering salient questions is an indicator of summarization quality in news.
arXiv Detail & Related papers (2024-04-16T21:33:05Z)
- AGent: A Novel Pipeline for Automatically Creating Unanswerable Questions [10.272000561545331]
We propose AGent, a novel pipeline that creates new unanswerable questions by re-matching a question with a context that lacks the necessary information for a correct answer.
In this paper, we demonstrate the usefulness of this AGent pipeline by creating two sets of unanswerable questions from answerable questions in SQuAD and HotpotQA.
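As a toy illustration of the re-matching idea, one could pair a question with a topically similar context that does not contain its gold answer. The helper below is a hypothetical sketch: TF-IDF similarity stands in for whatever matching and filtering the actual AGent pipeline uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rematch_unanswerable(question, gold_answer, contexts):
    """Pair `question` with the most similar context that lacks its
    gold answer, yielding a plausible-looking unanswerable example.
    The real AGent pipeline applies stronger checks than this sketch."""
    vec = TfidfVectorizer().fit(contexts + [question])
    sims = cosine_similarity(vec.transform([question]),
                             vec.transform(contexts))[0]
    # Walk contexts from most to least similar, skipping answer leaks.
    for idx in sims.argsort()[::-1]:
        if gold_answer.lower() not in contexts[idx].lower():
            return question, contexts[idx]
    return None  # every candidate context leaked the answer
```

A verification pass with a QA model would normally follow, since lexical overlap alone cannot guarantee that the chosen context truly lacks the information needed to answer.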
arXiv Detail & Related papers (2023-09-10T18:13:11Z)
- Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions [95.92276099234344]
We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia.
Our method improves performance by 15% on recall measures and 10% on measures which evaluate disambiguating questions from predicted outputs.
arXiv Detail & Related papers (2023-08-16T20:23:16Z)
- Keeping the Questions Conversational: Using Structured Representations to Resolve Dependency in Conversational Question Answering [26.997542897342164]
We propose a novel framework, CONVSR (CONVQA using Structured Representations) for capturing and generating intermediate representations as conversational cues.
We test our model on the QuAC and CANARD datasets and show experimentally that our proposed framework achieves a better F1 score than the standard question rewriting model.
arXiv Detail & Related papers (2023-04-14T13:42:32Z)
- Conversational QA Dataset Generation with Answer Revision [2.5838973036257458]
We introduce a novel framework that extracts question-worthy phrases from a passage and then generates corresponding questions considering previous conversations.
Our framework revises the extracted answers after generating questions so that answers exactly match paired questions.
arXiv Detail & Related papers (2022-09-23T04:05:38Z)
- Discourse Comprehension: A Question Answering Framework to Represent Sentence Connections [35.005593397252746]
A key challenge in building and evaluating models for discourse comprehension is the lack of annotated data.
This paper presents a novel paradigm that enables scalable data collection targeting the comprehension of news documents.
The resulting corpus, DCQA, consists of 22,430 question-answer pairs across 607 English documents.
arXiv Detail & Related papers (2021-11-01T04:50:26Z)
- ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers [93.55268936974971]
We describe a Question Answering dataset that contains complex questions with conditional answers.
We call this dataset ConditionalQA.
We show that ConditionalQA is challenging for many of the existing QA models, especially in selecting answer conditions.
arXiv Detail & Related papers (2021-10-13T17:16:46Z)
- A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers [66.11048565324468]
We present a dataset of 5,049 questions over 1,585 Natural Language Processing papers.
Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text.
We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers.
arXiv Detail & Related papers (2021-05-07T00:12:34Z)
- Towards Data Distillation for End-to-end Spoken Conversational Question Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims to enable QA systems to model complex dialogue flows given speech utterances and text corpora.
Our main objective is to build a QA system to deal with conversational questions both in spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
- Question Rewriting for Conversational Question Answering [15.355557454305776]
We introduce a conversational QA architecture that sets the new state of the art on the TREC CAsT 2019 passage retrieval dataset.
We show that the same QR model improves QA performance on the QuAC dataset with respect to answer span extraction.
Our evaluation results indicate that the QR model achieves near human-level performance on both datasets.
arXiv Detail & Related papers (2020-04-30T09:27:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.