Question-aware Transformer Models for Consumer Health Question
Summarization
- URL: http://arxiv.org/abs/2106.00219v1
- Date: Tue, 1 Jun 2021 04:21:31 GMT
- Title: Question-aware Transformer Models for Consumer Health Question
Summarization
- Authors: Shweta Yadav, Deepak Gupta, Asma Ben Abacha and Dina Demner-Fushman
- Abstract summary: We develop an abstractive question summarization model that leverages the semantic interpretation of a question via recognition of medical entities.
When evaluated on the MeQSum benchmark corpus, our framework outperformed the state-of-the-art method by 10.2 ROUGE-L points.
- Score: 20.342580435464072
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Searching for health information online is becoming customary for more and
more consumers every day, which makes the need for efficient and reliable
question answering systems more pressing. An important contributor to the
success rates of these systems is their ability to fully understand the
consumers' questions. However, these questions are frequently longer than
needed and mention peripheral information that is not useful in finding
relevant answers. Question summarization is one of the potential solutions to
simplifying long and complex consumer questions before attempting to find an
answer. In this paper, we study the task of abstractive summarization for
real-world consumer health questions. We develop an abstractive question
summarization model that leverages the semantic interpretation of a question
via recognition of medical entities, which enables the generation of
informative summaries. To this end, we propose multiple Cloze tasks (i.e., the
task of filling in missing words in a given context) to identify the key medical
entities, which forces the model to achieve better coverage in question-focus
recognition. Additionally, we infuse the decoder inputs with question-type
information to generate question-type driven summaries. When evaluated on the
MeQSum benchmark corpus, our framework outperformed the state-of-the-art method
by 10.2 ROUGE-L points. We also conducted a manual evaluation to assess the
correctness of the generated summaries.
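The two ideas in the abstract — masking medical entities so the model must recover them (the Cloze tasks), and prefixing the decoder input with a question-type tag — can be illustrated with a minimal sketch. This is not the authors' implementation: the entity set, mask token, and tag format below are hypothetical stand-ins; in the paper the entities come from a medical entity recognizer.

```python
import re

# Hypothetical entity inventory; the paper derives entities from a
# medical entity recognizer rather than a fixed list.
MEDICAL_ENTITIES = {"metformin", "diabetes", "insulin"}
MASK = "[MASK]"

def make_cloze_example(question: str):
    """Replace each recognized medical entity with a mask token.

    The model is trained to fill the masks back in, which pushes it to
    attend to the key entities that define the question focus.
    """
    masked, targets = [], []
    for tok in question.split():
        word = re.sub(r"\W", "", tok).lower()  # strip punctuation for lookup
        if word in MEDICAL_ENTITIES:
            masked.append(MASK)
            targets.append(word)
        else:
            masked.append(tok)
    return " ".join(masked), targets

def add_question_type(decoder_input: str, q_type: str) -> str:
    """Prefix the decoder input with a question-type tag (e.g. <treatment>)
    so generation is conditioned on the question type."""
    return f"<{q_type}> {decoder_input}"

question = "I was prescribed metformin for my diabetes; is it safe with insulin?"
cloze_input, cloze_targets = make_cloze_example(question)
conditioned = add_question_type(cloze_input, "treatment")
```

Here `cloze_targets` holds the entities the model must predict, and `conditioned` shows a question-type-driven decoder input; in practice the tag vocabulary would match the question types annotated in the training corpus.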
Related papers
- Aspect-oriented Consumer Health Answer Summarization [2.298110639419913]
Community Question-Answering (CQA) forums have revolutionized how people seek information, especially those related to their healthcare needs.
There can be several answers in response to a single query, which makes it hard to grasp the key information related to the specific health concern.
Our research focuses on aspect-based summarization of health answers to address this limitation.
arXiv Detail & Related papers (2024-05-10T07:52:43Z)
- InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification [60.10193972862099]
This work proposes a framework to characterize and recover simplification-induced information loss in form of question-and-answer pairs.
QA pairs are designed to help readers deepen their knowledge of a text.
arXiv Detail & Related papers (2024-01-29T19:00:01Z)
- Medical Question Summarization with Entity-driven Contrastive Learning [12.008269098530386]
This paper proposes a novel medical question summarization framework using entity-driven contrastive learning (ECL).
ECL employs medical entities in frequently asked questions (FAQs) as focuses and devises an effective mechanism to generate hard negative samples.
We find that some MQA datasets suffer from serious data leakage problems, such as the iCliniq dataset's 33% duplicate rate.
arXiv Detail & Related papers (2023-04-15T00:19:03Z)
- Medical Question Understanding and Answering with Knowledge Grounding and Semantic Self-Supervision [53.692793122749414]
We introduce a medical question understanding and answering system with knowledge grounding and semantic self-supervision.
Our system is a pipeline that first summarizes a long, medical, user-written question using a supervised summarization loss.
It then matches the summarized question against an FAQ from a trusted medical knowledge base, and retrieves a fixed number of relevant sentences from the corresponding answer document.
arXiv Detail & Related papers (2022-09-30T08:20:32Z)
- CHQ-Summ: A Dataset for Consumer Healthcare Question Summarization [21.331145794496774]
We introduce a new dataset, CHQ-Summ, that contains 1507 domain-expert annotated consumer health questions and corresponding summaries.
The dataset is derived from the community question-answering forum.
We benchmark the dataset on multiple state-of-the-art summarization models to show the effectiveness of the dataset.
arXiv Detail & Related papers (2022-06-14T03:49:03Z)
- AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization [73.91543616777064]
Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions.
One goal of answer summarization is to produce a summary that reflects the range of answer perspectives.
This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists.
arXiv Detail & Related papers (2021-11-11T21:48:02Z)
- Reinforcement Learning for Abstractive Question Summarization with Question-aware Semantic Rewards [20.342580435464072]
We introduce a reinforcement learning-based framework for abstractive question summarization.
We propose two novel rewards obtained from the downstream tasks of (i) question-type identification and (ii) question-focus recognition.
These rewards ensure the generation of semantically valid questions and encourage the inclusion of key medical entities/foci in the question summary.
arXiv Detail & Related papers (2021-07-01T02:06:46Z)
- A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers [66.11048565324468]
We present a dataset of 5,049 questions over 1,585 Natural Language Processing papers.
Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text.
We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers.
arXiv Detail & Related papers (2021-05-07T00:12:34Z)
- What Makes a Good Summary? Reconsidering the Focus of Automatic Summarization [49.600619575148706]
We find that the current focus of the field does not fully align with participants' wishes.
Based on our findings, we argue that it is important to adopt a broader perspective on automatic summarization.
arXiv Detail & Related papers (2020-12-14T15:12:35Z)
- Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering [89.76059961309453]
The HeadQA dataset contains multiple-choice questions drawn from the public healthcare specialization exam.
These questions are among the most challenging for current QA systems.
We present a Multi-step reasoning with Knowledge extraction framework (MurKe), which strives to make full use of off-the-shelf pre-trained models.
arXiv Detail & Related papers (2020-08-06T02:47:46Z)
- Question-Driven Summarization of Answers to Consumer Health Questions [17.732729654047983]
We present the MEDIQA Answer Summarization dataset.
This dataset is the first summarization collection containing question-driven summaries of answers to consumer health questions.
We include results of baseline and state-of-the-art deep learning summarization models.
arXiv Detail & Related papers (2020-05-18T20:36:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.