Nutribullets Hybrid: Multi-document Health Summarization
- URL: http://arxiv.org/abs/2104.03465v1
- Date: Thu, 8 Apr 2021 01:44:29 GMT
- Title: Nutribullets Hybrid: Multi-document Health Summarization
- Authors: Darsh J Shah, Lili Yu, Tao Lei and Regina Barzilay
- Abstract summary: We present a method for generating comparative summaries that highlights similarities and contradictions in input documents.
Our framework leads to more faithful, relevant and aggregation-sensitive summarization -- while being equally fluent.
- Score: 36.95954983680022
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a method for generating comparative summaries that highlights
similarities and contradictions in input documents. The key challenge in
creating such summaries is the lack of large parallel training data required
for training typical summarization systems. To this end, we introduce a hybrid
generation approach inspired by traditional concept-to-text systems. To enable
accurate comparison between different sources, the model first learns to
extract pertinent relations from input documents. The content planning
component uses deterministic operators to aggregate these relations after
identifying a subset for inclusion into a summary. The surface realization
component lexicalizes this information using a text-infilling language model.
By separately modeling content selection and realization, we can effectively
train them with limited annotations. We implemented and tested the model in the
domain of nutrition and health -- rife with inconsistencies. Compared to
conventional methods, our framework leads to more faithful, relevant and
aggregation-sensitive summarization -- while being equally fluent.
Related papers
- Personalized Video Summarization using Text-Based Queries and Conditional Modeling [3.4447129363520337]
This thesis explores enhancing video summarization by integrating text-based queries and conditional modeling.
Evaluation metrics such as accuracy and F1-score assess the quality of the generated summaries.
arXiv Detail & Related papers (2024-08-27T02:43:40Z) - Annotator in the Loop: A Case Study of In-Depth Rater Engagement to Create a Bridging Benchmark Dataset [1.825224193230824]
We describe a novel, collaborative, and iterative annotator-in-the-loop methodology for annotation.
Our findings indicate that collaborative engagement with annotators can enhance annotation methods.
arXiv Detail & Related papers (2024-08-01T19:11:08Z) - LBMT team at VLSP2022-Abmusu: Hybrid method with text correlation and
generative models for Vietnamese multi-document summarization [1.4716144941085147]
This paper proposes a method for multi-document summarization based on cluster similarity.
After generating summaries by selecting the most important sentences from each cluster, we apply BARTpho and ViT5 to construct the abstractive models.
arXiv Detail & Related papers (2023-04-11T13:15:24Z) - Beyond Contrastive Learning: A Variational Generative Model for
Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z) - Masked Summarization to Generate Factually Inconsistent Summaries for
Improved Factual Consistency Checking [28.66287193703365]
We propose to generate factually inconsistent summaries using source texts and reference summaries with key information masked.
Experiments on seven benchmark datasets demonstrate that factual consistency classifiers trained on summaries generated using our method generally outperform existing models.
arXiv Detail & Related papers (2022-05-04T12:48:49Z) - Relation Clustering in Narrative Knowledge Graphs [71.98234178455398]
relational sentences in the original text are embedded (with SBERT) and clustered in order to merge together semantically similar relations.
Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable preprocessing for semi-supervised approaches.
arXiv Detail & Related papers (2020-11-27T10:43:04Z) - Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z) - Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1)
arXiv Detail & Related papers (2020-04-19T08:27:57Z) - Learning to Select Bi-Aspect Information for Document-Scale Text Content
Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.