EmailSum: Abstractive Email Thread Summarization
- URL: http://arxiv.org/abs/2107.14691v1
- Date: Fri, 30 Jul 2021 15:13:14 GMT
- Title: EmailSum: Abstractive Email Thread Summarization
- Authors: Shiyue Zhang, Asli Celikyilmaz, Jianfeng Gao, Mohit Bansal
- Abstract summary: We develop an abstractive Email Thread Summarization (EmailSum) dataset.
This dataset contains human-annotated short (<30 words) and long (<100 words) summaries of 2549 email threads.
Our results reveal the key challenges of current abstractive summarization models in this task.
- Score: 105.46012304024312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have brought about an interest in the challenging task of
summarizing conversation threads (meetings, online discussions, etc.). Such
summaries help readers of long threads quickly catch up on the decisions made
and thus improve work and communication efficiency. To spur research in
thread summarization, we have developed an abstractive Email Thread
Summarization (EmailSum) dataset, which contains human-annotated short (<30
words) and long (<100 words) summaries of 2549 email threads (each containing 3
to 10 emails) over a wide variety of topics. We perform a comprehensive
empirical study to explore different summarization techniques (including
extractive and abstractive methods, single-document and hierarchical models, as
well as transfer and semi-supervised learning) and conduct human evaluations on
both short and long summary generation tasks. Our results reveal the key
challenges of current abstractive summarization models in this task, such as
understanding the sender's intent and identifying the roles of sender and
receiver. Furthermore, we find that widely used automatic evaluation metrics
(ROUGE, BERTScore) are weakly correlated with human judgments on this email
thread summarization task. Hence, we emphasize the importance of human
evaluation and the development of better metrics by the community. Our code and
summary data have been made available at:
https://github.com/ZhangShiyue/EmailSum
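As a concrete illustration of the single-document baselines mentioned in the abstract, here is a minimal sketch (not the authors' code): the thread is flattened into one input sequence and summarized with an off-the-shelf pretrained seq2seq model. The example emails and the t5-base checkpoint are illustrative assumptions; the paper's own models are fine-tuned on EmailSum.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

# Hypothetical thread of 3 emails (EmailSum threads contain 3 to 10).
thread = [
    "From: Alice | Subject: Friday meeting | Can we move the meeting to 3pm?",
    "From: Bob | 3pm works for me.",
    "From: Carol | Fine with me too; I'll update the invite.",
]

# A single-document model sees the thread as one flattened document.
inputs = tokenizer("summarize: " + " ".join(thread),
                   return_tensors="pt", truncation=True, max_length=512)

# Cap generation length to target a short (<30 words) summary.
summary_ids = model.generate(**inputs, max_length=40, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```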
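The abstract's metric-correlation finding can likewise be probed with standard tooling. The sketch below, using hypothetical summaries and human ratings, scores candidates with ROUGE-L and BERTScore and computes Spearman correlations against the ratings; it is an assumed setup for illustration, not the paper's exact evaluation protocol.

```python
from rouge_score import rouge_scorer
from bert_score import score as bert_score
from scipy.stats import spearmanr

# Hypothetical (candidate summary, reference summary, human rating on 1-5).
examples = [
    ("Alice asks the team to confirm the meeting time.",
     "Alice emails the team to confirm Friday's meeting time.", 4),
    ("The thread discusses a budget.",
     "Bob proposes a revised budget and Carol approves it.", 2),
    ("Bob will send the report after Dana reviews the draft.",
     "Bob agrees to send the final report once Dana reviews it.", 5),
]
cands = [c for c, _, _ in examples]
refs = [r for _, r, _ in examples]
human = [h for _, _, h in examples]

# Per-example ROUGE-L F1.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = [scorer.score(ref, cand)["rougeL"].fmeasure
           for cand, ref in zip(cands, refs)]

# Per-example BERTScore F1 (downloads a pretrained model on first use).
_, _, bert_f1 = bert_score(cands, refs, lang="en", verbose=False)

# Rank correlation between each metric and the human judgments; the paper
# reports that such correlations are weak on this task.
print("ROUGE-L   vs human:", spearmanr(rouge_l, human).correlation)
print("BERTScore vs human:", spearmanr(bert_f1.tolist(), human).correlation)
```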
Related papers
- On Context Utilization in Summarization with Large Language Models [83.84459732796302]
Large language models (LLMs) excel in abstractive summarization tasks, delivering fluent and pertinent summaries.
Recent advancements have extended their capabilities to handle long-input contexts, exceeding 100k tokens.
We conduct the first comprehensive study on context utilization and position bias in summarization.
arXiv Detail & Related papers (2023-10-16T16:45:12Z)
- Summarization with Graphical Elements [55.5913491389047]
We propose a new task: summarization with graphical elements.
We collect a high quality human labeled dataset to support research into the task.
arXiv Detail & Related papers (2022-04-15T17:16:41Z)
- Automatic Text Summarization Methods: A Comprehensive Review [1.6114012813668934]
This study provides a detailed analysis of text summarization concepts such as summarization approaches, techniques used, standard datasets, evaluation metrics and future scopes for research.
arXiv Detail & Related papers (2022-03-03T10:45:00Z)
- AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization [73.91543616777064]
Community Question Answering (CQA) forums such as Stack Overflow and Yahoo! Answers are a rich resource of answers to a wide range of community-based questions.
One goal of answer summarization is to produce a summary that reflects the range of answer perspectives.
This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists.
arXiv Detail & Related papers (2021-11-11T21:48:02Z)
- Neural Abstractive Unsupervised Summarization of Online News Discussions [1.2617078020344619]
We introduce a novel method that generates abstractive summaries of online news discussions.
Our model is evaluated using ROUGE scores between the generated summary and each comment on the thread.
arXiv Detail & Related papers (2021-06-07T20:33:51Z)
- Summaformers @ LaySumm 20, LongSumm 20 [14.44754831438127]
In this paper, we look at the problem of summarizing scientific research papers from multiple domains.
We differentiate between two types of summaries, namely, LaySumm and LongSumm.
While leveraging the latest Transformer-based models, our systems are simple, intuitive, and based on how specific paper sections contribute to human summaries.
arXiv Detail & Related papers (2021-01-10T13:48:12Z)
- Abstractive Summarization of Spoken and Written Instructions with BERT [66.14755043607776]
We present the first application of the BERTSum model to conversational language.
We generate abstractive summaries of narrated instructional videos across a wide variety of topics.
We envision this integrated as a feature in intelligent virtual assistants, enabling them to summarize both written and spoken instructional content upon request.
arXiv Detail & Related papers (2020-08-21T20:59:34Z)
- From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information [77.89755281215079]
Text summarization is the research area that aims to create a short, condensed version of an original document.
In real-world applications, most data is not in plain text format.
This paper surveys these new summarization tasks and approaches in real-world applications.
arXiv Detail & Related papers (2020-05-10T14:59:36Z)
- Intweetive Text Summarization [1.1565654851982567]
We propose to automatically generate summaries of micro-blog conversations dealing with public figures' e-reputation.
These summaries are generated from keyword queries or sample tweets and offer a focused view of the whole micro-blog network.
arXiv Detail & Related papers (2020-01-16T08:38:40Z)