DialSumm: A Real-Life Scenario Dialogue Summarization Dataset
- URL: http://arxiv.org/abs/2105.06762v1
- Date: Fri, 14 May 2021 11:12:40 GMT
- Title: DialSumm: A Real-Life Scenario Dialogue Summarization Dataset
- Authors: Yulong Chen, Yang Liu, Liang Chen and Yue Zhang
- Abstract summary: We propose DialSumm, a large-scale labeled dialogue summarization dataset.
We conduct empirical analysis on DialSumm using state-of-the-art neural summarizers.
- Score: 16.799104478351914
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proposal of large-scale datasets has facilitated research on deep neural
models for news summarization. Deep learning can also be potentially useful for
spoken dialogue summarization, which can benefit a range of real-life scenarios
including customer service management and medication tracking. To this end, we
propose DialSumm, a large-scale labeled dialogue summarization dataset. We
conduct empirical analysis on DialSumm using state-of-the-art neural
summarizers. Experimental results show unique challenges in dialogue
summarization, such as spoken terms, special discourse structures, coreference
and ellipsis, pragmatics, and social commonsense, which call for dedicated
representation learning techniques.
Related papers
- Neural-Bayesian Program Learning for Few-shot Dialogue Intent Parsing [14.90367428035125]
We propose a novel Neural-Bayesian Program Learning model named Dialogue-Intent Parser (DI-Parser).
DI-Parser specializes in intent parsing under data-hungry settings and offers promising performance improvements.
Experimental results demonstrate that DI-Parser outperforms state-of-the-art deep learning models and offers practical advantages for industrial-scale applications.
arXiv Detail & Related papers (2024-10-08T16:54:00Z)
- Increasing faithfulness in human-human dialog summarization with Spoken Language Understanding tasks [0.0]
We propose an exploration of how incorporating task-related information can enhance the summarization process.
Results show that integrating models with task-related information improves summary accuracy, even with varying word error rates.
arXiv Detail & Related papers (2024-09-16T08:15:35Z)
- FREDSum: A Dialogue Summarization Corpus for French Political Debates [26.76383031532945]
We present a dataset of French political debates for the purpose of enhancing resources for multi-lingual dialogue summarization.
Our dataset consists of manually transcribed and annotated political debates, covering a range of topics and perspectives.
arXiv Detail & Related papers (2023-12-08T05:42:04Z)
- AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets [56.052803235932686]
We propose a novel automatic dataset synthesis approach that can generate both large-scale and high-quality recommendation dialogues.
In doing so, we exploit: (i) rich personalized user profiles from traditional recommendation datasets, (ii) rich external knowledge from knowledge graphs, and (iii) the conversation ability contained in human-to-human conversational recommendation datasets.
arXiv Detail & Related papers (2023-06-16T05:27:14Z)
- DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization [127.714919036388]
DIONYSUS is a pre-trained encoder-decoder model for summarizing dialogues in any new domain.
Our experiments show that DIONYSUS outperforms existing methods on six datasets.
arXiv Detail & Related papers (2022-12-20T06:21:21Z)
- Taxonomy of Abstractive Dialogue Summarization: Scenarios, Approaches and Future Directions [14.85592662663867]
This survey provides a comprehensive investigation on existing work for abstractive dialogue summarization from scenarios.
It categorizes the task into two broad categories according to the type of input dialogues, i.e., open-domain and task-oriented.
It presents a taxonomy of existing techniques in three directions, namely, injecting dialogue features, designing auxiliary training tasks and using additional data.
arXiv Detail & Related papers (2022-10-18T14:33:03Z)
- HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data [87.67278915655712]
We present a new dialogue dataset, HybriDialogue, which consists of crowdsourced natural conversations grounded on both Wikipedia text and tables.
The conversations are created through the decomposition of complex multihop questions into simple, realistic multiturn dialogue interactions.
arXiv Detail & Related papers (2022-04-28T00:52:16Z)
- Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z)
- "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations.
We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling.
Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z)
- Topic-Oriented Spoken Dialogue Summarization for Customer Service with Saliency-Aware Topic Modeling [61.67321200994117]
In a customer service system, dialogue summarization can boost service efficiency by creating summaries for long spoken dialogues.
In this work, we focus on topic-oriented dialogue summarization, which generates highly abstractive summaries.
We propose a novel topic-augmented two-stage dialogue summarizer (TDS) jointly with a saliency-aware neural topic model (SATM) for topic-oriented summarization of customer service dialogues.
arXiv Detail & Related papers (2020-12-14T07:50:25Z)
- Incorporating Commonsense Knowledge into Abstractive Dialogue Summarization via Heterogeneous Graph Networks [34.958271247099]
We present a novel multi-speaker dialogue summarizer to demonstrate how large-scale commonsense knowledge can facilitate dialogue understanding and summary generation.
We consider utterances and commonsense knowledge as two different types of data and design a Dialogue Heterogeneous Graph Network (D-HGN) for modeling both types of information.
arXiv Detail & Related papers (2020-10-20T05:44:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.