How Domain Terminology Affects Meeting Summarization Performance
- URL: http://arxiv.org/abs/2011.00692v2
- Date: Mon, 9 Nov 2020 01:34:23 GMT
- Title: How Domain Terminology Affects Meeting Summarization Performance
- Authors: Jia Jin Koay, Alexander Roustai, Xiaojin Dai, Dillon Burns, Alec Kerrigan, Fei Liu
- Abstract summary: We create gold-standard annotations for domain terminology on a sizable meeting corpus.
We analyze the performance of a meeting summarization system with and without jargon terms.
- Score: 61.12624289478716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meetings are essential to modern organizations. Numerous meetings are held
and recorded daily, more than can ever be comprehended. A meeting summarization
system that identifies salient utterances from the transcripts to automatically
generate meeting minutes can help. It empowers users to rapidly search and sift
through large meeting collections. To date, the impact of domain terminology on
the performance of meeting summarization remains understudied, even though
meetings are rich with domain knowledge. In this paper, we create gold-standard
annotations for domain terminology on a sizable meeting corpus; they are known
as jargon terms. We then analyze the performance of a meeting summarization
system with and without jargon terms. Our findings reveal that domain
terminology can have a substantial impact on summarization performance. We
publicly release all domain terminology to advance research in meeting
summarization.
Related papers
- Investigating Consistency in Query-Based Meeting Summarization: A Comparative Study of Different Embedding Methods [0.0]
Text summarization is one of the best-known applications in natural language processing (NLP).
It aims to automatically generate a summary that captures the important information in a given context.
Our work is inspired by "QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization" proposed by Microsoft.
We also propose a Locater model that extracts relevant spans from a given transcript and query; these spans are then summarized by a Summarizer model.
arXiv Detail & Related papers (2024-02-10T08:25:30Z)
- Improving Query-Focused Meeting Summarization with Query-Relevant Knowledge [71.14873115781366]
We propose a knowledge-enhanced two-stage framework called Knowledge-Aware Summarizer (KAS) to tackle the challenges.
In the first stage, we introduce knowledge-aware scores to improve the query-relevant segment extraction.
In the second stage, we incorporate query-relevant knowledge in the summary generation.
arXiv Detail & Related papers (2023-09-05T10:26:02Z)
- Summaries, Highlights, and Action items: Design, implementation and evaluation of an LLM-powered meeting recap system [30.35387091657807]
Large language models (LLMs) for dialog summarization have the potential to improve the experience of meetings.
Despite this potential, they face technological limitations due to long transcripts and an inability to capture diverse recap needs based on the user's context.
We develop a system to operationalize the representations with dialogue summarization as its building blocks.
arXiv Detail & Related papers (2023-07-28T20:25:11Z)
- MeetingBank: A Benchmark Dataset for Meeting Summarization [37.761684754365945]
In this paper, we present MeetingBank, a new benchmark dataset of city council meetings over the past decade.
We make the collection, including meeting video links, transcripts, reference summaries, agenda, and other metadata, publicly available to facilitate the development of better meeting summarization techniques.
arXiv Detail & Related papers (2023-05-27T17:09:25Z)
- MUG: A General Meeting Understanding and Generation Benchmark [60.09540662936726]
We build the AliMeeting4MUG Corpus, which consists of 654 recorded Mandarin meeting sessions with diverse topic coverage.
In this paper, we provide a detailed introduction of this corpus, SLP tasks and evaluation methods, baseline systems and their performance.
arXiv Detail & Related papers (2023-03-24T11:52:25Z)
- Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG) [60.09540662936726]
MUG comprises five tracks: topic segmentation, topic-level and session-level extractive summarization, topic title generation, keyphrase extraction, and action item detection.
To facilitate MUG, we construct and release a large-scale meeting dataset, the AliMeeting4MUG Corpus.
arXiv Detail & Related papers (2023-03-24T11:42:19Z)
- A Sliding-Window Approach to Automatic Creation of Meeting Minutes [66.39584679676817]
Meeting minutes record any subject matters discussed, decisions reached and actions taken at meetings.
We present a sliding window approach to automatic generation of meeting minutes.
It aims to tackle issues associated with the nature of spoken text, including lengthy transcripts and lack of document structure.
arXiv Detail & Related papers (2021-04-26T02:44:14Z)
- QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization [45.83402681068943]
QMSum consists of 1,808 query-summary pairs over 232 meetings in multiple domains.
We investigate a locate-then-summarize method and evaluate a set of strong summarization baselines on the task.
arXiv Detail & Related papers (2021-04-13T05:00:35Z)
- A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining [52.11221075687124]
We propose a novel abstractive summary network that adapts to the meeting scenario.
We design a hierarchical structure to accommodate long meeting transcripts and a role vector to depict the difference among speakers.
Our model outperforms previous approaches in both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-04-04T21:00:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.