MUG: A General Meeting Understanding and Generation Benchmark
- URL: http://arxiv.org/abs/2303.13939v2
- Date: Mon, 27 Mar 2023 03:51:52 GMT
- Title: MUG: A General Meeting Understanding and Generation Benchmark
- Authors: Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang,
Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao
- Abstract summary: We build the AliMeeting4MUG Corpus, which consists of 654 recorded Mandarin meeting sessions with diverse topic coverage.
In this paper, we provide a detailed introduction of this corpus, SLP tasks and evaluation methods, baseline systems and their performance.
- Score: 60.09540662936726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Listening to long video/audio recordings from video conferencing and online
courses for acquiring information is extremely inefficient. Even after ASR
systems transcribe recordings into long-form spoken language documents, reading
ASR transcripts only partly speeds up seeking information. It has been observed
that a range of NLP applications, such as keyphrase extraction, topic
segmentation, and summarization, significantly improve users' efficiency in
grasping important information. The meeting scenario is among the most valuable
scenarios for deploying these spoken language processing (SLP) capabilities.
However, the lack of large-scale public meeting datasets annotated for these
SLP tasks severely hinders their advancement. To promote SLP advancement, we
establish a large-scale general Meeting Understanding and Generation Benchmark
(MUG) to benchmark the performance of a wide range of SLP tasks, including
topic segmentation, topic-level and session-level extractive summarization and
topic title generation, keyphrase extraction, and action item detection. To
facilitate the MUG benchmark, we construct and release a large-scale meeting
dataset for comprehensive long-form SLP development, the AliMeeting4MUG Corpus,
which consists of 654 recorded Mandarin meeting sessions with diverse topic
coverage, with manual annotations for SLP tasks on manual transcripts of
meeting recordings. To the best of our knowledge, the AliMeeting4MUG Corpus is
so far the largest meeting corpus in scale and facilitates most SLP tasks. In
this paper, we provide a detailed introduction of this corpus, SLP tasks and
evaluation methods, baseline systems and their performance.
Related papers
- An End-to-End Speech Summarization Using Large Language Model [7.562198375754054]
Speech Summarization (SSum) aims to generate human-like text summaries from spoken content.
Research on large language models (LLMs) and multimodal information fusion has provided new insights.
We propose an end-to-end SSum model that utilizes Q-Former as a connector for the audio-text modality.
arXiv Detail & Related papers (2024-07-02T07:22:57Z)
- TreeSeg: Hierarchical Topic Segmentation of Large Transcripts [0.0]
We present TreeSeg, an approach that combines off-the-shelf embedding models with divisive clustering, to generate hierarchical, structured segmentations of transcripts in the form of binary trees.
We evaluate TreeSeg on the ICSI and AMI corpora, demonstrating that it outperforms all baselines.
Finally, we introduce TinyRec, a small-scale corpus of manually annotated transcripts, obtained from self-recorded video sessions.
arXiv Detail & Related papers (2024-06-28T23:49:26Z)
- DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding [51.32965203977845]
We propose the use of discrete speech units (DSU) instead of continuous-valued speech encoder outputs.
The proposed model shows robust performance on speech inputs from seen/unseen domains and instruction-following capability in spoken question answering.
Our findings suggest that the ASR task and datasets are not crucial in instruction-tuning for spoken question answering tasks.
arXiv Detail & Related papers (2024-06-13T17:28:13Z)
- Investigating Consistency in Query-Based Meeting Summarization: A Comparative Study of Different Embedding Methods [0.0]
Text summarization is one of the best-known applications in the field of Natural Language Processing (NLP).
It aims to automatically generate a summary containing the important information in a given context.
In this paper, we are inspired by "QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization" proposed by Microsoft.
We also propose our Locater model, designed to extract relevant spans from a given transcript and query; these spans are then summarized by a Summarizer model.
arXiv Detail & Related papers (2024-02-10T08:25:30Z)
- Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG) [60.09540662936726]
MUG comprises five tracks: topic segmentation, topic-level and session-level extractive summarization, topic title generation, keyphrase extraction, and action item detection.
To facilitate MUG, we construct and release a large-scale meeting dataset, the AliMeeting4MUG Corpus.
arXiv Detail & Related papers (2023-03-24T11:42:19Z)
- SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks [88.4408774253634]
Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community.
There are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers.
Recent work has begun to introduce such benchmarks for several tasks.
arXiv Detail & Related papers (2022-12-20T18:39:59Z)
- WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing [102.45426364965887]
We propose a new pre-trained model, WavLM, to solve full-stack downstream speech tasks.
WavLM is built based on the HuBERT framework, with an emphasis on both spoken content modeling and speaker identity preservation.
We scale up the training dataset from 60k hours to 94k hours of public audio data, and optimize its training procedure for better representation extraction.
arXiv Detail & Related papers (2021-10-26T17:55:19Z)
- A Sliding-Window Approach to Automatic Creation of Meeting Minutes [66.39584679676817]
Meeting minutes record any subject matters discussed, decisions reached and actions taken at meetings.
We present a sliding window approach to automatic generation of meeting minutes.
It aims to tackle issues associated with the nature of spoken text, including lengthy transcripts and lack of document structure.
arXiv Detail & Related papers (2021-04-26T02:44:14Z)