Reconstruct Before Summarize: An Efficient Two-Step Framework for
Condensing and Summarizing Meeting Transcripts
- URL: http://arxiv.org/abs/2305.07988v2
- Date: Sun, 22 Oct 2023 17:42:44 GMT
- Title: Reconstruct Before Summarize: An Efficient Two-Step Framework for
Condensing and Summarizing Meeting Transcripts
- Authors: Haochen Tan, Han Wu, Wei Shao, Xinyun Zhang, Mingjie Zhan, Zhaohui
Hou, Ding Liang, Linqi Song
- Abstract summary: We propose a two-step framework, Reconstruct before Summarize (RbS), for effective and efficient meeting summarization.
RbS first leverages a self-supervised paradigm to annotate essential contents by reconstructing the meeting transcripts.
Secondly, we propose a relative positional bucketing (RPB) algorithm to equip (conventional) summarization models to generate the summary.
- Score: 32.329723001930006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meetings typically involve multiple participants and lengthy conversations,
resulting in redundant and trivial content. To overcome these challenges, we
propose a two-step framework, Reconstruct before Summarize (RbS), for effective
and efficient meeting summarization. RbS first leverages a self-supervised
paradigm to annotate essential contents by reconstructing the meeting
transcripts. Secondly, we propose a relative positional bucketing (RPB)
algorithm to equip (conventional) summarization models to generate the summary.
Despite the additional reconstruction process, our proposed RPB significantly
compresses the input, leading to faster processing and reduced memory
consumption compared to traditional summarization methods. We validate the
effectiveness and efficiency of our method through extensive evaluations and
analysis. On two meeting summarization datasets, AMI and ICSI, our approach
outperforms previous state-of-the-art approaches without relying on large-scale
pre-training or expert-grade annotating tools.
Related papers
- Recycled Attention: Efficient inference for long-context language models [54.00118604124301]
We propose Recycled Attention, an inference-time method which alternates between full context attention and attention over a subset of input tokens.
When performing partial attention, we recycle the attention pattern of a previous token that has performed full attention and attend only to the top K most attended tokens.
Compared to previously proposed inference-time acceleration methods, which attend only to local context or to tokens with high accumulative attention scores, our approach flexibly chooses tokens that are relevant to the current decoding step.
(2024-11-08T18:57:07Z)
- Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
(2024-08-21T06:48:38Z)
- Action-Item-Driven Summarization of Long Meeting Transcripts [8.430481660019451]
This paper introduces a novel and effective approach to automate the generation of meeting summaries.
Our novel algorithms can generate abstractive meeting summaries that are driven by the action items contained in the meeting transcript.
Our pipeline achieved a BERTScore of 64.98 across the AMI corpus, which is an approximately 4.98% increase from the current state-of-the-art result.
(2023-12-29T12:33:21Z)
- Aspect-based Meeting Transcript Summarization: A Two-Stage Approach with
Weak Supervision on Sentence Classification [91.13086984529706]
Aspect-based meeting transcript summarization aims to produce multiple summaries.
Traditional summarization methods produce one summary mixing information of all aspects.
We propose a two-stage method for aspect-based meeting transcript summarization.
(2023-11-07T19:06:31Z)
- Continual Contrastive Finetuning Improves Low-Resource Relation
Extraction [34.76128090845668]
Relation extraction has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE by self-supervised learning.
We propose to pretrain and finetune the RE model using consistent objectives of contrastive learning.
(2022-12-21T07:30:22Z)
- Meeting Summarization with Pre-training and Clustering Methods [6.47783315109491]
We use HMNet, a hierarchical network that employs both a word-level transformer and a turn-level transformer, as the baseline.
We extend the locate-then-summarize approach of QMSum with an intermediate clustering step.
We compare the performance of our baseline models with BART, a state-of-the-art language model that is effective for summarization.
(2021-11-16T03:14:40Z)
- Attention-based Multi-hypothesis Fusion for Speech Summarization [83.04957603852571]
Speech summarization can be achieved by combining automatic speech recognition (ASR) and text summarization (TS).
ASR errors directly affect the quality of the output summary in the cascade approach.
We propose a cascade speech summarization model that is robust to ASR errors and that exploits multiple hypotheses generated by ASR to attenuate the effect of ASR errors on the summary.
(2021-11-16T03:00:29Z)
- Unsupervised Topic Segmentation of Meetings with BERT Embeddings [57.91018542715725]
We show how previous unsupervised topic segmentation methods can be improved using pre-trained neural architectures.
We introduce an unsupervised approach based on BERT embeddings that achieves a 15.5% reduction in error rate over existing unsupervised approaches.
(2021-06-24T12:54:43Z)
- A Divide-and-Conquer Approach to the Summarization of Long Documents [4.863209463405628]
We present a novel divide-and-conquer method for the neural summarization of long documents.
Our method exploits the discourse structure of the document and uses sentence similarity to split the problem into smaller summarization problems.
We demonstrate that this approach paired with different summarization models, including sequence-to-sequence RNNs and Transformers, can lead to improved summarization performance.
(2020-04-13T20:38:49Z)
- A Hierarchical Network for Abstractive Meeting Summarization with
Cross-Domain Pretraining [52.11221075687124]
We propose a novel abstractive summary network that adapts to the meeting scenario.
We design a hierarchical structure to accommodate long meeting transcripts and a role vector to depict the difference among speakers.
Our model outperforms previous approaches in both automatic metrics and human evaluation.
(2020-04-04T21:00:41Z)