Learning to Summarize Long Texts with Memory Compression and Transfer
- URL: http://arxiv.org/abs/2010.11322v1
- Date: Wed, 21 Oct 2020 21:45:44 GMT
- Title: Learning to Summarize Long Texts with Memory Compression and Transfer
- Authors: Jaehong Park, Jonathan Pilault and Christopher Pal
- Abstract summary: We introduce Mem2Mem, a memory-to-memory mechanism for hierarchical recurrent neural network-based encoder-decoder architectures.
Our memory regularization compresses an encoded input article into a more compact set of sentence representations.
- Score: 3.5407857489235206
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We introduce Mem2Mem, a memory-to-memory mechanism for hierarchical recurrent
neural network-based encoder-decoder architectures, and we explore its use for
abstractive document summarization. Mem2Mem transfers "memories" via
readable/writable external memory modules that augment both the encoder and
decoder. Our memory regularization compresses an encoded input article into a
more compact set of sentence representations. Most importantly, the memory
compression step performs implicit extraction without labels, sidestepping
issues with suboptimal ground-truth data and exposure bias of hybrid
extractive-abstractive summarization techniques. By allowing the decoder to
read/write over the encoded input memory, the model learns to read salient
information about the input article while keeping track of what has been
generated. Our Mem2Mem approach yields results that are competitive with
state-of-the-art transformer-based summarization methods, but with 16 times fewer
parameters.
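The abstract describes two mechanisms: an attention-based compression of encoded sentence representations into a smaller memory, and a decoder that reads from and writes to that transferred memory at each step. Below is a minimal sketch of one plausible realization of these two pieces in PyTorch; the slot count, gating scheme, and module names (MemoryCompressor, MemoryReadWrite) are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only, based on the abstract above; not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MemoryCompressor(nn.Module):
    """Compress N encoded sentence vectors into K memory slots via attention."""

    def __init__(self, hidden: int, num_slots: int):
        super().__init__()
        # Learned slot queries select salient sentences without extraction labels.
        self.slot_queries = nn.Parameter(torch.randn(num_slots, hidden))

    def forward(self, sentence_reprs: torch.Tensor) -> torch.Tensor:
        # sentence_reprs: (N, hidden) sentence states from a hierarchical encoder.
        scores = self.slot_queries @ sentence_reprs.T      # (K, N)
        weights = F.softmax(scores, dim=-1)                # soft, label-free "extraction"
        return weights @ sentence_reprs                    # (K, hidden) compressed memory


class MemoryReadWrite(nn.Module):
    """Let the decoder read from and write to the transferred memory at each step."""

    def __init__(self, hidden: int):
        super().__init__()
        self.read_attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
        self.write_gate = nn.Linear(2 * hidden, hidden)

    def forward(self, dec_state: torch.Tensor, memory: torch.Tensor):
        # dec_state: (1, hidden) current decoder state; memory: (K, hidden).
        q = dec_state.unsqueeze(0)                         # (1, 1, hidden)
        kv = memory.unsqueeze(0)                           # (1, K, hidden)
        read, attn = self.read_attn(q, kv, kv)             # read salient input content
        # Gated write: nudge attended slots toward the decoder state so the memory
        # keeps track of what has already been generated (hypothetical update rule).
        gate = torch.sigmoid(self.write_gate(
            torch.cat([memory, dec_state.expand_as(memory)], dim=-1)))
        new_memory = memory + attn.squeeze(0).squeeze(0).unsqueeze(-1) * gate * dec_state
        return read.squeeze(0), new_memory


# Example usage (hypothetical sizes):
# memory = MemoryCompressor(hidden=512, num_slots=16)(sentence_reprs)   # (16, 512)
# context, memory = MemoryReadWrite(hidden=512)(dec_state, memory)
```

In a full encoder-decoder, the compressed memory produced from the encoder would be transferred to the decoder, and the updated memory returned by each decoding step would be carried forward to the next step.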
Related papers
- Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation [13.310412868082832]
Large-scale text encoders in text-to-image (T2I) diffusion models have demonstrated exceptional performance.
Despite their minimal contribution to total inference time and floating-point operations (FLOPs), text encoders demand significantly higher memory usage.
We propose Skip and Re-use layers (Skrr), a simple yet effective pruning strategy specifically designed for text encoders in T2I diffusion models.
arXiv Detail & Related papers (2025-02-12T15:03:26Z)
- On Memory Construction and Retrieval for Personalized Conversational Agents [69.46887405020186]
We propose SeCom, a method that constructs a memory bank with topical segments by introducing a conversation model, while performing memory retrieval based on compressed memory units.
Experimental results show that SeCom outperforms turn-level, session-level, and several summarization-based methods on long-term conversation benchmarks such as LOCOMO and Long-MT-Bench+.
arXiv Detail & Related papers (2025-02-08T14:28:36Z)
- BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments [53.71158537264695]
Large language models (LLMs) have revolutionized numerous applications, yet their deployment remains challenged by memory constraints on local devices.
We introduce BitStack, a novel, training-free weight compression approach that enables megabyte-level trade-offs between memory usage and model performance.
arXiv Detail & Related papers (2024-10-31T13:26:11Z)
- Context Compression for Auto-regressive Transformers with Sentinel Tokens [37.07722536907739]
We propose a plug-and-play approach that is able to incrementally compress the intermediate activation of a specified span of tokens into compact ones.
Experiments on both in-domain language modeling and zero-shot open-ended document generation demonstrate the advantage of our approach.
arXiv Detail & Related papers (2023-10-12T09:18:19Z)
- Implicit Memory Transformer for Computationally Efficient Simultaneous Speech Translation [0.20305676256390928]
We propose an Implicit Memory Transformer that implicitly retains memory through a new left context method.
Experiments on the MuST-C dataset show that the Implicit Memory Transformer provides a substantial speedup on the encoder forward pass.
arXiv Detail & Related papers (2023-07-03T22:20:21Z)
- GLIMMER: generalized late-interaction memory reranker [29.434777627686692]
Memory-augmentation is a powerful approach for incorporating external information into language models.
Recent work introduced LUMEN, a memory-retrieval hybrid that partially pre-computes memory and updates memory representations on the fly with a smaller live encoder.
We propose GLIMMER, which improves on this approach by exploiting free access to the powerful memory representations, applying a shallow reranker on top of memory to drastically improve retrieval quality at low cost.
arXiv Detail & Related papers (2023-06-17T01:54:25Z)
- Kanerva++: extending The Kanerva Machine with differentiable, locally block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z)
- Text Compression-aided Transformer Encoding [77.16960983003271]
We propose explicit and implicit text compression approaches to enhance the Transformer encoding.
Backbone information, meaning the gist of the input text, is not specifically focused on in standard Transformer encoding.
Our evaluation on benchmark datasets shows that the proposed explicit and implicit text compression approaches improve results in comparison to strong baselines.
arXiv Detail & Related papers (2021-02-11T11:28:39Z)
- Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
- MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning [128.36951818335046]
We propose a new approach called Memory-Augmented Recurrent Transformer (MART).
MART uses a memory module to augment the transformer architecture.
MART generates more coherent and less repetitive paragraph captions than baseline methods.
arXiv Detail & Related papers (2020-05-11T20:01:41Z)
- Learning Directly from Grammar Compressed Text [17.91878224879985]
We propose a method to apply neural sequence models to text data compressed with grammar compression algorithms without decompression.
To encode the unique symbols that appear in compression rules, we introduce composer modules to incrementally encode the symbols into vector representations.
arXiv Detail & Related papers (2020-02-28T06:51:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.