StreamingDialogue: Prolonged Dialogue Learning via Long Context
Compression with Minimal Losses
- URL: http://arxiv.org/abs/2403.08312v1
- Date: Wed, 13 Mar 2024 07:44:14 GMT
- Title: StreamingDialogue: Prolonged Dialogue Learning via Long Context
Compression with Minimal Losses
- Authors: Jia-Nan Li, Quan Tu, Cunli Mao, Zhengtao Yu, Ji-Rong Wen, Rui Yan
- Abstract summary: StreamingDialogue compresses long dialogue history into conv-attn sinks with minimal losses.
Our method achieves a 4$\times$ speedup while reducing memory usage by 18$\times$ compared to dense attention recomputation.
- Score: 71.97541246814818
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Standard Large Language Models (LLMs) struggle with handling dialogues with
long contexts due to efficiency and consistency issues. According to our
observation, dialogue contexts are highly structured, and the special token of
\textit{End-of-Utterance} (EoU) in dialogues has the potential to aggregate
information. We refer to the EoU tokens as ``conversational attention sinks''
(conv-attn sinks). Accordingly, we introduce StreamingDialogue, which
compresses long dialogue history into conv-attn sinks with minimal losses, and
thus reduces computational complexity to quadratic in the number of sinks
(i.e., the number of utterances) rather than in the number of tokens. Current
LLMs already demonstrate the ability to handle long context windows, e.g., a
window size of 200k tokens or more. Hence, by compressing utterances into EoUs,
our method has the potential to handle more than 200k utterances, enabling
prolonged dialogue learning. To minimize information losses from reconstruction
after compression, we design two learning strategies: short-memory
reconstruction (SMR) and long-memory reactivation (LMR). Our method outperforms
strong baselines on dialogue tasks and achieves a 4$\times$ speedup while
reducing memory usage by 18$\times$ compared to dense attention recomputation.
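To make the conv-attn-sink idea concrete, the sparse attention pattern it implies can be sketched as a mask in which each token attends to its own, still-uncompressed utterance plus the EoU positions of all earlier utterances. The sketch below is an illustrative assumption about that pattern (the function name, EoU handling, and toy example are ours), not the paper's released implementation:

```python
import torch

def conv_attn_sink_mask(input_ids: torch.Tensor, eou_id: int) -> torch.Tensor:
    """Boolean mask (True = may attend): each query token sees (a) tokens of its
    own utterance and (b) the EoU positions ("conv-attn sinks") of all earlier
    utterances, under the usual causal constraint."""
    seq_len = input_ids.size(0)
    eou = (input_ids == eou_id).long()
    # Utterance index = number of EoU tokens strictly before this position,
    # so each EoU counts as the last token of its own utterance.
    utt_id = torch.cumsum(eou, dim=0) - eou
    is_sink = input_ids == eou_id

    q_utt = utt_id.unsqueeze(1)                          # (seq, 1) query utterance
    k_utt = utt_id.unsqueeze(0)                          # (1, seq) key utterance
    causal = torch.tril(torch.ones(seq_len, seq_len)).bool()

    same_utt = q_utt == k_utt                            # local attention inside the current utterance
    past_sink = is_sink.unsqueeze(0) & (k_utt < q_utt)   # compressed history: only the sinks stay visible
    return causal & (same_utt | past_sink)

# Toy check: three 4-token utterances, EoU id = 2.
ids = torch.tensor([5, 6, 7, 2, 8, 9, 10, 2, 11, 12, 13, 2])
print(conv_attn_sink_mask(ids, eou_id=2)[10].nonzero().flatten().tolist())
# -> [3, 7, 8, 9, 10]: the two earlier sinks plus the current utterance so far
```

Per query, the number of visible keys is on the order of the current utterance length plus the number of sinks, rather than the total number of tokens, which is where the quadratic-in-sinks cost comes from.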
Related papers
- Recurrent Context Compression: Efficiently Expanding the Context Window of LLM [22.595457889113668]
This work introduces a method called Recurrent Context Compression (RCC), designed to efficiently expand the context window length of Transformer-based large language models (LLMs)
We validated our approach on multiple tasks, achieving a compression rate of up to 32x on text reconstruction tasks with a BLEU4 score close to 0.95, and nearly 100% accuracy on a passkey retrieval task with a sequence length of 1M.
arXiv Detail & Related papers (2024-06-10T08:50:59Z)
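The passkey retrieval task mentioned in the RCC entry above is a standard long-context probe: a short random code is buried in long filler text and the model must repeat it back. A rough, generic sketch (the filler sentence, prompt wording, and sizes are assumptions, not the paper's exact setup):

```python
import random

def make_passkey_prompt(n_filler_sentences: int = 50_000) -> tuple[str, str]:
    """Hide a random 5-digit passkey inside repetitive filler text."""
    passkey = f"{random.randint(0, 99_999):05d}"
    filler = "The grass is green. The sky is blue. The sun is yellow."
    chunks = [filler] * n_filler_sentences
    # Bury the needle at a random position in the haystack.
    chunks.insert(random.randrange(len(chunks)), f"The pass key is {passkey}. Remember it.")
    prompt = " ".join(chunks) + "\nWhat is the pass key? The pass key is"
    return prompt, passkey

prompt, answer = make_passkey_prompt()
# `prompt` is fed to the model under test; retrieval succeeds if its output contains `answer`.
```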
- LLoCO: Learning Long Contexts Offline [63.3458260335454]
We introduce LLoCO, a technique that combines context compression, retrieval, and parameter-efficient finetuning using LoRA.
We evaluate our approach on several long-context question-answering datasets, demonstrating that LLoCO significantly outperforms in-context learning.
arXiv Detail & Related papers (2024-04-11T17:57:22Z)
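As a reference point for the "parameter-efficient finetuning using LoRA" component of LLoCO, this is what a typical LoRA setup looks like with the Hugging Face peft library. The base checkpoint, rank, and target modules below are placeholder choices, not LLoCO's actual recipe:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder base model
config = LoraConfig(
    r=16,                                  # low-rank update dimension
    lora_alpha=32,                         # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small LoRA adapters are trainable
```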
- Evaluating Very Long-Term Conversational Memory of LLM Agents [95.84027826745609]
We introduce a machine-human pipeline to generate high-quality, very long-term dialogues.
We equip each agent with the capability of sharing and reacting to images.
The generated conversations are verified and edited by human annotators for long-range consistency.
arXiv Detail & Related papers (2024-02-27T18:42:31Z)
- Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models [75.98775135321355]
Given a long conversation, large language models (LLMs) fail to recall past information and tend to generate inconsistent responses.
We propose to generate summaries (memory) using LLMs to enhance their long-term memory ability.
arXiv Detail & Related papers (2023-08-29T04:59:53Z)
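The recursive-summarization idea above amounts to folding the conversation into a running memory string, a chunk of turns at a time, and conditioning future responses on that memory. A minimal sketch; the `llm` callable, prompt wording, and chunk size are hypothetical placeholders rather than the paper's method:

```python
from typing import Callable, List

def recursive_memory(turns: List[str], llm: Callable[[str], str], window: int = 6) -> str:
    """Fold the dialogue into a running summary ("memory"), `window` turns at a time."""
    memory = ""
    for i in range(0, len(turns), window):
        chunk = "\n".join(turns[i:i + window])
        memory = llm(
            f"Previous memory:\n{memory}\n\nNew dialogue turns:\n{chunk}\n\n"
            "Rewrite the memory so it stays faithful, concise, and up to date."
        )
    return memory

# The final memory string is prepended to the prompt when generating the next response.
```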
- Re$^3$Dial: Retrieve, Reorganize and Rescale Dialogue Corpus for Long-Turn Open-Domain Dialogue Pre-training [90.3412708846419]
Most dialogues in existing pre-training corpora contain fewer than three turns.
We propose the Retrieve, Reorganize and Rescale framework (Re$^3$Dial) to automatically construct billion-scale long-turn dialogues.
By repeating the above process, Re$^3$Dial can yield a coherent long-turn dialogue.
arXiv Detail & Related papers (2023-05-04T07:28:23Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few-Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
- Controllable Dialogue Simulation with In-Context Learning [39.04491297557292]
\textsc{Dialogic} is a dialogue simulation method based on in-context learning with large language models.
Our method can rapidly expand a small set of dialogue data with minimum or zero human involvement.
Our simulated dialogues have near-human fluency and annotation accuracy.
arXiv Detail & Related papers (2022-10-09T06:32:58Z)