Semantic Compression With Large Language Models
- URL: http://arxiv.org/abs/2304.12512v1
- Date: Tue, 25 Apr 2023 01:47:05 GMT
- Title: Semantic Compression With Large Language Models
- Authors: Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse
Spencer-Smith, Jules White
- Abstract summary: Large language models (LLMs) are revolutionizing information retrieval, question answering, summarization, and code generation tasks.
LLMs are inherently limited by the number of input and output tokens that can be processed at once.
This paper presents three contributions to research on LLMs.
- Score: 1.0874100424278175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rise of large language models (LLMs) is revolutionizing information
retrieval, question answering, summarization, and code generation tasks.
However, in addition to confidently presenting factually inaccurate information
at times (known as "hallucinations"), LLMs are also inherently limited by the
number of input and output tokens that can be processed at once, making them
potentially less effective on tasks that require processing a large set or
continuous stream of information. A common approach to reducing the size of
data is through lossless or lossy compression. Yet, in some cases it may not be
strictly necessary to perfectly recover every detail from the original data, as
long as a requisite level of semantic precision or intent is conveyed.
This paper presents three contributions to research on LLMs. First, we
present the results from experiments exploring the viability of approximate
compression using LLMs, focusing specifically on GPT-3.5 and GPT-4 via ChatGPT
interfaces. Second, we investigate and quantify the capability of LLMs to
compress text and code, as well as to recall and manipulate compressed
representations of prompts. Third, we present two novel metrics -- Exact
Reconstructive Effectiveness (ERE) and Semantic Reconstruction Effectiveness
(SRE) -- that quantify the level of preserved intent between text compressed
and decompressed by the LLMs we studied. Our initial results indicate that
GPT-4 can effectively compress and reconstruct text while preserving the
semantic essence of the original text, providing a path to leverage
roughly 5x more tokens than present limits allow.
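As a rough illustration of the round trip the abstract describes, the sketch below (not the authors' code) compresses a text through an LLM prompt, asks the model to reconstruct it, and scores the result with simple stand-ins for ERE- and SRE-style measures. The prompt wording, the call_llm placeholder, and the scoring choices are assumptions; the paper's actual protocol and metric definitions may differ.

```python
# Hedged sketch (not the authors' code): a compress -> reconstruct round trip
# through a chat LLM, plus rough stand-ins for ERE/SRE-style scoring.

from difflib import SequenceMatcher

COMPRESS_PROMPT = (
    "Compress the following text into as few tokens as possible while "
    "preserving its meaning, so that you could later reconstruct it:\n\n{text}"
)
DECOMPRESS_PROMPT = (
    "Reconstruct the original text from this compressed representation:\n\n{compressed}"
)

def call_llm(prompt: str) -> str:
    """Placeholder for a GPT-3.5/GPT-4 chat call; wire to your provider of choice."""
    raise NotImplementedError("plug in an LLM client here")

def exact_overlap(original: str, reconstructed: str) -> float:
    """Crude ERE-like proxy: character-level similarity of the two strings."""
    return SequenceMatcher(None, original, reconstructed).ratio()

def semantic_overlap(original: str, reconstructed: str) -> float:
    """Crude SRE-like proxy: cosine similarity of sentence embeddings.
    Requires `pip install sentence-transformers`."""
    from sentence_transformers import SentenceTransformer, util
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode([original, reconstructed], convert_to_tensor=True)
    return float(util.cos_sim(emb[0], emb[1]))

def round_trip(text: str) -> dict:
    compressed = call_llm(COMPRESS_PROMPT.format(text=text))
    reconstructed = call_llm(DECOMPRESS_PROMPT.format(compressed=compressed))
    return {
        "compression_ratio": len(compressed) / max(len(text), 1),
        "ere_proxy": exact_overlap(text, reconstructed),
        "sre_proxy": semantic_overlap(text, reconstructed),
    }
```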
Related papers
- BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression [91.23933111083389] (arXiv, 2024-10-20)
BRIEF (Bridging Retrieval and Inference through Evidence Fusion) is a lightweight approach that performs query-aware multi-hop reasoning.
Based on our synthetic data built entirely by open-source models, BRIEF generates more concise summaries.
- A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens [20.37803751979975] (arXiv, 2024-06-25)
When a text is fed into an embedding model, the resulting text embedding aligns with the key tokens in the input text.
We show that this phenomenon is universal and is not affected by model architecture, training strategy, or embedding method.
By adjusting the first principal component, we can align the text embedding with the key tokens.
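A loose way to poke at this claim is to compare a full-text embedding against embeddings of its individual tokens and see which tokens it sits closest to. The snippet below uses sentence-transformers as a stand-in embedding model; it is an illustrative probe, not the authors' procedure (in particular it does not perform their first-principal-component adjustment).

```python
# Hedged probe: which input tokens does a text embedding sit closest to?
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

text = "Large language models can compress prompts while preserving meaning."
tokens = sorted(set(text.lower().replace(".", "").split()))

text_emb = model.encode(text, convert_to_tensor=True)
token_embs = model.encode(tokens, convert_to_tensor=True)

# Rank tokens by cosine similarity to the full-text embedding.
scores = util.cos_sim(text_emb, token_embs)[0]
ranked = sorted(zip(tokens, scores.tolist()), key=lambda pair: -pair[1])
for tok, score in ranked[:5]:
    print(f"{tok:12s} {score:.3f}")
```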
- Training LLMs over Neurally Compressed Text [55.11828645767342] (arXiv, 2024-04-04)
This paper explores the idea of training large language models (LLMs) over highly compressed text.
We propose Equal-Info Windows, a novel compression technique whereby text is segmented into blocks that each compress to the same bit length.
We demonstrate effective learning over neurally compressed text that improves with scale, and outperforms byte-level baselines by a wide margin on perplexity and inference speed benchmarks.
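The sketch below illustrates the Equal-Info Windows idea under a simplifying assumption: zlib stands in for the paper's LM-based compressor, and each window is grown until its compressed size reaches a fixed bit budget (so windows land at or just above the budget rather than exactly on it).

```python
# Hedged sketch of Equal-Info Windows: grow a window one character at a time
# until its compressed size reaches a fixed bit budget, then start a new window.
# zlib is only a stand-in for the paper's LM-based arithmetic coder.

import zlib

def equal_info_windows(text: str, bits_per_window: int = 256) -> list[str]:
    windows, start = [], 0
    for end in range(1, len(text) + 1):
        chunk = text[start:end]
        compressed_bits = len(zlib.compress(chunk.encode("utf-8"))) * 8
        if compressed_bits >= bits_per_window:
            windows.append(chunk)
            start = end
    if start < len(text):  # flush the final partial window
        windows.append(text[start:])
    return windows

if __name__ == "__main__":
    sample = "equal-info windows segment text into blocks of equal compressed size " * 20
    for w in equal_info_windows(sample):
        print(len(w), len(zlib.compress(w.encode("utf-8"))) * 8)
```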
- LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression [43.048684907893104] (arXiv, 2024-03-19)
This paper focuses on task-agnostic prompt compression for better generalizability and efficiency.
We formulate prompt compression as a token classification problem to guarantee the faithfulness of the compressed prompt to the original one.
Our approach leads to lower latency by explicitly learning the compression objective with smaller models such as XLM-RoBERTa-large and mBERT.
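A minimal sketch of compression-as-token-classification: score each token with a keep-probability, retain the top fraction, and preserve the original order. The trivial frequency heuristic below merely stands in for LLMLingua-2's fine-tuned XLM-RoBERTa/mBERT classifier.

```python
# Hedged sketch: prompt compression as per-token keep/drop classification.
from collections import Counter

def compress_prompt(prompt: str, keep_ratio: float = 0.5) -> str:
    tokens = prompt.split()
    counts = Counter(t.lower() for t in tokens)
    # Stand-in keep score: rarer tokens are assumed to carry more information.
    scores = [1.0 / counts[t.lower()] for t in tokens]
    k = max(1, int(len(tokens) * keep_ratio))
    keep_idx = sorted(sorted(range(len(tokens)), key=lambda i: -scores[i])[:k])
    return " ".join(tokens[i] for i in keep_idx)

print(compress_prompt(
    "Please carefully read the following report and then summarize the report "
    "in three short bullet points for a busy executive audience.",
    keep_ratio=0.5,
))
```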
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation [61.53695868960846] (arXiv, 2023-10-06)
We propose compressing retrieved documents into textual summaries prior to in-context integration.
This not only reduces computational costs but also relieves LMs of the burden of identifying relevant information in long retrieved documents.
We show that our compressors trained for one LM can transfer to other LMs on the language modeling task and provide summaries largely faithful to the retrieved documents.
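In outline, the RECOMP-style flow looks like the sketch below: retrieved documents are compressed into a short summary first, and only the summary enters the LM's context. The trivial extractive step (keep sentences that overlap with the query) and the prompt layout are placeholders for the paper's trained extractive and abstractive compressors.

```python
# Hedged sketch: compress retrieved documents before in-context integration.

def compress_documents(query: str, documents: list[str], max_sentences: int = 3) -> str:
    query_terms = set(query.lower().split())
    sentences = [s.strip() for doc in documents for s in doc.split(".") if s.strip()]
    # Rank sentences by word overlap with the query and keep the top few.
    ranked = sorted(sentences, key=lambda s: -len(query_terms & set(s.lower().split())))
    return ". ".join(ranked[:max_sentences]) + "."

def build_prompt(query: str, documents: list[str]) -> str:
    summary = compress_documents(query, documents)
    return (f"Background (compressed from {len(documents)} documents):\n{summary}\n\n"
            f"Question: {query}\nAnswer:")

docs = [
    "Semantic compression shortens a prompt while preserving its meaning. It was studied with GPT-4.",
    "Lossless compression recovers every byte. Lossy compression discards detail to save space.",
]
print(build_prompt("How does semantic compression differ from lossless compression?", docs))
```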
- Compressing LLMs: The Truth is Rarely Pure and Never Simple [90.05366363633568] (arXiv, 2023-10-02)
LLM-KICK (Knowledge-Intensive Compressed LLM BenchmarK) aims to redefine the evaluation protocol for compressed Large Language Models.
LLM-KICK unveils many favorable merits and unfortunate plights of current SoTA compression methods.
LLM-KICK is designed to holistically assess compressed LLMs' ability for language understanding, reasoning, generation, in-context retrieval, in-context summarization, etc.
- In-context Autoencoder for Context Compression in a Large Language Model [70.7621953091318] (arXiv, 2023-07-13)
We propose the In-context Autoencoder (ICAE) to compress a long context into short compact memory slots.
ICAE is first pretrained using both autoencoding and language modeling objectives on massive text data.
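The toy sketch below captures the autoencoding idea with generic PyTorch modules rather than the actual ICAE architecture: a small encoder squeezes a token sequence into a handful of learnable memory slots, and a causal decoder is trained to reconstruct the sequence conditioned on those slots. All module sizes and the training objective details are illustrative assumptions.

```python
# Toy, assumption-laden sketch of context compression into memory slots.
import torch
import torch.nn as nn

class ToyICAE(nn.Module):
    def __init__(self, vocab_size=1000, d_model=128, num_slots=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.memory_queries = nn.Parameter(torch.randn(num_slots, d_model))
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                       # tokens: (batch, seq_len)
        x = self.embed(tokens)
        q = self.memory_queries.expand(tokens.size(0), -1, -1)
        # Encoder sees the context plus the memory queries; keep only the slots.
        enc_out = self.encoder(torch.cat([x, q], dim=1))
        slots = enc_out[:, -q.size(1):, :]           # (batch, num_slots, d_model)
        # Causal decoder reconstructs the context, cross-attending to the slots.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        dec_out = self.decoder(tgt=x, memory=slots, tgt_mask=tgt_mask)
        return self.lm_head(dec_out)                 # reconstruction logits

model = ToyICAE()
tokens = torch.randint(0, 1000, (2, 32))
logits = model(tokens)
# Autoencoding objective: predict each next token from the slots + prefix.
loss = nn.functional.cross_entropy(logits[:, :-1].transpose(1, 2), tokens[:, 1:])
```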
- LeTI: Learning to Generate from Textual Interactions [60.425769582343506] (arXiv, 2023-05-17)
We explore LMs' potential to learn from textual interactions (LeTI) that not only check their correctness with binary labels but also pinpoint and explain errors in their outputs through textual feedback.
Our focus is the code generation task, where the model produces code based on natural language instructions.
LeTI iteratively fine-tunes the model, using the LM objective, on a concatenation of natural language instructions, LM-generated programs, and textual feedback.
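The data side of that loop reduces to string concatenation. The hedged sketch below builds one training example from an instruction, the LM's generated program, and the textual feedback it received; the field headers and delimiters are illustrative assumptions, not the paper's exact format.

```python
# Hedged sketch: assemble one LeTI-style fine-tuning example as plain text.

def make_leti_example(instruction: str, generated_program: str, feedback: str) -> str:
    return (
        f"### Instruction:\n{instruction}\n\n"
        f"### Generated program:\n{generated_program}\n\n"
        f"### Feedback:\n{feedback}\n"
    )

example = make_leti_example(
    instruction="Write a function that returns the n-th Fibonacci number.",
    generated_program="def fib(n):\n    return fib(n - 1) + fib(n - 2)",
    feedback="RecursionError: missing base case for n <= 1; add one and retry.",
)
print(example)  # one item of the corpus fine-tuned on in the next LeTI iteration
```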
This list is automatically generated from the titles and abstracts of the papers on this site.