Compression Represents Intelligence Linearly
- URL: http://arxiv.org/abs/2404.09937v2
- Date: Mon, 19 Aug 2024 13:55:42 GMT
- Title: Compression Represents Intelligence Linearly
- Authors: Yuzhen Huang, Jinghan Zhang, Zifei Shan, Junxian He
- Abstract summary: Language modeling has been shown to be equivalent to compression.
Despite such appealing discussions, little empirical evidence is present for the interplay between compression and intelligence.
Across 12 benchmarks, our study brings together 31 public LLMs that originate from diverse organizations.
Remarkably, we find that LLMs' intelligence almost linearly correlates with their ability to compress external text corpora.
- Score: 14.651664954289354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a belief that learning to compress well will lead to intelligence. Recently, language modeling has been shown to be equivalent to compression, which offers a compelling rationale for the success of large language models (LLMs): the development of more advanced language models is essentially enhancing compression which facilitates intelligence. Despite such appealing discussions, little empirical evidence is present for the interplay between compression and intelligence. In this work, we examine their relationship in the context of LLMs, treating LLMs as data compressors. Given the abstract concept of "intelligence", we adopt the average downstream benchmark scores as a surrogate, specifically targeting intelligence related to knowledge and commonsense, coding, and mathematical reasoning. Across 12 benchmarks, our study brings together 31 public LLMs that originate from diverse organizations. Remarkably, we find that LLMs' intelligence -- reflected by average benchmark scores -- almost linearly correlates with their ability to compress external text corpora. These results provide concrete evidence supporting the belief that superior compression indicates greater intelligence. Furthermore, our findings suggest that compression efficiency, as an unsupervised metric derived from raw text corpora, serves as a reliable evaluation measure that is linearly associated with the model capabilities. We open-source our compression datasets as well as our data collection pipelines to facilitate future researchers to assess compression properly.
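To make the compression metric concrete, the following is a minimal sketch (not the authors' released pipeline) of estimating bits-per-character (BPC) for a causal language model on raw text with Hugging Face Transformers. The model name, sample text, and context length are illustrative placeholders; the paper's exact corpora and context-window setup may differ.
```python
# Minimal sketch: estimate bits-per-character (BPC) of a causal LM on raw text.
# BPC = total cross-entropy in bits / number of characters scored.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # any public causal LM; chosen here only for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def bits_per_character(text: str, max_length: int = 1024) -> float:
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)
    input_ids = enc["input_ids"]
    with torch.no_grad():
        out = model(input_ids, labels=input_ids)
    # out.loss is the mean negative log-likelihood (in nats) over the predicted tokens.
    num_predicted = input_ids.size(1) - 1
    total_bits = out.loss.item() * num_predicted / math.log(2)
    # Count only the characters that were actually scored (after truncation).
    num_chars = len(tokenizer.decode(input_ids[0]))
    return total_bits / num_chars

sample = "Compression and prediction are two sides of the same coin."
print(f"BPC on sample text: {bits_per_character(sample):.3f}")
```
Lower BPC means better compression; the paper's central claim is that lower BPC on external corpora tracks higher average benchmark scores across models.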
Related papers
- SuperARC: An Agnostic Test for Narrow, General, and Super Intelligence Based On the Principles of Recursive Compression and Algorithmic Probability [0.14061979259370275]
We introduce an open-ended test grounded in algorithmic probability.
It can avoid benchmark contamination in the quantitative evaluation of frontier models.
We prove that compression is equivalent to, and directly proportional to, a system's predictive power.
arXiv Detail & Related papers (2025-03-20T23:11:30Z) - LightThinker: Thinking Step-by-Step Compression [53.8069487638972]
We propose LightThinker, a method that enables large language models to dynamically compress intermediate thoughts during reasoning.
Inspired by human cognitive processes, LightThinker compresses thought steps into compact representations and discards the original reasoning chains.
Experiments show that LightThinker reduces peak memory usage and inference time, while maintaining competitive accuracy.
arXiv Detail & Related papers (2025-02-21T16:57:22Z) - Large Language Models for Lossless Image Compression: Next-Pixel Prediction in Language Space is All You Need [53.584140947828004]
Large language models (LLMs), with their unprecedented intelligence, are general-purpose lossless compressors for various data modalities.
We propose P$2$-LLM, a next-pixel prediction-based LLM, which integrates various elaborated insights and methodologies.
Experiments on benchmark datasets demonstrate that P$2$-LLM can beat SOTA classical and learned codecs.
arXiv Detail & Related papers (2024-11-19T12:15:40Z) - LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment [36.958867918858296]
Large language models (LLMs) have demonstrated strong capabilities, but their high computation and storage demands hinder practical deployment.
We present the Large Language Model Compression Benchmark (LLMCBench), a rigorously designed benchmark with an in-depth analysis for LLM compression algorithms.
arXiv Detail & Related papers (2024-10-28T14:45:01Z) - BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression [91.23933111083389]
BRIEF (Bridging Retrieval and Inference through Evidence Fusion) is a lightweight approach that performs query-aware multi-hop reasoning.
Based on our synthetic data built entirely by open-source models, BRIEF generates more concise summaries.
arXiv Detail & Related papers (2024-10-20T04:24:16Z) - Familiarity-aware Evidence Compression for Retrieval Augmented Generation [33.13513003367646]
We propose FaviComp, a training-free evidence compression technique that makes retrieved evidence more familiar to the target model.
FaviComp proactively lowers the perplexity of the compressed evidence with regard to the target model.
Experimental results demonstrate that FaviComp consistently outperforms existing baselines on multiple open-domain QA datasets.
arXiv Detail & Related papers (2024-09-19T05:14:55Z) - Ranking LLMs by compression [13.801767671391604]
We use five large language models as priors for compression and compare their performance on challenging natural language processing tasks.
Experimental results show that compression ratio and model performance are positively correlated, suggesting that compression ratio can serve as a general metric for evaluating large language models (see the correlation sketch after this list).
arXiv Detail & Related papers (2024-06-20T10:23:38Z) - LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [55.73370804397226]
Quantization, a key compression technique, can effectively mitigate the computation and memory demands of large language models by compressing and accelerating them.
We present LLMC, a plug-and-play compression toolkit, to fairly and systematically explore the impact of quantization.
Powered by this versatile toolkit, our benchmark covers three key aspects: calibration data, algorithms (three strategies), and data formats.
arXiv Detail & Related papers (2024-05-09T11:49:05Z) - The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models [11.156816338995503]
Compressing large language models (LLMs) provides faster inference, a smaller memory footprint, and enables local deployment.
Two standard compression techniques are pruning and quantization, with the former eliminating redundant connections in model layers and the latter representing model parameters with fewer bits.
Existing research on LLM compression primarily focuses on performance in terms of general metrics like perplexity or downstream task accuracy.
More fine-grained metrics, such as those measuring parametric knowledge, remain significantly underexplored.
arXiv Detail & Related papers (2023-12-01T22:27:12Z) - Compressing LLMs: The Truth is Rarely Pure and Never Simple [90.05366363633568]
LLM-KICK (Knowledge-Intensive Compressed LLM BenchmarK) aims to redefine the evaluation protocol for compressed large language models.
LLM-KICK unveils many favorable merits and unfortunate plights of current SoTA compression methods.
LLM-KICK is designed to holistically assess compressed LLMs' ability for language understanding, reasoning, generation, in-context retrieval, in-context summarization, etc.
arXiv Detail & Related papers (2023-10-02T17:42:37Z) - In-context Autoencoder for Context Compression in a Large Language Model [70.7621953091318]
We propose the In-context Autoencoder (ICAE) to compress a long context into short compact memory slots.
ICAE is first pretrained using both autoencoding and language modeling objectives on massive text data.
arXiv Detail & Related papers (2023-07-13T17:59:21Z) - Great Truths are Always Simple: A Rather Simple Knowledge Encoder for Enhancing the Commonsense Reasoning Capacity of Pre-Trained Models [89.98762327725112]
Commonsense reasoning in natural language is a desired ability of artificial intelligent systems.
For solving complex commonsense reasoning tasks, a typical solution is to enhance pre-trained language models (PTMs) with a knowledge-aware graph neural network (GNN) encoder.
Despite their effectiveness, these approaches rely on heavy architectures and cannot clearly explain how external knowledge resources improve the reasoning capacity of PTMs.
arXiv Detail & Related papers (2022-05-04T01:27:36Z)
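The correlation sketch referenced above for "Ranking LLMs by compression": given per-model BPC values and average benchmark scores, fit a least-squares line and compute Pearson's r. The numbers below are invented placeholders for illustration only, not results from any of the papers listed.
```python
# Minimal sketch of the compression-vs-capability correlation analysis.
# The arrays are made-up placeholder values, not real measurements.
import numpy as np

bpc = np.array([0.92, 0.85, 0.79, 0.74, 0.70])        # bits per character (lower = better compression)
avg_score = np.array([38.0, 46.5, 55.2, 61.8, 67.3])  # average benchmark score (%)

slope, intercept = np.polyfit(bpc, avg_score, 1)  # least-squares linear fit
r = np.corrcoef(bpc, avg_score)[0, 1]             # Pearson correlation coefficient

print(f"fit: score = {slope:.1f} * BPC + {intercept:.1f}")
print(f"Pearson r = {r:.3f}")  # strongly negative: better compression, higher score
```
A strongly negative r (better compression, higher score) is the pattern both papers report; the main paper additionally finds the relationship to be nearly linear.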