Fugu-MT Paper Translation (Abstract): Context Cascade Compression: Exploring the Upper Limits of Text Compression

Paper Overview: Context Cascade Compression: Exploring the Upper Limits of Text Compression

arxiv url: http://arxiv.org/abs/2511.15244v1
Date: Wed, 19 Nov 2025 09:02:56 GMT
Status: Translation complete
Last updated in system: 2025-11-20 15:51:28.709508
Title: Context Cascade Compression: Exploring the Upper Limits of Text Compression
Title (reference translation): Context Cascade Compression: Exploring the Upper Limits of Text Compression
Authors: Fanfan Liu, Haibo Qiu
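Abstract summary: We introduce Context Cascade Compression (C3) to explore the upper limits of text compression. At a 20x compression ratio, it achieves 98% decoding accuracy, compared to approximately 60% for DeepSeek-OCR. This indicates that, in the domain of context compression, C3 demonstrates superior performance and feasibility over optical character compression.
Reference score (independently computed attention metric): 3.013064618174921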
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Million-level token inputs in long-context tasks pose significant computational and memory challenges for Large Language Models (LLMs). Recently, DeepSeek-OCR conducted research into the feasibility of Contexts Optical Compression and achieved preliminary results. Inspired by this, we introduce Context Cascade Compression C3 to explore the upper limits of text compression. Our method cascades two LLMs of different sizes to handle the compression and decoding tasks. Specifically, a small LLM, acting as the first stage, performs text compression by condensing a long context into a set of latent tokens (e.g., 32 or 64 in length), achieving a high ratio of text tokens to latent tokens. A large LLM, as the second stage, then executes the decoding task on this compressed context. Experiments show that at a 20x compression ratio (where the number of text tokens is 20 times the number of latent tokens), our model achieves 98% decoding accuracy, compared to approximately 60% for DeepSeek-OCR. When we further increase the compression ratio to 40x, the accuracy is maintained at around 93%. This indicates that in the domain of context compression, C3 Compression demonstrates superior performance and feasibility over optical character compression. C3 uses a simpler, pure-text pipeline that ignores factors like layout, color, and information loss from a visual encoder. This also suggests a potential upper bound for compression ratios in future work on optical character compression, OCR, and related fields. Codes and model weights are publicly accessible at https://github.com/liufanfanlff/C3-Context-Cascade-Compression
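Abstract (reference translation): Inputs of millions of tokens in long-context tasks create significant computational and memory problems for Large Language Models (LLMs). Recently, DeepSeek-OCR studied the feasibility of Contexts Optical Compression and obtained preliminary results. We therefore introduce Context Cascade Compression (C3) and investigate the upper limits of text compression. The method cascades two LLMs of different sizes to perform the compression and decoding tasks. Specifically, a small LLM, acting as the first stage, compresses text by condensing a long context into a set of latent tokens (e.g., 32 or 64 in length), yielding a high ratio of text tokens to latent tokens. A large LLM, as the second stage, then performs the decoding task on the compressed context. Experiments show that at a 20x compression ratio (the number of text tokens is 20 times the number of latent tokens), the model achieves 98% decoding accuracy, compared to approximately 60% for DeepSeek-OCR. When the compression ratio is raised further to 40x, accuracy is maintained at around 93%. This indicates that, in the domain of context compression, C3 demonstrates superior performance and feasibility over optical character compression. C3 uses a simpler, pure-text pipeline that ignores factors such as layout, color, and information loss from a visual encoder. This also suggests a potential upper bound for compression ratios in future work on optical character compression, OCR, and related fields. Code and model weights are publicly accessible at https://github.com/liufanfanlff/C3-Context-Cascade-Compression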
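
The abstract defines the compression ratio as the number of text tokens divided by the number of latent tokens. The short Python sketch below only works through that arithmetic for the latent lengths (32, 64) and ratios (20x, 40x) mentioned above. The function names are hypothetical and are not taken from the C3 repository, and pairing every latent length with every ratio is an illustrative assumption, not a reported experimental configuration.

```python
def compression_ratio(num_text_tokens: int, num_latent_tokens: int) -> float:
    """Ratio of text tokens to latent tokens, as defined in the abstract."""
    if num_latent_tokens <= 0:
        raise ValueError("num_latent_tokens must be positive")
    return num_text_tokens / num_latent_tokens


def text_tokens_at_ratio(num_latent_tokens: int, ratio: int) -> int:
    """Number of text tokens represented by the latent tokens at a given ratio."""
    return num_latent_tokens * ratio


if __name__ == "__main__":
    # Hypothetical pairings: each latent length combined with each ratio.
    for latent in (32, 64):
        for ratio in (20, 40):
            text = text_tokens_at_ratio(latent, ratio)
            print(f"{latent} latent tokens at {ratio}x cover {text} text tokens")
```

Under these assumptions, 32 latent tokens at 20x would stand in for 640 text tokens, and 64 latent tokens at 40x for 2560.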

Related papers