Fugu-MT Paper Translation (Abstract): InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory

Paper summary: InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory

arxiv url: http://arxiv.org/abs/2402.04617v1
Date: Wed, 7 Feb 2024 06:50:42 GMT
Status: Translation complete
System update date: 2024-02-08 16:27:29.506333
Title: InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory
Title (reference translation): InfLLM: The Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences Using Training-Free Memory
Authors: Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Song Han, Maosong Sun
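Abstract summary: InfLLM stores distant contexts in additional memory units and uses an efficient mechanism to look up token-relevant units for attention computation. The paper proposes InfLLM, a training-free memory-based method, to reveal the ability of LLMs to process streaming long sequences.
Reference score (independently computed attention level): 99.22913822705523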
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) have emerged as a cornerstone in real-world applications with lengthy streaming inputs, such as LLM-driven agents. However, existing LLMs, pre-trained on sequences with a restricted maximum length, cannot generalize to longer sequences because of out-of-domain and distraction issues. To alleviate these issues, existing efforts employ sliding attention windows and discard distant tokens in order to process extremely long sequences. Unfortunately, these approaches inevitably fail to capture the long-distance dependencies within sequences that deep semantic understanding requires. This paper introduces a training-free memory-based method, InfLLM, to unveil the intrinsic ability of LLMs to process streaming long sequences. Specifically, InfLLM stores distant contexts in additional memory units and employs an efficient mechanism to look up token-relevant units for attention computation. InfLLM thereby allows LLMs to process long sequences efficiently while retaining the ability to capture long-distance dependencies. Without any training, InfLLM enables LLMs pre-trained on sequences of a few thousand tokens to outperform competitive baselines that continually train these LLMs on long sequences. Even when the sequence length is scaled to 1,024K, InfLLM still effectively captures long-distance dependencies.
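Abstract (reference translation): Large language models (LLMs) have emerged as a foundation for real-world applications with long streaming inputs, such as LLM-driven agents. However, existing LLMs pre-trained on sequences with a restricted maximum length cannot generalize to longer sequences because of out-of-domain and distraction problems. To mitigate these problems, existing work adopts sliding attention windows and discards distant tokens in order to process very long sequences. Unfortunately, these approaches inevitably fail to capture the long-distance dependencies within a sequence that are needed to understand its semantics deeply. This paper proposes InfLLM, a training-free memory-based method, to reveal the ability of LLMs to process streaming long sequences. Specifically, InfLLM stores distant contexts in additional memory units and uses an efficient mechanism to look up token-relevant units for attention computation. As a result, InfLLM lets LLMs process long sequences efficiently while retaining the ability to capture long-distance dependencies. Without any training, InfLLM enables LLMs pre-trained on sequences of a few thousand tokens to outperform competitive baselines that continually train these LLMs on long sequences. Even when the sequence length is scaled to 1,024K, InfLLM still effectively captures long-distance dependencies.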
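
The lookup described in the abstract (distant context stored in memory units, token-relevant units retrieved for attention) can be illustrated with a minimal pure-Python sketch. The abstract does not say how units are built or scored, so everything specific here is an assumption made for illustration: fixed-size units, a mean-key representative per unit, dot-product scoring, top-k selection, and all function names. It is not the authors' implementation.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def split_into_units(keys, values, unit_size):
    """Group distant (key, value) pairs into fixed-size memory units."""
    units = []
    for start in range(0, len(keys), unit_size):
        k = keys[start:start + unit_size]
        v = values[start:start + unit_size]
        # Assumption: a unit is summarized by the mean of its keys.
        dim = len(k[0])
        rep = [sum(vec[d] for vec in k) / len(k) for d in range(dim)]
        units.append({"keys": k, "values": v, "rep": rep})
    return units


def lookup_units(query, units, top_k):
    """Return the top_k units whose representative best matches the query."""
    ranked = sorted(units, key=lambda u: dot(query, u["rep"]), reverse=True)
    return ranked[:top_k]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def memory_attention(query, local_keys, local_values, units, top_k):
    """Attend over the retrieved memory units plus the local window."""
    selected = lookup_units(query, units, top_k)
    keys = [k for u in selected for k in u["keys"]] + local_keys
    values = [v for u in selected for v in u["values"]] + local_values
    return attend(query, keys, values)


# Toy usage: four distant tokens form two units; one token stays local.
distant_keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
distant_values = [[10.0, 0.0], [10.0, 0.0], [0.0, 10.0], [0.0, 10.0]]
units = split_into_units(distant_keys, distant_values, unit_size=2)
output = memory_attention([0.0, 5.0], [[0.0, 1.0]], [[0.0, 10.0]], units, top_k=1)
```

In this toy case the query aligns with the second unit, so only that unit's keys and values join the local window in the attention computation. The cost of attention then depends on `top_k` and the window size rather than on the full sequence length, which is the trade-off the abstract points to.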

Related papers