Fugu-MT Paper Translation (Abstract): RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse

Paper overview: RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse

arxiv url: http://arxiv.org/abs/2511.03475v1
Date: Wed, 05 Nov 2025 13:59:01 GMT
Status: Translation complete
Last updated in system: 2025-11-06 18:19:32.439177
Title: RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse
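Title (reference translation): RAGBoost: Efficient Retrieval-Augmented Generation via Accuracy-Preserving Context Reuse
Authors: Yinsicheng Jiang, Yeqi Huang, Liang Cheng, Cheng Deng, Xuan Sun, Luo Mai
Abstract summary: Retrieval-augmented generation (RAG) enhances large language models (LLMs) with retrieved context. Existing caching techniques either preserve accuracy with low cache reuse or improve reuse at the cost of degraded reasoning quality. RAGBoost is an efficient RAG system that achieves high cache reuse without sacrificing accuracy through accuracy-preserving context reuse.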
Reference score (independently computed attention measure): 39.76548092849437
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) with retrieved context but often suffers from downgraded prefill performance as modern applications demand longer and more complex inputs. Existing caching techniques either preserve accuracy with low cache reuse or improve reuse at the cost of degraded reasoning quality. We present RAGBoost, an efficient RAG system that achieves high cache reuse without sacrificing accuracy through accuracy-preserving context reuse. RAGBoost detects overlapping retrieved items across concurrent sessions and multi-turn interactions, using efficient context indexing, ordering, and de-duplication to maximize reuse, while lightweight contextual hints maintain reasoning fidelity. It integrates seamlessly with existing LLM inference engines and improves their prefill performance by 1.5-3X over state-of-the-art methods, while preserving or even enhancing reasoning accuracy across diverse RAG and agentic AI workloads. Our code is released at: https://github.com/Edinburgh-AgenticAI/RAGBoost.
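Abstract (reference translation): Retrieval-augmented generation (RAG) augments large language models (LLMs) with retrieved context, but often suffers from degraded prefill performance as modern applications demand longer and more complex inputs. Existing caching techniques either maintain accuracy with low cache reuse or improve reuse at the cost of degraded reasoning quality. RAGBoost is an efficient RAG system that achieves high cache reuse without sacrificing accuracy, by means of accuracy-preserving context reuse. RAGBoost detects overlapping retrieved items across concurrent sessions and multi-turn interactions, using efficient context indexing, ordering, and de-duplication to maximize reuse, while lightweight contextual hints maintain reasoning fidelity. It integrates seamlessly with existing LLM inference engines and improves prefill performance by 1.5-3x over state-of-the-art methods, while preserving or even improving reasoning accuracy across diverse RAG and agentic AI workloads. The code is released at https://github.com/Edinburgh-AgenticAI/RAGBoost.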
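
Illustrative sketch (not from the paper): the abstract names context indexing, ordering, de-duplication, and lightweight contextual hints, but gives no implementation detail. The toy Python below shows one way those ideas could fit together, assuming a prefix-style cache in which two requests share prefill work only when their documents appear in the same leading order. The names PrefixTrie and plan_context, the greedy prefix match, and the hint format are all hypothetical and are not taken from the RAGBoost code.

```python
from typing import Dict, List, Set, Tuple


class PrefixTrie:
    """Toy index of document-ID sequences whose prefill is assumed cached."""

    def __init__(self) -> None:
        self.children: Dict[str, "PrefixTrie"] = {}

    def insert(self, doc_ids: List[str]) -> None:
        node = self
        for doc_id in doc_ids:
            node = node.children.setdefault(doc_id, PrefixTrie())

    def longest_reusable_prefix(self, wanted: Set[str]) -> List[str]:
        """Greedily follow cached branches made only of wanted documents."""
        prefix: List[str] = []
        node = self
        while True:
            matches = [d for d in node.children if d in wanted and d not in prefix]
            if not matches:
                return prefix
            prefix.append(matches[0])
            node = node.children[matches[0]]


def plan_context(retrieved: List[str], trie: PrefixTrie) -> Tuple[List[str], int, str]:
    """Return (reordered doc IDs, number of reused leading docs, relevance hint)."""
    # De-duplicate while keeping the retriever's relevance order.
    unique = list(dict.fromkeys(retrieved))
    # Put documents that match an already-cached sequence first.
    prefix = trie.longest_reusable_prefix(set(unique))
    ordered = prefix + [d for d in unique if d not in prefix]
    trie.insert(ordered)
    # Assumed hint format: tell the model the original relevance ranking.
    hint = "Relevance order: " + " > ".join(unique)
    return ordered, len(prefix), hint


trie = PrefixTrie()
first, reused_first, _ = plan_context(["d1", "d2", "d3"], trie)
second, reused_second, hint = plan_context(["d3", "d1", "d1", "d4"], trie)
print(second, reused_second, hint)
```

In this sketch the second request is reordered so that d1, which is already cached at the front of the first request, comes first. The hint string then carries the retriever's original ranking, so the reordering does not hide which document was judged most relevant.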


Related papers