Fugu-MT Paper Translation (Abstract): From Static to Dynamic: A Streaming RAG Approach to Real-time Knowledge Base

Paper abstract: From Static to Dynamic: A Streaming RAG Approach to Real-time Knowledge Base

arxiv url: http://arxiv.org/abs/2508.05662v1
Date: Thu, 31 Jul 2025 14:03:19 GMT
Status: Translation complete
Last updated in system: 2025-08-17 22:58:06.129095
Title: From Static to Dynamic: A Streaming RAG Approach to Real-time Knowledge Base
Authors: Yuzhou Zhu
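Abstract summary: Streaming RAG is a unified pipeline that combines cosine screening, mini-batch clustering, and a heavy-hitter filter to maintain a compact prototype set. Experiments on eight real-time streams show gains in Recall@10 (up to 3 points, p < 0.01), end-to-end latency below 15 ms, and throughput above 900 documents per second under a 150 MB budget.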
Reference score (independently computed attention score): 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dynamic streams from news feeds, social media, sensor networks, and financial markets challenge static RAG frameworks. Full-scale indices incur high memory costs; periodic rebuilds introduce latency that undermines data freshness; naive sampling sacrifices semantic coverage. We present Streaming RAG, a unified pipeline that combines multi-vector cosine screening, mini-batch clustering, and a counter-based heavy-hitter filter to maintain a compact prototype set. We further prove an approximation bound $E[R(K_t)] \ge R^* - L \Delta$ linking retrieval quality to clustering variance. An incremental index upsert mechanism refreshes prototypes without interrupting queries. Experiments on eight real-time streams show statistically significant gains in Recall@10 (up to 3 points, p < 0.01), end-to-end latency below 15 ms, and throughput above 900 documents per second under a 150 MB budget. Hyperparameter sensitivity analysis over cluster count, admission probability, relevance threshold, and counter capacity validates the default settings. In open-domain question answering with GPT-3.5 Turbo, we record a 3.2-point gain in Exact Match and a 2.8-point gain in F1 on SQuAD; abstractive summarization yields ROUGE-L improvements. Streaming RAG establishes a new Pareto frontier for retrieval augmentation.
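
Illustrative sketch (not from the paper): the abstract names cosine screening and a counter-based heavy-hitter filter but gives no implementation details. The Python below shows one plausible reading, assuming a single-vector cosine threshold for screening and a Misra-Gries summary as the counter-based filter. The names `cosine`, `passes_screen`, `MisraGries`, and `filter_stream`, the threshold value, and the assumption that cluster ids are assigned upstream (the mini-batch clustering step is omitted) are all hypothetical.

```python
import math
from typing import Dict, Hashable, Iterable, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def passes_screen(doc: Vector, topics: Iterable[Vector], threshold: float = 0.5) -> bool:
    """Admit a document if it is close enough to at least one topic vector."""
    return any(cosine(doc, t) >= threshold for t in topics)


class MisraGries:
    """Counter-based heavy-hitter summary that tracks at most `capacity` keys."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts: Dict[Hashable, int] = {}

    def add(self, key: Hashable) -> None:
        if key in self.counts:
            self.counts[key] += 1
        elif len(self.counts) < self.capacity:
            self.counts[key] = 1
        else:
            # Summary is full: decrement every counter and evict those at zero.
            for k in list(self.counts):
                self.counts[k] -= 1
                if self.counts[k] == 0:
                    del self.counts[k]


def filter_stream(
    stream: Iterable[Tuple[Hashable, Vector]],
    topics: Sequence[Vector],
    capacity: int,
    threshold: float = 0.5,
) -> Dict[Hashable, int]:
    """Screen each (cluster_id, vector) pair, then count surviving cluster ids."""
    summary = MisraGries(capacity)
    for cluster_id, vec in stream:
        if passes_screen(vec, topics, threshold):
            summary.add(cluster_id)
    return summary.counts


if __name__ == "__main__":
    topics = [(1.0, 0.0)]
    stream = [("a", (1.0, 0.1)), ("b", (0.0, 1.0)), ("a", (0.9, 0.2)), ("c", (1.0, 0.0))]
    print(filter_stream(stream, topics, capacity=2))
```

In this sketch the screening step discards documents far from every topic vector before they reach the counters, so the bounded-size summary only spends capacity on relevant clusters. How the paper actually couples these stages, and how its admission probability enters, is not stated in the abstract.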
