Fugu-MT Paper Translation (Abstract): GPU-Resident Inverted File Index for Streaming Vector Databases

Paper overview: GPU-Resident Inverted File Index for Streaming Vector Databases

arxiv url: http://arxiv.org/abs/2601.11808v1
Date: Fri, 16 Jan 2026 22:20:52 GMT
Status: translation complete
Last updated in system: 2026-01-28 13:19:18.642615
Title: GPU-Resident Inverted File Index for Streaming Vector Databases
Authors: Dongfang Zhao
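Abstract summary: SIVF (Streaming Inverted File) is a GPU-native architecture designed to give vector databases high-velocity data ingestion and deletion capabilities. SIVF replaces the static memory layout with a slab-based allocation system and a validity bitmap, enabling lock-free, in-place mutation directly in VRAM. SIVF is evaluated against the industry-standard GPU IVF implementation on the SIFT1M and GIST1M datasets.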
Reference score (independently computed attention score): 0.9179857807576733
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Vector search has emerged as the computational backbone of modern AI infrastructure, powering critical systems ranging from Vector Databases to Retrieval-Augmented Generation (RAG). While the GPU-accelerated Inverted File (IVF) index acts as one of the most widely used techniques for these large-scale workloads due to its memory efficiency, its traditional architecture remains fundamentally static. Existing designs rely on rigid and contiguous memory layouts that lack native support for in-place mutation, creating a severe bottleneck for streaming scenarios. In applications requiring real-time knowledge updates, such as live recommendation engines or dynamic RAG systems, maintaining index freshness necessitates expensive CPU-GPU roundtrips that cause system latency to spike from milliseconds to seconds. In this paper, we propose SIVF (Streaming Inverted File), a new GPU-native architecture designed to empower vector databases with high-velocity data ingestion and deletion capabilities. SIVF replaces the static memory layout with a slab-based allocation system and a validity bitmap, enabling lock-free and in-place mutation directly in VRAM. We further introduce a GPU-resident address translation table (ATT) to resolve the overhead of locating vectors, providing $O(1)$ access to physical storage slots. We evaluate SIVF against the industry-standard GPU IVF implementation on the SIFT1M and GIST1M datasets. Microbenchmarks demonstrate that SIVF reduces deletion latency by up to $13,300\times$ (from 11.8 seconds to 0.89 ms on GIST1M) and improves ingestion throughput by $36\times$ to $105\times$. In end-to-end sliding window scenarios, SIVF eliminates system freezes and achieves a $161\times$ to $266\times$ speedup with single-digit millisecond latency. Notably, this performance incurs negligible storage penalty, maintaining less than 0.8\% memory overhead compared to static indices.
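The three mechanisms named in the abstract (slab-based allocation, a validity bitmap, and an address translation table) can be illustrated with a small CPU-side toy model. This is only a sketch under stated assumptions, not the paper's implementation: the class `ToyStreamingIVF`, the constant `SLAB_SIZE`, and all method names are hypothetical, the code is single-threaded pure Python rather than lock-free GPU code, and deleted slots are simply left as tombstones because the abstract does not say how slots are reclaimed.

```python
SLAB_SIZE = 4  # slots per slab; deliberately tiny (hypothetical value)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyStreamingIVF:
    """Toy IVF index: each inverted list is a chain of fixed-size slabs."""

    def __init__(self, centroids):
        self.centroids = [tuple(c) for c in centroids]
        self.lists = [[] for _ in self.centroids]  # slab indices per list
        self.slabs = []  # every slab ever allocated, across all lists
        self.att = {}    # address translation table: id -> (slab, slot)

    def _nearest(self, vec, n=1):
        """Indices of the n centroids closest to vec."""
        order = sorted(range(len(self.centroids)),
                       key=lambda i: sq_dist(self.centroids[i], vec))
        return order[:n]

    def insert(self, vid, vec):
        if vid in self.att:          # treat re-insert as an update
            self.delete(vid)
        chain = self.lists[self._nearest(vec)[0]]
        if not chain or self.slabs[chain[-1]]["used"] == SLAB_SIZE:
            # The tail slab is full (or absent): append a fresh slab
            # instead of reallocating one contiguous array.
            self.slabs.append({"vecs": [None] * SLAB_SIZE,
                               "ids": [None] * SLAB_SIZE,
                               "valid": [False] * SLAB_SIZE,
                               "used": 0})
            chain.append(len(self.slabs) - 1)
        s = chain[-1]
        slab = self.slabs[s]
        slot = slab["used"]
        slab["vecs"][slot] = tuple(vec)
        slab["ids"][slot] = vid
        slab["valid"][slot] = True   # set the validity bit
        slab["used"] += 1
        self.att[vid] = (s, slot)

    def delete(self, vid):
        """O(1) delete: one table lookup, then clear one validity bit."""
        loc = self.att.pop(vid, None)
        if loc is None:
            return False
        s, slot = loc
        self.slabs[s]["valid"][slot] = False
        return True

    def search(self, query, k=1, nprobe=1):
        """Scan the nprobe nearest lists, skipping invalidated slots."""
        hits = []
        for li in self._nearest(query, nprobe):
            for s in self.lists[li]:
                slab = self.slabs[s]
                for slot in range(slab["used"]):
                    if slab["valid"][slot]:
                        hits.append((sq_dist(slab["vecs"][slot], query),
                                     slab["ids"][slot]))
        hits.sort()
        return [vid for _, vid in hits[:k]]


index = ToyStreamingIVF(centroids=[(0.0, 0.0), (10.0, 10.0)])
index.insert(1, (0.1, 0.2))
index.insert(2, (9.0, 9.5))
index.insert(3, (0.3, 0.1))
print(index.search((0.0, 0.0), k=2))  # [1, 3]
index.delete(1)
print(index.search((0.0, 0.0), k=2))  # [3]
```

In this sketch a delete never moves or copies vector data: it touches one table entry and one bit, which is the property the abstract credits for avoiding CPU-GPU roundtrips. A GPU version would presumably replace the Python dictionary and lists with device-resident arrays and atomic operations, but those details are not given in the abstract.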
