Fugu-MT Paper Translation (Abstract): Coordination-Free Lane Partitioning for Convergent ANN Search

Paper summary: Coordination-Free Lane Partitioning for Convergent ANN Search

arxiv url: http://arxiv.org/abs/2511.04221v1
Date: Thu, 06 Nov 2025 09:36:18 GMT
Status: Translation complete
Last updated in system: 2025-11-07 20:17:53.379103
Title: Coordination-Free Lane Partitioning for Convergent ANN Search
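Title (reference translation): Coordination-Free Lane Partitioning for Convergent ANN Search
Authors: Carl Kugblenu, Petri Vuorimaa
Abstract summary: Production vector search systems often fan out each query across parallel lanes to meet latency service-level objectives (SLOs). The paper presents a coordination-free lane partitioner that turns duplication into complementary work at the same cost and deadline.
Reference score (independently computed attention score): 0.0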
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Production vector search systems often fan out each query across parallel lanes (threads, replicas, or shards) to meet latency service-level objectives (SLOs). In practice, these lanes rediscover the same candidates, so extra compute does not increase coverage. We present a coordination-free lane partitioner that turns duplication into complementary work at the same cost and deadline. For each query we (1) build a deterministic candidate pool sized to the total top-k budget, (2) apply a per-query pseudorandom permutation, and (3) assign each lane a disjoint slice of positions. Lanes then return different results by construction, with no runtime coordination. At equal cost with four lanes (total candidate budget 64), on SIFT1M (1M SIFT feature vectors) with Hierarchical Navigable Small World graphs (HNSW) recall@10 rises from 0.249 to 0.999 while lane overlap falls from nearly 100% to 0%. On MS MARCO (8.8M passages) with HNSW, hit@10 improves from 0.200 to 0.601 and Mean Reciprocal Rank at 10 (MRR@10) from 0.133 to 0.330. For inverted file (IVF) indexes we see smaller but consistent gains (for example, +11% on MS MARCO) by de-duplicating list routing. A microbenchmark shows planner overhead of ~37 microseconds per query (mean at the main setting) with linear growth in the number of merged candidates. These results yield a simple operational guideline: size the per-query pool to the total budget, deterministically partition positions across lanes, and turn redundant fan-out into complementary coverage without changing budget or deadline.
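The three steps listed in the abstract (deterministic pool, per-query pseudorandom permutation, disjoint slices per lane) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `lane_slice`, the SHA-256-derived seed, the use of Python's `random` module for the permutation, and the strided slice assignment are all hypothetical choices, and the candidate pool is passed in as a plain list of ids rather than built from an index.

```python
import hashlib
import random


def lane_slice(query_id: str, pool: list[int], lane: int, num_lanes: int) -> list[int]:
    """Return the candidates one lane should handle for one query.

    Assumption: every lane sees the same deterministic `pool` (sized to the
    total budget) and the same `query_id`, so each lane can compute its own
    slice locally without talking to the other lanes.
    """
    # Seed the permutation from the query alone, so it is identical in all lanes.
    digest = hashlib.sha256(query_id.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")

    # Per-query pseudorandom permutation of pool positions.
    positions = list(range(len(pool)))
    random.Random(seed).shuffle(positions)

    # Disjoint slice: lane i takes permuted positions i, i + num_lanes, ...
    return [pool[p] for p in positions[lane::num_lanes]]


# Hypothetical pool of 64 candidate ids (4 lanes x 16 candidates each).
pool = list(range(1000, 1064))
lanes = [lane_slice("query-42", pool, lane, num_lanes=4) for lane in range(4)]
```

Because every lane derives the same permutation from the query id, the slices are disjoint and together cover the whole pool, which is the property the abstract describes as "different results by construction".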
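Abstract (reference translation): Production vector search systems often fan out each query across parallel lanes (threads, replicas, or shards) to meet latency service-level objectives (SLOs). In practice, these lanes rediscover the same candidates, so the extra compute does not increase coverage. The paper presents a coordination-free lane partitioner that turns duplication into complementary work at the same cost and deadline. For each query, it (1) builds a deterministic candidate pool sized to the total top-k budget, (2) applies a per-query pseudorandom permutation, and (3) assigns each lane a disjoint slice of positions. Lanes then return different results by construction, with no runtime coordination. At equal cost with four lanes (total candidate budget 64), on SIFT1M (1M SIFT feature vectors) with Hierarchical Navigable Small World (HNSW) graphs, recall@10 rises from 0.249 to 0.999 while lane overlap falls from nearly 100% to 0%. On MS MARCO (8.8M passages) with HNSW, hit@10 improves from 0.200 to 0.601 and Mean Reciprocal Rank at 10 (MRR@10) from 0.133 to 0.330. For inverted file (IVF) indexes, de-duplicating list routing gives smaller but consistent gains (for example, +11% on MS MARCO). A microbenchmark shows planner overhead of about 37 microseconds per query (mean at the main setting), growing linearly with the number of merged candidates. These results yield a simple operational guideline: size the per-query pool to the total budget, deterministically partition positions across lanes, and turn redundant fan-out into complementary coverage without changing budget or deadline.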
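The metrics quoted in the abstract can be computed per query roughly as below. This is a sketch under assumed definitions: hit@k, recall@k and MRR@k follow their standard forms, while `lane_overlap` (mean pairwise Jaccard similarity between lane result sets) is a hypothetical stand-in, since the abstract does not define how overlap is measured. All function names are illustrative, and reported figures would be averages of these per-query values over a query set.

```python
from itertools import combinations


def hit_at_k(ranked, relevant, k=10):
    """1.0 if any relevant item appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def recall_at_k(ranked, true_neighbors, k=10):
    """Fraction of the true top-k neighbors found in the returned top k."""
    truth = set(true_neighbors[:k])
    return len(truth & set(ranked[:k])) / len(truth) if truth else 0.0


def mrr_at_k(ranked, relevant, k=10):
    """Reciprocal rank of the first relevant item within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def lane_overlap(lanes):
    """Assumed definition: mean pairwise Jaccard similarity of lane results."""
    pairs = list(combinations([set(lane) for lane in lanes], 2))
    if not pairs:
        return 0.0
    # `or 1` avoids division by zero when both lanes returned nothing.
    return sum(len(a & b) / (len(a | b) or 1) for a, b in pairs) / len(pairs)
```

Under this definition, lanes that all return the same candidates score 1.0 and fully disjoint lanes score 0.0, matching the two extremes the abstract contrasts.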


Related papers