Fugu-MT paper translation (abstract): When Global Gating Is Enough: Admission-Time Hubness Control in Anisotropic Vector Retrieval Systems

Paper abstract: When Global Gating Is Enough: Admission-Time Hubness Control in Anisotropic Vector Retrieval Systems

arxiv url: http://arxiv.org/abs/2606.19692v1
Date: Thu, 18 Jun 2026 01:40:36 GMT
Status: translation complete
Last updated in system: 2026-06-19 18:23:39.601958
Title: When Global Gating Is Enough: Admission-Time Hubness Control in Anisotropic Vector Retrieval Systems
Authors: Prashant Kumar Pathak, Tarun Kumar Sharma
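Abstract summary: We study admission-time control: before insertion, each candidate is scored against sentinel queries and hub-like documents are quarantined. Across two 100,000-document corpora, five encoders, and disjoint attacker and defender query sets, a global gate achieves recall 1.0 at the decisive embedding-space point. On HNSW, admission adds about 3.1% to ingestion latency, scoring remains flat to 10^6 vectors, and 1.2% of decisions flip under approximate indexing.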
Reference score (independently computed attention score): 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vector hubness, where a few points become nearest neighbors of many queries, creates a poisoning risk in retrieval-augmented generation (RAG): one injected document can influence unrelated requests. Existing defenses use periodic reverse-kNN scans, leaving an exposure window and repeated corpus-wide work. We study admission-time control, scoring each candidate against sentinel queries and quarantining hub-like documents before insertion. Across two 100,000-document corpora, five encoders, and disjoint attacker and defender query sets, a global gate achieves recall 1.0 at the decisive embedding-space point (>=0.92 across the effective range) and 0.91 +/- 0.07 on HotFlip attacks, with 1% false positives on general documents. A per-topic gate provides no reliable benefit, consistent with anisotropy coupling local and global visibility. Thresholds are maintained incrementally, with corpus-size-independent insertion cost and amortized deletion cost. On HNSW, admission adds about 3.1% to ingestion latency, scoring remains flat to 10^6 vectors, and 1.2% of decisions flip under approximate indexing, none involving attacks. Provenance complements the gate for natural or tight-domain hubs.
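The admission-time idea in the abstract (score each candidate against sentinel queries, quarantine hub-like documents before insertion) can be illustrated with a minimal sketch. The abstract does not specify the scoring function, the threshold, or the value of k, so everything below is an assumption made for illustration: the hub score is taken to be the fraction of sentinel queries whose current top-k the candidate would enter, similarities are brute-force cosine rather than HNSW, and the names `kth_neighbor_similarity`, `hub_score`, `admit`, and `max_hub_score` are hypothetical. The incremental threshold maintenance mentioned in the abstract is not shown.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def kth_neighbor_similarity(corpus, sentinels, k):
    """For each sentinel query, cosine similarity of its current k-th nearest corpus vector."""
    sims = normalize(sentinels) @ normalize(corpus).T  # shape (n_sentinels, n_docs)
    # np.partition puts the k-th largest value of each row at index -k.
    return np.partition(sims, -k, axis=1)[:, -k]

def hub_score(candidate, sentinels, kth_sims):
    """Assumed score: fraction of sentinel queries whose top-k the candidate would enter."""
    cand_sims = normalize(sentinels) @ normalize(candidate)
    return float(np.mean(cand_sims > kth_sims))

def admit(candidate, sentinels, kth_sims, max_hub_score=0.05):
    """Return True to insert the candidate, False to quarantine it (threshold is hypothetical)."""
    return hub_score(candidate, sentinels, kth_sims) <= max_hub_score

def unit(deg):
    """Toy 2-D unit vector at the given angle in degrees."""
    r = np.deg2rad(deg)
    return np.array([np.cos(r), np.sin(r)])

# Toy data: two corpus documents, three sentinel queries lying between them.
corpus = np.stack([unit(0), unit(90)])
sentinels = np.stack([unit(30), unit(45), unit(60)])
kth_sims = kth_neighbor_similarity(corpus, sentinels, k=1)

hub_like = unit(45)   # closer to every sentinel than any existing document
benign = unit(-30)    # farther from every sentinel than its current neighbor

print(admit(hub_like, sentinels, kth_sims))  # False: quarantined
print(admit(benign, sentinels, kth_sims))    # True: admitted
```

In this toy setup the per-sentinel k-th neighbor similarities are computed once and reused for every candidate, so scoring a candidate costs one matrix-vector product over the sentinel set and does not touch the corpus. That is one plausible way to read the abstract's claim of corpus-size-independent insertion cost, not a description of the authors' implementation.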

Related papers