Fugu-MT Paper Translation (Abstract): Generalized Range Filtering Approximate Nearest Neighbor Search: Containment and Overlap [Technical Report]

Paper abstract: Generalized Range Filtering Approximate Nearest Neighbor Search: Containment and Overlap [Technical Report]

arxiv url: http://arxiv.org/abs/2605.26474v1
Date: Tue, 26 May 2026 02:31:27 GMT
Status: Translation complete
Last updated in system: 2026-05-27 17:51:41.584451
Title: Generalized Range Filtering Approximate Nearest Neighbor Search: Containment and Overlap [Technical Report]
Title (reference translation): Generalized Range-Filtered Approximate Nearest Neighbor Search: Containment and Overlap [Technical Report]
Authors: Yingfan Liu, Tong Wu, Jiadong Xie, Yang Zhao, Jeffrey Xu Yu, Jiangtao Cui
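Abstract summary: Approximate nearest neighbor (ANN) search with range filters has recently attracted attention. This paper takes up a generalized form of the problem: ANN search with exact range-range (RR) predicates on a range-valued attribute, called RR filtering ANN (RRANN). We introduce a new approach, the multi-segment tree graph, that efficiently handles arbitrary RR predicates. Experiments on real-world data demonstrate the effectiveness of our approach on RRANN queries, achieving up to 12.5x speedups at the same accuracy as the baselines.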
Reference score (independently computed attention score): 25.485076435677957
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Approximate nearest neighbor (ANN) search with range filters has recently garnered significant attention. This paper delves into a generalized form of this problem, i.e., ANN search with exact range-range (RR) predicates on a range-valued attribute, named RR filtering ANN (RRANN). Specifically, given $n$ vectors in $\mathbb{R}^d$, each vector $v_i$ is associated with a numeric range $[l_i, r_i]$, symbolizing aspects like a price range or time interval. An RRANN query $(v_q, l_q, r_q)$ aims at finding $k$ vectors closest to $v_q$ within the vectors satisfying an arbitrary RR predicate defined between the query range $[l_q, r_q]$ and the object range $[l_i, r_i]$. The RR predicate remains unspecified, enabling user-defined conditions. It may encompass containment ($[l_i, r_i] \subseteq [l_q, r_q]$ or $[l_q, r_q] \subseteq [l_i, r_i]$), overlap ($l_i \le l_q \le r_i \le r_q$ or $l_q \le l_i \le r_q \le r_i$), or a disjunction of them. RRANN has broad applications in queries related to price ranges or time intervals, and it generalizes existing variants of ANN search with range filters. However, existing dedicated approaches for these problems lack the capacity to support queries with arbitrary RR predicates. Hence, we introduce a new approach, labeled multi-segment tree graph. It efficiently handles arbitrary RR predicates by avoiding traversal through non-predicate-satisfied nodes, and keeps equivalent index size and construction time to state-of-the-art methods for RFANN. Extensive experiments on real-world data demonstrate the efficacy of our approach in RRANN queries, achieving up to 12.5x speedups with the same accuracy as the baselines. Moreover, our approach attains comparable RFANN search performance and notably superior IFANN and TSANN search performance compared to the respective state-of-the-art approaches. Our code is available at https://github.com/FanEDG/MSTG.
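Abstract (reference translation): Approximate nearest neighbor (ANN) search with range filters has attracted considerable attention in recent years. This paper addresses a generalized form of the problem: ANN search with exact range-range (RR) predicates on a range-valued attribute, called RR filtering ANN (RRANN). Specifically, given $n$ vectors in $\mathbb{R}^d$, each vector $v_i$ is associated with a numeric range $[l_i, r_i]$ that represents, for example, a price range or a time interval. An RRANN query $(v_q, l_q, r_q)$ seeks the $k$ vectors closest to $v_q$ among the vectors that satisfy an arbitrary RR predicate defined between the query range $[l_q, r_q]$ and the object range $[l_i, r_i]$. The RR predicate is left unspecified so that users can define their own conditions. It may be containment ($[l_i, r_i] \subseteq [l_q, r_q]$ or $[l_q, r_q] \subseteq [l_i, r_i]$), overlap ($l_i \le l_q \le r_i \le r_q$ or $l_q \le l_i \le r_q \le r_i$), or a disjunction of them. RRANN applies broadly to queries over price ranges or time intervals, and it generalizes existing variants of ANN search with range filters. However, existing dedicated approaches to those problems cannot support queries with arbitrary RR predicates. We therefore propose a new approach called the multi-segment tree graph. It handles arbitrary RR predicates efficiently by avoiding traversal through nodes that do not satisfy the predicate, while keeping index size and construction time equivalent to state-of-the-art methods for RFANN. Extensive experiments on real-world data demonstrate the effectiveness of our approach on RRANN queries, achieving up to 12.5x speedups at the same accuracy as the baselines. In addition, our approach achieves RFANN search performance comparable to, and IFANN and TSANN search performance notably better than, the respective state-of-the-art approaches. Our code is available at https://github.com/FanEDG/MSTG.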
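The RR predicates named in the abstract can be written down directly from their definitions. The following is a minimal Python sketch of the problem statement only: it encodes containment and overlap exactly as the abstract states them and answers an RRANN query by exhaustive pre-filtering with Euclidean distance. It is not the multi-segment tree graph from the paper, and every name in it (`object_in_query`, `query_in_object`, `overlap`, `any_of`, `rrann_brute_force`) is hypothetical, chosen here for illustration. The choice of Euclidean distance is also an assumption, since the abstract does not fix a metric.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Range = Tuple[float, float]
# A predicate takes (object_range, query_range) and returns True if the object qualifies.
Predicate = Callable[[Range, Range], bool]


def object_in_query(obj: Range, qry: Range) -> bool:
    """Containment: [l_i, r_i] is a subset of [l_q, r_q]."""
    return qry[0] <= obj[0] and obj[1] <= qry[1]


def query_in_object(obj: Range, qry: Range) -> bool:
    """Containment: [l_q, r_q] is a subset of [l_i, r_i]."""
    return obj[0] <= qry[0] and qry[1] <= obj[1]


def overlap(obj: Range, qry: Range) -> bool:
    """Overlap as stated in the abstract:
    l_i <= l_q <= r_i <= r_q  or  l_q <= l_i <= r_q <= r_i."""
    (l_i, r_i), (l_q, r_q) = obj, qry
    return (l_i <= l_q <= r_i <= r_q) or (l_q <= l_i <= r_q <= r_i)


def any_of(*preds: Predicate) -> Predicate:
    """Disjunction of RR predicates."""
    return lambda obj, qry: any(p(obj, qry) for p in preds)


def rrann_brute_force(
    vectors: Sequence[Sequence[float]],
    ranges: Sequence[Range],
    v_q: Sequence[float],
    q_range: Range,
    k: int,
    predicate: Predicate,
) -> List[int]:
    """Exact baseline: keep only objects whose range satisfies the predicate,
    then return the indices of the k nearest to v_q (Euclidean, assumed)."""
    candidates = (
        (math.dist(v, v_q), i)
        for i, (v, rng) in enumerate(zip(vectors, ranges))
        if predicate(rng, q_range)
    )
    return [i for _, i in heapq.nsmallest(k, candidates)]
```

A scan like this costs time linear in $n$ per query, which is the cost an index is meant to avoid. It is useful mainly as a ground-truth reference when measuring the accuracy of an approximate method.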

Related papers