Related papers: Exact Trajectory Similarity Search With N-tree: An Efficient Metric Index for kNN and Range Queries

Exact Trajectory Similarity Search With N-tree: An Efficient Metric Index for kNN and Range Queries

URL: http://arxiv.org/abs/2408.07650v1
Date: Wed, 14 Aug 2024 16:21:28 GMT
Title: Exact Trajectory Similarity Search With N-tree: An Efficient Metric Index for kNN and Range Queries
Authors: Ralf Hartmut Güting, Suvam Kumar Das, Fabio Valdés, Suprio Ray,
Abstract summary: Similarity search is the problem of finding in a collection of objects that are similar to a given query object. Trajectories are recorded movements of mobile objects such as vehicles, animals, public transportation, or parts of the human body. We propose a novel distance function called DistanceAvg to capture the similarity of such movements.
Score: 2.8059675184983424
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Similarity search is the problem of finding in a collection of objects those that are similar to a given query object. It is a fundamental problem in modern applications and the objects considered may be as diverse as locations in space, text documents, images, twitter messages, or trajectories of moving objects. In this paper we are motivated by the latter application. Trajectories are recorded movements of mobile objects such as vehicles, animals, public transportation, or parts of the human body. We propose a novel distance function called DistanceAvg to capture the similarity of such movements. To be practical, it is necessary to provide indexing for this distance measure. Fortunately we do not need to start from scratch. A generic and unifying approach is metric space, which organizes the set of objects solely by a distance (similarity) function with certain natural properties. Our function DistanceAvg is a metric. Although metric indexes have been studied for decades and many such structures are available, they do not offer the best performance with trajectories. In this paper we propose a new design, which outperforms the best existing indexes for kNN queries and is equally good for range queries. It is especially suitable for expensive distance functions as they occur in trajectory similarity search. In many applications, kNN queries are more practical than range queries as it may be difficult to determine an appropriate search radius. Our index provides exact result sets for the given distance function.

Related papers

Range Retrieval with Graph-Based Indices [0.294944680995069]
Retrieving points based on proximity in a high-dimensional vector space is a crucial step in information retrieval applications. We present a set of algorithms for range retrieval on graph-based vector indices. We find up to 100x improvement in query throughput over a naive baseline approach, with 5-10x improvement on average, and strong performance up to 100 million data points.
arXiv Detail & Related papers (2025-02-18T19:18:01Z)
Operational Advice for Dense and Sparse Retrievers: HNSW, Flat, or Inverted Indexes? [62.57689536630933]
We provide experimental results on the BEIR dataset using the open-source Lucene search library. Our results provide guidance for today's search practitioner in understanding the design space of dense and sparse retrievers.
arXiv Detail & Related papers (2024-09-10T12:46:23Z)
iRangeGraph: Improvising Range-dedicated Graphs for Range-filtering Nearest Neighbor Search [24.85572470526277]
Range-filtering approximate nearest neighbor (RFANN) search is attracting increasing attention in academia and industry. Recent study proposes to build $O(n2)$ dedicated graph-based indexes for all possible query ranges. We materialize graph-based indexes, called elemental graphs, for a moderate number of ranges.
arXiv Detail & Related papers (2024-09-04T09:41:52Z)
A general framework for distributed approximate similarity search with arbitrary distances [0.5030361857850012]
Similarity search is a central problem in domains such as information management and retrieval or data analysis. Many similarity search algorithms are designed or specifically adapted to metric distances. This paper presents GDASC, a general framework for distributed approximate similarity search that accepts arbitrary distances.
arXiv Detail & Related papers (2024-05-22T16:19:52Z)
Vector search with small radiuses [10.880913075221361]
This paper focuses on the common case where a hard decision needs to be taken depending on the vector retrieval results. We show that the value of a range search result can be modeled rigorously based on the query-to-vector distance. This yields a metric for range search, RSM, that is both principled and easy to compute without running an end-to-end evaluation.
arXiv Detail & Related papers (2024-03-16T00:34:25Z)
LIST: Learning to Index Spatio-Textual Data for Embedding based Spatial Keyword Queries [53.843367588870585]
List K-kNN spatial keyword queries (TkQs) return a list of objects based on a ranking function that considers both spatial and textual relevance. There are two key challenges in building an effective and efficient index, i.e., the absence of high-quality labels and the unbalanced results. We develop a novel pseudolabel generation technique to address the two challenges.
arXiv Detail & Related papers (2024-03-12T05:32:33Z)
Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection [2.3814052021083354]
This work presents an adaptive group testing framework for the range-based high dimensional near neighbor search problem. Our method efficiently marks each item in a database as neighbor or non-neighbor of a query point, based on a cosine distance threshold without exhaustive search. We show that, using softmax-based features, our method achieves a more than ten-fold speed-up over exhaustive search with no loss of accuracy.
arXiv Detail & Related papers (2023-11-05T06:12:03Z)
A Metaheuristic Algorithm for Large Maximum Weight Independent Set Problems [58.348679046591265]
Given a node-weighted graph, find a set of independent (mutually nonadjacent) nodes whose node-weight sum is maximum. Some of the graphs airsing in this application are large, having hundreds of thousands of nodes and hundreds of millions of edges. We develop a new local search algorithm, which is a metaheuristic in the greedy randomized adaptive search framework.
arXiv Detail & Related papers (2022-03-28T21:34:16Z)
BR-NS: an Archive-less Approach to Novelty Search [70.13948372218849]
We discuss an alternative approach to novelty estimation, dubbed Behavior Recognition based Novelty Search (BR-NS) BR-NS does not require an archive, makes no assumption on the metrics that can be defined in the behavior space and does not rely on nearest neighbours search. We conduct experiments to gain insight into its feasibility and dynamics as well as potential advantages over archive-based NS in terms of time complexity.
arXiv Detail & Related papers (2021-04-08T17:31:34Z)
Neural Architecture Search From Fr\'echet Task Distance [50.9995960884133]
We show how the distance between a target task and each task in a given set of baseline tasks can be used to reduce the neural architecture search space for the target task. The complexity reduction in search space for task-specific architectures is achieved by building on the optimized architectures for similar tasks instead of doing a full search without using this side information.
arXiv Detail & Related papers (2021-03-23T20:43:31Z)
The Case for Learned Spatial Indexes [62.88514422115702]
We use techniques proposed from a state-of-the art learned multi-dimensional index structure (namely, Flood) to answer spatial range queries. We show that (i) machine learned search within a partition is faster by 11.79% to 39.51% than binary search when using filtering on one dimension. We also refine using machine learned indexes is 1.23x to 1.83x times faster than closest competitor which filters on two dimensions.
arXiv Detail & Related papers (2020-08-24T12:09:55Z)
A Practical Index Structure Supporting Fr\'echet Proximity Queries Among Trajectories [1.9335262420787858]
We present a scalable approach for range and $k$ nearest neighbor queries under computationally expensive metrics. Based on clustering for metric indexes, we obtain a dynamic tree structure whose size is linear in the number of trajectories. We analyze the efficiency and effectiveness of our methods with extensive experiments on diverse synthetic and real-world data sets.
arXiv Detail & Related papers (2020-05-28T04:12:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.