Mathematical Models for Local Sensing Hashes
- URL: http://arxiv.org/abs/2111.08344v1
- Date: Tue, 16 Nov 2021 10:40:55 GMT
- Title: Mathematical Models for Local Sensing Hashes
- Authors: Li Wang, Lilon Wangner
- Abstract summary: We show that approximated index structures offer a good opportunity to accelerate the neighbor search for clustering and outlier detection.
We indicate directions to mathematically model the properties of local sensing hashes.
- Score: 7.400475825464313
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: As data volumes continue to grow, searches in data are becoming increasingly
time-consuming. Classical index structures for neighbor search are no longer
sustainable due to the "curse of dimensionality". Instead, approximated index
structures offer a good opportunity to significantly accelerate the neighbor
search for clustering and outlier detection while keeping the error rate in the
algorithms' results as low as possible. Local sensing hashes (locality-sensitive
hashing, LSH) are one such structure. We indicate directions for mathematically
modeling their properties.
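The abstract does not fix a concrete construction, but the following Python sketch illustrates the kind of index it aims to model: a random-hyperplane (SimHash-style) locality sensitive hash that buckets points so neighbor candidates can be retrieved without a full scan. All names and parameters here (build_lsh_index, query, n_planes) are illustrative assumptions, not the authors' model.

```python
import numpy as np

# Minimal sketch of a random-hyperplane LSH index (illustrative only, not the
# paper's model). Points whose projections have the same sign pattern land in
# the same bucket, so near neighbors tend to share a bucket.

rng = np.random.default_rng(0)

def build_lsh_index(points, n_planes=16):
    """Hash every point to a bucket keyed by the signs of random projections."""
    planes = rng.normal(size=(n_planes, points.shape[1]))  # random hyperplanes
    bits = (points @ planes.T > 0).astype(np.uint8)        # one bit per plane
    buckets = {}
    for idx, key in enumerate(map(bytes, bits)):
        buckets.setdefault(key, []).append(idx)
    return planes, buckets

def query(q, planes, buckets, points, k=5):
    """Return up to k approximate neighbors drawn from the query's bucket."""
    key = bytes((q @ planes.T > 0).astype(np.uint8))
    candidates = buckets.get(key, [])
    # Exact distances are only computed inside the (small) candidate bucket.
    dists = sorted((np.linalg.norm(points[i] - q), i) for i in candidates)
    return [i for _, i in dists[:k]]

data = rng.normal(size=(10_000, 64))
planes, buckets = build_lsh_index(data)
print(query(data[0], planes, buckets, data))  # the query point itself should rank first
```

Increasing n_planes makes buckets smaller (fewer false candidates) at the cost of missing more true neighbors; this recall/precision trade-off is the kind of property a mathematical model of LSH would need to capture.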
Related papers
- Discovering Data Structures: Nearest Neighbor Search and Beyond [18.774836778996544]
We propose a general framework for end-to-end learning of data structures.
Our framework adapts to the underlying data distribution and provides fine-grained control over query and space complexity.
We first apply this framework to the problem of nearest neighbor search.
arXiv Detail & Related papers (2024-11-05T16:50:54Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Score matching enables causal discovery of nonlinear additive noise models [63.93669924730725]
We show how to design a new generation of scalable causal discovery methods.
We propose a new efficient method for approximating the score's Jacobian, enabling recovery of the causal graph.
arXiv Detail & Related papers (2022-03-08T21:34:46Z)
- Probabilistic DAG Search [29.47649645431227]
We develop a probabilistic framework to exploit a search space's latent structure and share information across the search tree.
We empirically find our algorithm to compare favorably to existing non-probabilistic alternatives in Tic-Tac-Toe and a feature selection application.
arXiv Detail & Related papers (2021-06-16T11:35:19Z)
- Towards a Model for LSH [7.400475825464313]
Clustering and outlier detection algorithms are becoming increasingly time-consuming.
We show how approximated index structures offer a good opportunity to accelerate the neighbor search for clustering and outlier detection.
arXiv Detail & Related papers (2021-05-11T15:39:55Z)
- BR-NS: an Archive-less Approach to Novelty Search [70.13948372218849]
We discuss an alternative approach to novelty estimation, dubbed Behavior Recognition based Novelty Search (BR-NS).
BR-NS does not require an archive, makes no assumptions about the metrics that can be defined in the behavior space, and does not rely on nearest neighbour search.
We conduct experiments to gain insight into its feasibility and dynamics as well as potential advantages over archive-based NS in terms of time complexity.
arXiv Detail & Related papers (2021-04-08T17:31:34Z)
- Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models.
We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z)
- The Case for Learned Spatial Indexes [62.88514422115702]
We use techniques proposed in a state-of-the-art learned multi-dimensional index structure (namely, Flood) to answer spatial range queries.
We show that (i) machine-learned search within a partition is 11.79% to 39.51% faster than binary search when filtering on one dimension, and (ii) refining using machine-learned indexes is 1.23x to 1.83x faster than the closest competitor, which filters on two dimensions.
arXiv Detail & Related papers (2020-08-24T12:09:55Z)
- A Practical Index Structure Supporting Fréchet Proximity Queries Among Trajectories [1.9335262420787858]
We present a scalable approach for range and $k$ nearest neighbor queries under computationally expensive metrics.
Based on clustering for metric indexes, we obtain a dynamic tree structure whose size is linear in the number of trajectories.
We analyze the efficiency and effectiveness of our methods with extensive experiments on diverse synthetic and real-world data sets.
arXiv Detail & Related papers (2020-05-28T04:12:43Z)
- Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design and has been widely applied in economic markets, labor markets, etc.
Existing exact and approximate algorithms usually fail at such scales because they require either intolerable running time or excessive computational resources.
We propose NeuSearcher, which leverages knowledge learned from previously solved instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z)
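For readers unfamiliar with the setting of this last entry, here is a minimal greedy sketch of weighted bipartite b-matching. It is not the NeuSearcher method; the edge weights, capacities, and names are assumptions made purely for illustration.

```python
# Greedy heuristic for weighted bipartite b-matching (illustrative sketch only,
# not NeuSearcher). Each node may be matched at most b times, given by `capacity`.

def greedy_b_matching(edges, capacity):
    """edges: iterable of (weight, left_node, right_node); capacity: node -> b.
    Picks edges in decreasing order of weight while both endpoints still have
    spare capacity; returns the chosen edges and their total weight."""
    remaining = dict(capacity)
    chosen, total = [], 0.0
    for weight, u, v in sorted(edges, reverse=True):
        if remaining.get(u, 0) > 0 and remaining.get(v, 0) > 0:
            chosen.append((u, v))
            total += weight
            remaining[u] -= 1
            remaining[v] -= 1
    return chosen, total

# Toy instance: advertisers (a*) matched to ad slots (s*) with per-node budgets.
edges = [(3.0, "a1", "s1"), (2.5, "a2", "s1"), (2.0, "a1", "s2"), (1.0, "a2", "s2")]
caps = {"a1": 1, "a2": 2, "s1": 1, "s2": 2}
print(greedy_b_matching(edges, caps))  # -> ([('a1', 's1'), ('a2', 's2')], 4.0)
```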
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.