Related papers: Towards a Model for LSH

Towards a Model for LSH

URL: http://arxiv.org/abs/2105.05130v1
Date: Tue, 11 May 2021 15:39:55 GMT
Title: Towards a Model for LSH
Authors: Li Wang
Abstract summary: Clustering and outlier detection algorithms are becoming increasingly time-consuming. We show how approximated index structures offer a good opportunity to accelerate the neighbor search for clustering and outlier detection.
Score: 7.400475825464313
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: As data volumes continue to grow, clustering and outlier detection algorithms are becoming increasingly time-consuming. Classical index structures for neighbor search are no longer sustainable due to the "curse of dimensionality". Instead, approximated index structures offer a good opportunity to significantly accelerate the neighbor search for clustering and outlier detection and to have the lowest possible error rate in the results of the algorithms. Locality-sensitive hashing is one of those. We indicate directions to model the properties of LSH.

Related papers

Learning to Guide Local Search for MPE Inference in Probabilistic Graphical Models [7.287294240824019]
Most Probable Explanation (MPE) inference in Probabilistic Graphical Models (PGMs) is a fundamental yet computationally challenging problem.<n>We propose a neural amortization framework for improving local search in repeated-Query regime.<n>We provide theoretical intuition linking distance-reducing move selection to improved promise during neighbor selection.
arXiv Detail & Related papers (2026-02-01T22:43:28Z)
Embracing Discrete Search: A Reasonable Approach to Causal Structure Learning [4.838245844232486]
We present FLOP, a score-based causal discovery algorithm for linear models.<n>It pairs fast parent selection with iterative Cholesky-based score updates, cutting run-times over prior algorithms.<n>The resulting structures are highly accurate across benchmarks, with near-perfect recovery in standard settings.
arXiv Detail & Related papers (2025-10-06T16:04:53Z)
Score-matching-based Structure Learning for Temporal Data on Networks [17.166362605356074]
Causal discovery is a crucial initial step in establishing causality from empirical data and background knowledge. Current score-matching-based algorithms are primarily designed to analyze independent and identically distributed (i.i.d.) data. We have developed a new parent-finding subroutine for leaf nodes in DAGs, significantly accelerating the most time-consuming part of the process: the pruning step.
arXiv Detail & Related papers (2024-12-10T12:36:35Z)
SOAR: Improved Indexing for Approximate Nearest Neighbor Search [30.752720306189342]
Spilling with Orthogonality-Amplified Residuals (SOAR) is a novel data indexing technique for approximate nearest neighbor (ANN) search.
arXiv Detail & Related papers (2024-03-31T19:09:09Z)
A Modular Spatial Clustering Algorithm with Noise Specification [0.0]
Bacteria-Farm algorithm is inspired by the growth of bacteria in closed experimental farms. In contrast with other clustering algorithms, our algorithm also has a provision to specify the amount of noise to be excluded during clustering.
arXiv Detail & Related papers (2023-09-18T18:05:06Z)
A density peaks clustering algorithm with sparse search and K-d tree [16.141611031128427]
Density peaks clustering algorithm with sparse search and K-d tree is developed to solve this problem. Experiments are carried out on datasets with different distribution characteristics, by comparing with other five typical clustering algorithms.
arXiv Detail & Related papers (2022-03-02T09:29:40Z)
Mathematical Models for Local Sensing Hashes [7.400475825464313]
We show that approximated index structures offer a good opportunity to accelerate the neighbor search for clustering and outlier detection. We indicate directions to mathematically model the properties of local sensing hashes.
arXiv Detail & Related papers (2021-11-16T10:40:55Z)
Correlation Clustering Reconstruction in Semi-Adversarial Models [70.11015369368272]
Correlation Clustering is an important clustering problem with many applications. We study the reconstruction version of this problem in which one is seeking to reconstruct a latent clustering corrupted by random noise and adversarial modifications.
arXiv Detail & Related papers (2021-08-10T14:46:17Z)
Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing [49.73889315176884]
We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions. We build the connections between the theory of approximate maximum inner product search and the regret analysis of reinforcement learning.
arXiv Detail & Related papers (2021-05-18T05:23:53Z)
BR-NS: an Archive-less Approach to Novelty Search [70.13948372218849]
We discuss an alternative approach to novelty estimation, dubbed Behavior Recognition based Novelty Search (BR-NS) BR-NS does not require an archive, makes no assumption on the metrics that can be defined in the behavior space and does not rely on nearest neighbours search. We conduct experiments to gain insight into its feasibility and dynamics as well as potential advantages over archive-based NS in terms of time complexity.
arXiv Detail & Related papers (2021-04-08T17:31:34Z)
Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models. We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z)
Cluster-and-Conquer: When Randomness Meets Graph Locality [0.0]
Some of the most efficient KNN graph algorithms are incremental and local. Cluster-and-Conquer boosts the starting configuration of greedy algorithms.
arXiv Detail & Related papers (2020-10-22T07:31:12Z)
CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named textbfComprehensive stextbfImilarity textbfMining and ctextbfOnsistency leartextbfNing (CIMON) First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive both disturb-invariant and discriminative hash codes.
arXiv Detail & Related papers (2020-10-15T14:47:14Z)
Reinforcing Short-Length Hashing [61.75883795807109]
Existing methods have poor performance in retrieval using an extremely short-length hash code. In this study, we propose a novel reinforcing short-length hashing (RSLH) In this proposed RSLH, mutual reconstruction between the hash representation and semantic labels is performed to preserve the semantic information. Experiments on three large-scale image benchmarks demonstrate the superior performance of RSLH under various short-length hashing scenarios.
arXiv Detail & Related papers (2020-04-24T02:23:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.