Experimental Analysis of Machine Learning Techniques for Finding Search
Radius in Locality Sensitive Hashing
- URL: http://arxiv.org/abs/2211.09093v1
- Date: Wed, 16 Nov 2022 18:19:10 GMT
- Title: Experimental Analysis of Machine Learning Techniques for Finding Search
Radius in Locality Sensitive Hashing
- Authors: Omid Jafari and Parth Nagarkar
- Abstract summary: Locality Sensitive Hashing (LSH) is one of the most popular approximate nearest neighbor search techniques for high-dimensional spaces.
An improved LSH-based index structure, called radius-optimized Locality Sensitive Hashing (roLSH) has been proposed to utilize Machine Learning.
- Score: 0.9137554315375919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding similar data in high-dimensional spaces is one of the important tasks
in multimedia applications. Approaches introduced to find exact searching
techniques often use tree-based index structures which are known to suffer from
the curse of the dimensionality problem that limits their performance.
Approximate searching techniques prefer performance over accuracy and they
return good enough results while achieving a better performance. Locality
Sensitive Hashing (LSH) is one of the most popular approximate nearest neighbor
search techniques for high-dimensional spaces. One of the most time-consuming
processes in LSH is to find the neighboring points in the projected spaces. An
improved LSH-based index structure, called radius-optimized Locality Sensitive
Hashing (roLSH) has been proposed to utilize Machine Learning and efficiently
find these neighboring points; thus, further improve the overall performance of
LSH. In this paper, we extend roLSH by experimentally studying the effect of
different types of famous Machine Learning techniques on overall performance.
We compare ten regression techniques on four real-world datasets and show that
Neural Network-based techniques are the best fit to be used in roLSH as their
accuracy and performance trade-off are the best compared to the other
techniques.
Related papers
- Early Exit Strategies for Approximate k-NN Search in Dense Retrieval [10.48678957367324]
We build upon state-of-the-art for early exit A-kNN and propose an unsupervised method based on the notion of patience.
We show that our techniques improve the A-kNN efficiency with up to 5x speedups while achieving negligible effectiveness losses.
arXiv Detail & Related papers (2024-08-09T10:17:07Z) - Retrieval with Learned Similarities [2.729516456192901]
State-of-the-art retrieval algorithms have migrated to learned similarities.
We show that Mixture-of-Logits (MoL) can be realized empirically to achieve superior performance on diverse retrieval scenarios.
arXiv Detail & Related papers (2024-07-22T08:19:34Z) - ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate
Nearest Neighbor Search Algorithms [5.478671305092084]
We introduce ParlayANN, a library of deterministic and parallel graph-based approximate nearest neighbor search algorithms.
We develop novel parallel implementations for four state-of-the-art graph-based ANNS algorithms that scale to billion-scale datasets.
arXiv Detail & Related papers (2023-05-07T19:28:23Z) - Constructing Tree-based Index for Efficient and Effective Dense
Retrieval [26.706985694158384]
JTR stands for Joint optimization of TRee-based index and query encoding.
We design a new unified contrastive learning loss to train tree-based index and query encoder in an end-to-end manner.
Experimental results show that JTR achieves better retrieval performance while retaining high system efficiency.
arXiv Detail & Related papers (2023-04-24T09:25:39Z) - MLGWSC-1: The first Machine Learning Gravitational-Wave Search Mock Data
Challenge [110.7678032481059]
We present the results of the first Machine Learning Gravitational-Wave Search Mock Data Challenge (MLGWSC-1).
For this challenge, participating groups had to identify gravitational-wave signals from binary black hole mergers of increasing complexity and duration embedded in progressively more realistic noise.
Our results show that current machine learning search algorithms may already be sensitive enough in limited parameter regions to be useful for some production settings.
arXiv Detail & Related papers (2022-09-22T16:44:59Z) - Shapley-NAS: Discovering Operation Contribution for Neural Architecture
Search [96.20505710087392]
We propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search.
We show that our method outperforms the state-of-the-art methods by a considerable margin with light search cost.
arXiv Detail & Related papers (2022-06-20T14:41:49Z) - ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency.
This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, to search without enforcing the above approximation.
In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS, where the performance of DARTS collapses due to its known instability issue.
arXiv Detail & Related papers (2021-10-10T09:35:15Z) - Web image search engine based on LSH index and CNN Resnet50 [0.0]
We adopt the Locality Sensitive Hashing (LSH) index to implement a CBIR system that allows us to perform fast similarity search on deep features.
Specifically, we exploit transfer learning techniques to extract deep features from images.
We then try out several fully connected deep neural networks, built on top of both of the previously mentioned CNNs.
arXiv Detail & Related papers (2021-08-20T14:43:41Z) - Learning to Hash Robustly, with Guarantees [79.68057056103014]
In this paper, we design an NNS algorithm for the Hamming space that has worst-case guarantees essentially matching that of theoretical algorithms.
We evaluate the algorithm's ability to optimize for a given dataset both theoretically and practically.
Our algorithm has a 1.8x and 2.1x better recall on the worst-performing queries to the MNIST and ImageNet datasets.
arXiv Detail & Related papers (2021-08-11T20:21:30Z) - ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse
Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
arXiv Detail & Related papers (2020-10-13T04:34:24Z) - Off-Policy Reinforcement Learning for Efficient and Effective GAN
Architecture Search [50.40004966087121]
We introduce a new reinforcement learning based neural architecture search (NAS) methodology for generative adversarial network (GAN) architecture search.
The key idea is to formulate the GAN architecture search problem as a Markov decision process (MDP) for smoother architecture sampling.
We exploit an off-policy GAN architecture search algorithm that makes efficient use of the samples generated by previous policies.
arXiv Detail & Related papers (2020-07-17T18:29:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.