Simultaneously Learning Robust Audio Embeddings and Balanced Hash Codes for Query-by-Example
- URL: http://arxiv.org/abs/2211.11060v1
- Date: Sun, 20 Nov 2022 19:22:44 GMT
- Title: Simultaneously Learning Robust Audio Embeddings and Balanced Hash Codes for Query-by-Example
- Authors: Anup Singh, Kris Demuynck, Vipul Arora
- Abstract summary: State-of-the-art systems use deep learning to generate compact audio fingerprints.
These systems deploy indexing methods, which quantize fingerprints to hash codes in an unsupervised manner to expedite the search.
We propose a self-supervised learning framework to compute fingerprints and balanced hash codes in an end-to-end manner.
- Score: 8.585546027122808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Audio fingerprinting systems must efficiently and robustly identify query
snippets in an extensive database. To this end, state-of-the-art systems use
deep learning to generate compact audio fingerprints. These systems deploy
indexing methods, which quantize fingerprints to hash codes in an unsupervised
manner to expedite the search. However, these methods generate imbalanced hash
codes, leading to their suboptimal performance. Therefore, we propose a
self-supervised learning framework to compute fingerprints and balanced hash
codes in an end-to-end manner to achieve both fast and accurate retrieval
performance. We model hash codes as a balanced clustering process, which we
regard as an instance of the optimal transport problem. Experimental results
indicate that the proposed approach improves retrieval efficiency while
preserving high accuracy, particularly at high distortion levels, compared to
the competing methods. Moreover, our system is efficient and scalable in
computational load and memory storage.
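The abstract models hash-code learning as a balanced clustering process cast as an optimal transport problem. As a hedged illustration of that idea (not the authors' implementation), the sketch below uses plain Sinkhorn-Knopp iterations to compute a balanced soft assignment of embeddings to hash buckets; the function name, sizes, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def balanced_assignment(scores, n_iters=100, eps=0.5):
    """Sinkhorn-Knopp iterations: turn sample-to-bucket similarity scores
    (n_samples x n_buckets) into a soft assignment whose column marginals
    are uniform, i.e. every hash bucket receives ~n/k of the total mass."""
    n, k = scores.shape
    Q = np.exp((scores - scores.max()) / eps)  # stabilized Gibbs kernel
    for _ in range(n_iters):
        Q /= Q.sum(axis=1, keepdims=True)  # rows: each sample carries mass 1/n
        Q /= n
        Q /= Q.sum(axis=0, keepdims=True)  # cols: each bucket holds mass 1/k
        Q /= k
    return Q * n  # rescale so each row is a distribution over buckets

rng = np.random.default_rng(0)
scores = rng.normal(size=(128, 8))  # toy similarities: 128 samples, 8 buckets
Q = balanced_assignment(scores)
# by construction, every bucket receives 128/8 = 16 units of assignment mass
```

Alternating the row and column normalizations is what enforces balance: the final column step guarantees the buckets are filled equally, while the row steps keep each sample's assignment a proper distribution.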
Related papers
- Noise-Robust Dense Retrieval via Contrastive Alignment Post Training [89.29256833403167]
Contrastive Alignment POst Training (CAPOT) is a highly efficient finetuning method that improves model robustness without requiring index regeneration.
CAPOT enables robust retrieval by freezing the document encoder while the query encoder learns to align noisy queries with their unaltered root.
We evaluate CAPOT on noisy variants of MSMARCO, Natural Questions, and Trivia QA passage retrieval, finding that CAPOT has a similar impact as data augmentation with none of its overhead.
arXiv Detail & Related papers (2023-04-06T22:16:53Z)
- Cascading Hierarchical Networks with Multi-task Balanced Loss for Fine-grained Hashing [1.6244541005112747]
Fine-grained hashing is more challenging than traditional hashing problems.
We propose a cascaded network to learn compact and highly semantic hash codes.
We also propose a novel approach to jointly balance the losses of the multi-task objective.
arXiv Detail & Related papers (2023-03-20T17:08:48Z)
- Unified Functional Hashing in Automatic Machine Learning [58.77232199682271]
We show that large efficiency gains can be obtained by employing a fast unified functional hash.
Our hash is "functional" in that it identifies equivalent candidates even if they were represented or coded differently.
We show dramatic improvements on multiple AutoML domains, including neural architecture search and algorithm discovery.
arXiv Detail & Related papers (2023-02-10T18:50:37Z)
- Representation Learning for Efficient and Effective Similarity Search and Recommendation [6.280255585012339]
This thesis makes contributions to representation learning that improve the effectiveness of hash codes through more expressive representations and a more effective similarity measure.
The contributions are empirically validated on several tasks related to similarity search and recommendation.
arXiv Detail & Related papers (2021-09-04T08:19:01Z)
- CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON).
First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive hash codes that are both perturbation-invariant and discriminative.
arXiv Detail & Related papers (2020-10-15T14:47:14Z)
- Deep Hashing with Hash-Consistent Large Margin Proxy Embeddings [65.36757931982469]
Image hash codes are produced by binarizing embeddings of convolutional neural networks (CNN) trained for either classification or retrieval.
The use of a fixed set of proxies (the weights of the CNN classification layer) is proposed to eliminate the ambiguity of binarization.
The resulting hash-consistent large margin (HCLM) proxies are shown to encourage saturation of hashing units, thus guaranteeing a small binarization error.
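The HCLM summary above concerns binarizing CNN embeddings with a small binarization error. The following is a generic sketch of the surrounding pipeline, sign-thresholding embeddings into hash codes and ranking by Hamming distance, not the HCLM method itself; function names and sizes are illustrative assumptions.

```python
import numpy as np

def binarize(embeddings):
    """Sign-threshold real-valued embeddings into {0, 1} hash codes.
    When hashing units saturate (values stay far from 0), the
    binarization error ||h - sign(h)|| remains small."""
    return (embeddings > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists), dists

rng = np.random.default_rng(1)
db = binarize(rng.normal(size=(1000, 64)))  # 1000 items, 64-bit codes
query = db[42]                              # query identical to item 42
order, dists = hamming_rank(query, db)      # item 42 has distance 0
```

Hamming distances over packed binary codes are what make the retrieval step cheap: comparing 64-bit codes needs only XOR and popcount, regardless of the original embedding dimension.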
arXiv Detail & Related papers (2020-07-27T23:47:43Z)
- Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval.
We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing.
This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
- Reinforcing Short-Length Hashing [61.75883795807109]
Existing methods have poor performance in retrieval using an extremely short-length hash code.
In this study, we propose a novel reinforcing short-length hashing (RSLH) method.
In this proposed RSLH, mutual reconstruction between the hash representation and semantic labels is performed to preserve the semantic information.
Experiments on three large-scale image benchmarks demonstrate the superior performance of RSLH under various short-length hashing scenarios.
arXiv Detail & Related papers (2020-04-24T02:23:52Z)
- Image Hashing by Minimizing Discrete Component-wise Wasserstein Distance [12.968141477410597]
Adversarial autoencoders are shown to be able to implicitly learn a robust, locality-preserving hash function that generates balanced and high-quality hash codes.
However, existing adversarial hashing methods are too inefficient for large-scale image retrieval applications.
We propose a new adversarial-autoencoder hashing approach that has a much lower sample requirement and computational cost.
arXiv Detail & Related papers (2020-02-29T00:22:53Z)
- Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation [19.72987718461291]
We propose an adaptive boosting approach to learning locality sensitive hash codes, which represent audio spectra efficiently.
We use the learned hash codes for single-channel speech denoising tasks as an alternative to a complex machine learning model.
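For contrast with the learned codes described above, here is a minimal sketch of the classic unlearned baseline, random-hyperplane LSH for cosine similarity; the names, dimensions, and perturbation size are illustrative assumptions.

```python
import numpy as np

def lsh_hash(vectors, planes):
    """Classic random-hyperplane LSH: one bit per hyperplane, set by the
    side of the plane each vector falls on. Vectors with high cosine
    similarity agree on most bits."""
    return (vectors @ planes.T > 0).astype(np.uint8)

rng = np.random.default_rng(2)
planes = rng.normal(size=(32, 128))       # 32 hyperplanes -> 32-bit codes
x = rng.normal(size=128)
x_near = x + 0.01 * rng.normal(size=128)  # small perturbation of x
x_far = rng.normal(size=128)              # unrelated vector

code_x, code_near, code_far = lsh_hash(np.stack([x, x_near, x_far]), planes)
near = np.count_nonzero(code_x != code_near)  # few bits flip
far = np.count_nonzero(code_x != code_far)    # ~half the bits differ
```

The probability that one bit differs equals the angle between the vectors divided by pi, which is what makes Hamming distance between codes a proxy for cosine similarity; learned approaches such as the boosted variant above aim to make those bits more discriminative for the target task.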
arXiv Detail & Related papers (2020-02-14T20:10:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.