Recall@k Surrogate Loss with Large Batches and Similarity Mixup
- URL: http://arxiv.org/abs/2108.11179v1
- Date: Wed, 25 Aug 2021 11:09:11 GMT
- Title: Recall@k Surrogate Loss with Large Batches and Similarity Mixup
- Authors: Yash Patel, Giorgos Tolias, Jiri Matas
- Abstract summary: Direct optimization of an evaluation metric by gradient descent is not possible when the metric is non-differentiable.
In this work, a differentiable surrogate loss for recall is proposed.
The proposed method achieves state-of-the-art results in several image retrieval benchmarks.
- Score: 62.67458021725227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Direct optimization of an evaluation metric by gradient descent
is not possible when the metric is non-differentiable, as is the case for
recall in retrieval. In this work, a differentiable surrogate loss for
recall is proposed. Using an implementation that sidesteps the hardware
constraints of GPU memory, the method trains with a very large batch size,
which is essential for metrics computed over the entire retrieval database.
It is assisted by an efficient mixup approach that operates on pairwise
scalar similarities and virtually increases the batch size further. When
used for deep metric learning, the proposed method achieves state-of-the-art
results on several image retrieval benchmarks. For instance-level
recognition, the method outperforms similar approaches that train using an
approximation of average precision. The implementation will be made public.
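A minimal PyTorch sketch of the two ideas in the abstract, assuming cosine similarities within a single batch; the function names, the temperatures tau and tau_k, and the simplified label handling are illustrative choices, not the authors' released implementation (which additionally chunks the computation to sidestep GPU memory limits):

```python
import torch

def recall_at_k_surrogate(emb, labels, k=4, tau=0.01, tau_k=1.0):
    """Sigmoid-relaxed Recall@k over one batch.

    emb:    (B, D) L2-normalised embeddings; every row acts as a query.
    labels: (B,) integer class ids; same-label rows are positives.
    """
    sim = emb @ emb.t()                                   # pairwise cosine similarity
    B = sim.size(0)
    eye = torch.eye(B, dtype=torch.bool, device=emb.device)
    same = labels[:, None] == labels[None, :]
    pos, neg = same & ~eye, ~same                         # self is neither pos nor neg

    # Smooth rank of candidate j for query i among i's negatives:
    # rank(i, j) = 1 + sum_n sigmoid((s_in - s_ij) / tau)
    diff = sim[:, None, :] - sim[:, :, None]              # (B, B, B): s_in - s_ij
    rank = 1.0 + (torch.sigmoid(diff / tau) * neg[:, None, :]).sum(-1)

    # Relaxed indicator [rank <= k], averaged over each query's positives.
    hit = torch.sigmoid((k - rank) / tau_k)
    recall = (hit * pos).sum() / pos.sum().clamp(min=1)
    return 1.0 - recall                                   # loss to minimise

def similarity_mixup(sim, lam=0.5):
    """SiMix-flavoured virtual queries: because a dot product is linear in its
    arguments, mixing two embeddings equals mixing their rows of the
    similarity matrix, so each virtual query costs O(B) and needs no extra
    backbone forward pass. (Label bookkeeping is simplified here; mixing
    same-class rows keeps the label unambiguous.)"""
    perm = torch.randperm(sim.size(0))
    return lam * sim + (1.0 - lam) * sim[perm], perm

# usage sketch
emb = torch.nn.functional.normalize(torch.randn(32, 128), dim=1).requires_grad_()
labels = torch.randint(0, 8, (32,))
recall_at_k_surrogate(emb, labels).backward()
```

As tau shrinks, the smooth rank approaches the true rank, at the cost of sparser gradients; the batch here stands in for the retrieval database, which is why large batches matter.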
Related papers
- Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss [59.835032408496545]
We propose a tile-based strategy that partitions the contrastive loss calculation into arbitrarily small blocks.
We also introduce a multi-level tiling strategy to leverage the hierarchical structure of distributed systems.
Compared to SOTA memory-efficient solutions, it achieves a two-order-of-magnitude reduction in memory while maintaining comparable speed.
arXiv Detail & Related papers (2024-10-22T17:59:30Z) - Efficient Retrieval with Learned Similarities [2.729516456192901]
- Efficient Retrieval with Learned Similarities [2.729516456192901]
State-of-the-art retrieval algorithms have migrated to learned similarities.
We show that Mixture-of-Logits (MoL) is a universal approximator that can express all learned similarity functions.
MoL sets new state-of-the-art results on recommendation retrieval tasks, and our approximate top-k retrieval with learned similarities reduces latency by up to two orders of magnitude relative to baselines.
arXiv Detail & Related papers (2024-07-22T08:19:34Z) - Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
- Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful approach to large-scale multimedia retrieval.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) to address these issues.
Our ASCMH outperforms the state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z) - Improving Point Cloud Based Place Recognition with Ranking-based Loss
and Large Batch Training [1.116812194101501]
The paper presents a simple and effective learning-based method for computing a discriminative 3D point cloud descriptor.
We employ recent advances in image retrieval and propose a modified version of a loss function based on a differentiable average precision approximation.
arXiv Detail & Related papers (2022-03-02T09:29:28Z) - Relational Surrogate Loss Learning [41.61184221367546]
- Relational Surrogate Loss Learning [41.61184221367546]
This paper revisits surrogate loss learning, where a deep neural network is employed to approximate an evaluation metric.
We show that it suffices for the surrogate loss to preserve the ordering of models that the metric induces, rather than to approximate the metric's values.
Our method is much easier to optimize and enjoys significant efficiency and performance gains.
arXiv Detail & Related papers (2022-02-26T17:32:57Z) - Learning to Perform Downlink Channel Estimation in Massive MIMO Systems [72.76968022465469]
- Learning to Perform Downlink Channel Estimation in Massive MIMO Systems [72.76968022465469]
We study downlink (DL) channel estimation in a massive multiple-input multiple-output (MIMO) system.
A common approach is to use the mean value as the estimate, motivated by channel hardening.
We propose two novel estimation methods.
arXiv Detail & Related papers (2021-09-06T13:42:32Z) - A Unified Framework of Surrogate Loss by Refactoring and Interpolation [65.60014616444623]
We introduce UniLoss, a unified framework to generate surrogate losses for training deep networks with gradient descent.
We validate the effectiveness of UniLoss on three tasks and four datasets.
arXiv Detail & Related papers (2020-07-27T21:16:51Z) - Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and
Self-Control Gradient Estimator [62.26981903551382]
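In the refactor-and-interpolate spirit of UniLoss, a toy sketch for classification accuracy: decompose the metric into binary logit comparisons, then interpolate each comparison with a sigmoid. This instantiation is an assumption for illustration; UniLoss generates such surrogates systematically across tasks.

```python
import torch

def uniloss_style_accuracy(logits, targets, tau=1.0):
    """Refactor 0-1 accuracy into comparisons (true-class logit vs each other
    logit) and relax each comparison with a sigmoid, yielding a
    differentiable stand-in for the metric."""
    true = logits.gather(1, targets[:, None])        # (B, 1) true-class logit
    wins = torch.sigmoid((true - logits) / tau)      # relaxed [z_y > z_j]
    mask = torch.ones_like(logits).scatter_(1, targets[:, None], 0.0)
    # a prediction is correct iff the true class wins every comparison;
    # take the product of the relaxed wins (in log space for stability)
    correct = torch.exp((wins.clamp_min(1e-8).log() * mask).sum(dim=1))
    return 1.0 - correct.mean()
```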
- Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval.
We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing.
This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site makes no guarantees about the quality of the information presented and is not responsible for any consequences arising from its use.