Robust and Decomposable Average Precision for Image Retrieval
- URL: http://arxiv.org/abs/2110.01445v1
- Date: Fri, 1 Oct 2021 12:00:43 GMT
- Title: Robust and Decomposable Average Precision for Image Retrieval
- Authors: Elias Ramzi (CNAM, CEDRIC - VERTIGO), Nicolas Thome (CNAM, CEDRIC -
VERTIGO), Clément Rambour (CNAM, CEDRIC - VERTIGO), Nicolas Audebert (CNAM,
CEDRIC - VERTIGO), Xavier Bitot
- Abstract summary: In image retrieval, standard evaluation metrics rely on score ranking, e.g. average precision (AP).
In this paper, we introduce a method for robust and decomposable average precision (ROADMAP).
We address two major challenges for end-to-end training of deep neural networks with AP: non-differentiability and non-decomposability.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In image retrieval, standard evaluation metrics rely on score ranking, e.g.
average precision (AP). In this paper, we introduce a method for robust and
decomposable average precision (ROADMAP) addressing two major challenges for
end-to-end training of deep neural networks with AP: non-differentiability and
non-decomposability. Firstly, we propose a new differentiable approximation of
the rank function, which provides an upper bound of the AP loss and ensures
robust training. Secondly, we design a simple yet effective loss function to
reduce the decomposability gap between the AP in the whole training set and its
averaged batch approximation, for which we provide theoretical guarantees.
Extensive experiments conducted on three image retrieval datasets show that
ROADMAP outperforms several recent AP approximation methods and highlight the
importance of our two contributions. Finally, using ROADMAP for training deep
models yields very good performances, outperforming state-of-the-art results on
the three datasets.
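The two challenges named in the abstract can be illustrated with a minimal sketch. It writes AP through ranks, swaps the hard step function for a sigmoid (the relaxation popularized by Smooth-AP-style methods, used here only as a stand-in, not ROADMAP's own rank approximation, which the abstract does not detail), and compares AP on a full set with the average of per-batch APs. The function names, the temperature `tau`, and the toy scores are all assumptions made for illustration.

```python
import math

def heaviside(x):
    """Hard step: 1.0 if x > 0, else 0.0. Its gradient is zero almost everywhere."""
    return 1.0 if x > 0 else 0.0

def sigmoid_step(x, tau=0.01):
    """Smooth stand-in for the step; tau is an illustrative temperature (assumption)."""
    z = max(-60.0, min(60.0, x / tau))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def average_precision(scores, labels, step=heaviside):
    """AP as the mean, over positives k, of rank+(k) / rank(k)."""
    positives = [k for k, y in enumerate(labels) if y == 1]
    total = 0.0
    for k in positives:
        # Rank of k among positives only, and among all items (both start at 1).
        rank_pos = 1.0 + sum(step(scores[j] - scores[k]) for j in positives if j != k)
        rank_all = 1.0 + sum(step(scores[j] - scores[k])
                             for j in range(len(scores)) if j != k)
        total += rank_pos / rank_all
    return total / len(positives)

# Toy data (made up): similarity scores and binary relevance labels for one query.
exact_ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
smooth_ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0], step=sigmoid_step)

# Non-decomposability: split a 4-item set into two hypothetical batches.
batch_a = ([0.9, 0.8], [1, 0])
batch_b = ([0.95, 0.3], [0, 1])
global_ap = average_precision(batch_a[0] + batch_b[0], batch_a[1] + batch_b[1])
batch_mean_ap = (average_precision(*batch_a) + average_precision(*batch_b)) / 2

print(exact_ap, smooth_ap)       # exact AP and its sigmoid relaxation
print(global_ap, batch_mean_ap)  # AP on the full set vs. mean of per-batch APs
```

On this toy data the mean of the per-batch APs (0.75) overestimates the AP computed on the full set (0.5), which is the kind of gap the abstract calls the decomposability gap.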
Related papers
- Not All Pairs are Equal: Hierarchical Learning for Average-Precision-Oriented Video Retrieval [80.09819072780193]
Average Precision (AP) assesses the overall ranking of relevant videos at the top of the list.
Recent video retrieval methods utilize pair-wise losses that treat all sample pairs equally.
arXiv Detail & Related papers (2024-07-22T11:52:04Z)
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimizes for the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
- SuSana Distancia is all you need: Enforcing class separability in metric learning via two novel distance-based loss functions for few-shot image classification [0.9236074230806579]
We propose two loss functions which consider the importance of the embedding vectors by looking at the intra-class and inter-class distance between the few data.
Our results show a significant improvement in accuracy on the miniImageNet benchmark compared to other metric-based few-shot learning methods, by a margin of 2%.
arXiv Detail & Related papers (2023-05-15T23:12:09Z)
- Hierarchical Average Precision Training for Pertinent Image Retrieval [0.0]
This paper introduces a new hierarchical AP training method for pertinent image retrieval (HAPPIER).
HAPPIER is based on a new H-AP metric, which integrates errors' importance and better evaluates rankings.
Experiments on 6 datasets show that HAPPIER significantly outperforms state-of-the-art methods for hierarchical retrieval.
arXiv Detail & Related papers (2022-07-05T07:55:18Z)
- Improving Point Cloud Based Place Recognition with Ranking-based Loss and Large Batch Training [1.116812194101501]
The paper presents a simple and effective learning-based method for computing a discriminative 3D point cloud descriptor.
We employ recent advances in image retrieval and propose a modified version of a loss function based on a differentiable average precision approximation.
arXiv Detail & Related papers (2022-03-02T09:29:28Z)
- Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation [87.54604263202941]
We propose a tiny deep neural network of which partial layers are iteratively exploited for refining its previous estimations.
We employ learned gating criteria to decide whether to exit from the weight-sharing loop, allowing per-sample adaptation in our model.
Our method consistently outperforms state-of-the-art 2D/3D hand pose estimation approaches in terms of both accuracy and efficiency for widely used benchmarks.
arXiv Detail & Related papers (2021-11-11T23:31:34Z)
- To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can facilitate models to achieve better performance as well as generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z)
- SIMPLE: SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation [81.03485688525133]
We propose a novel multi-person pose estimation framework, SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation (SIMPLE).
Specifically, in the training process, we enable SIMPLE to mimic the pose knowledge from the high-performance top-down pipeline.
Besides, SIMPLE formulates human detection and pose estimation as a unified point learning framework so that they complement each other in a single network.
arXiv Detail & Related papers (2021-04-06T13:12:51Z)
- Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval [94.73459295405507]
Smooth-AP is a plug-and-play objective function that allows for end-to-end training of deep networks.
We apply Smooth-AP to standard retrieval benchmarks: Stanford Online Products and VehicleID.
We also evaluate on larger-scale datasets: iNaturalist for fine-grained category retrieval, VGGFace2 and IJB-C for face retrieval.
arXiv Detail & Related papers (2020-07-23T17:52:03Z)
- Hierarchical and Efficient Learning for Person Re-Identification [19.172946887940874]
We propose a novel Hierarchical and Efficient Network (HENet) that learns an ensemble of hierarchical global, partial, and recovery features under the supervision of multiple loss combinations.
We also propose a new dataset augmentation approach, dubbed Random Polygon Erasing (RPE), which randomly erases irregular areas of the input image to imitate missing body parts.
arXiv Detail & Related papers (2020-05-18T15:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.