Related papers: A Novel Triplet Sampling Method for Multi-Label Remote Sensing Image Search and Retrieval

URL: http://arxiv.org/abs/2105.03647v1
Date: Sat, 8 May 2021 09:16:09 GMT
Title: A Novel Triplet Sampling Method for Multi-Label Remote Sensing Image Search and Retrieval
Authors: Tristan Kreuziger, Mahdyar Ravanbakhsh, Begüm Demir
Abstract summary: A common approach for learning the metric space relies on the selection of triplets of similar (positive) and dissimilar (negative) images. We propose a novel triplet sampling method in the framework of deep neural networks (DNNs) defined for multi-label RS CBIR problems.
Score: 1.123376893295777
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning the similarity between remote sensing (RS) images forms the foundation for content-based RS image retrieval (CBIR). Recently, deep metric learning approaches that map the semantic similarity of images into an embedding space have become very popular in RS. A common approach for learning the metric space relies on the selection of triplets of similar (positive) and dissimilar (negative) images with respect to a reference image, called an anchor. Choosing triplets is a difficult task, particularly for multi-label RS CBIR, where each training image is annotated by multiple class labels. To address this problem, in this paper we propose a novel triplet sampling method in the framework of deep neural networks (DNNs) defined for multi-label RS CBIR problems. The proposed method selects a small set of the most representative and informative triplets based on two main steps. In the first step, a set of anchors that are diverse to each other in the embedding space is selected from the current mini-batch using an iterative algorithm. In the second step, different sets of positive and negative images are chosen for each anchor by evaluating relevancy, hardness, and diversity of the images among each other based on a novel ranking strategy. Experimental results obtained on two multi-label benchmark archives show that the selection of the most informative and representative triplets in the context of DNNs results in: i) reducing the computational complexity of the training phase of the DNNs without any significant loss in performance; and ii) an increase in learning speed, since informative triplets allow fast convergence. The code of the proposed method is publicly available at https://git.tu-berlin.de/rsim/image-retrieval-from-triplets.
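
The two-step idea in the abstract (diverse anchors first, then positives and negatives per anchor) can be illustrated with a minimal sketch. This is not the authors' implementation: the anchor step below substitutes greedy farthest-point selection for the paper's iterative algorithm, the second step substitutes plain hard mining on shared labels for the paper's ranking strategy, and all function names and parameters are hypothetical.

```python
import numpy as np


def select_diverse_anchors(embeddings, num_anchors):
    """Greedy farthest-point selection (an assumed stand-in for the paper's
    iterative anchor selection): repeatedly pick the sample that is farthest
    from every anchor chosen so far."""
    embeddings = np.asarray(embeddings, dtype=float)
    num_anchors = min(num_anchors, len(embeddings))
    # Deterministic start: the sample farthest from the mini-batch centroid.
    centroid = embeddings.mean(axis=0)
    first = int(np.argmax(np.linalg.norm(embeddings - centroid, axis=1)))
    selected = [first]
    # min_dist[i] is the distance from sample i to its nearest selected anchor.
    min_dist = np.linalg.norm(embeddings - embeddings[first], axis=1)
    while len(selected) < num_anchors:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected


def hardest_triplet(embeddings, labels, anchor_idx):
    """Assumed hard-mining rule for multi-hot labels: the positive is the
    farthest sample sharing at least one label with the anchor, the negative
    is the closest sample sharing none. Returns None if either is missing."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    dist = np.linalg.norm(embeddings - embeddings[anchor_idx], axis=1)
    shared = (labels & labels[anchor_idx]).sum(axis=1)
    others = np.arange(len(embeddings)) != anchor_idx
    pos_mask = others & (shared > 0)
    neg_mask = others & (shared == 0)
    if not pos_mask.any() or not neg_mask.any():
        return None
    pos = int(np.argmax(np.where(pos_mask, dist, -np.inf)))
    neg = int(np.argmin(np.where(neg_mask, dist, np.inf)))
    return anchor_idx, pos, neg


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss on Euclidean distances."""
    d_pos = np.linalg.norm(np.asarray(anchor) - np.asarray(positive))
    d_neg = np.linalg.norm(np.asarray(anchor) - np.asarray(negative))
    return max(0.0, float(d_pos - d_neg + margin))
```

In a training loop one would call select_diverse_anchors on the current mini-batch embeddings, build one or more triplets per anchor, and sum the losses. Relevancy and diversity among the chosen positives and negatives, which the abstract names explicitly, are not modeled in this sketch.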

Related papers

Annotation Cost-Efficient Active Learning for Deep Metric Learning Driven Remote Sensing Image Retrieval [3.2109665109975696]
ANNEAL aims to create a small but informative training set made up of similar and dissimilar image pairs. The informativeness of image pairs is evaluated by combining uncertainty and diversity criteria. This way of annotating images significantly reduces the annotation cost compared to annotating images with land-use land-cover class labels.
arXiv Detail & Related papers (2024-06-14T15:08:04Z)
Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval [50.72924579220149]
Composed Image Retrieval (CIR) is a task that retrieves images similar to a query, based on a provided textual modification. Current techniques rely on supervised learning for CIR models using labeled triplets of the reference image, text, target image. We propose a new semi-supervised CIR approach where we search for a reference and its related target images in auxiliary data.
arXiv Detail & Related papers (2024-04-23T21:00:22Z)
Advancing Image Retrieval with Few-Shot Learning and Relevance Feedback [5.770351255180495]
Image Retrieval with Relevance Feedback (IRRF) involves iterative human interaction during the retrieval process. We propose a new scheme based on a hyper-network, that is tailored to the task and facilitates swift adjustment to user feedback. We show that our method can attain SoTA results in few-shot one-class classification and reach comparable results in binary classification task of few-shot open-set recognition.
arXiv Detail & Related papers (2023-12-18T10:20:28Z)
Ranking-aware Uncertainty for Text-guided Image Retrieval [17.70430913227593]
We propose a novel ranking-aware uncertainty approach to model many-to-many correspondences. Compared to the existing state-of-the-art methods, our proposed method achieves significant results on two public datasets.
arXiv Detail & Related papers (2023-08-16T03:48:19Z)
A Triplet-loss Dilated Residual Network for High-Resolution Representation Learning in Image Retrieval [0.0]
In some applications, such as localization, image retrieval is employed as the initial step. The current paper introduces a simple yet efficient image retrieval system with fewer trainable parameters. The proposed method benefits from a dilated residual convolutional neural network with triplet loss.
arXiv Detail & Related papers (2023-03-15T07:01:44Z)
Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval [84.11127588805138]
Composed Image Retrieval (CIR) combines a query image with text to describe their intended target. Existing methods rely on supervised learning of CIR models using labeled triplets consisting of the query image, text specification, and the target image. We propose Zero-Shot Composed Image Retrieval (ZS-CIR), whose goal is to build a CIR model without requiring labeled triplets for training.
arXiv Detail & Related papers (2023-02-06T19:40:04Z)
Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes. Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
Learning Self-Supervised Low-Rank Network for Single-Stage Weakly and Semi-Supervised Semantic Segmentation [119.009033745244]
This paper presents a Self-supervised Low-Rank Network (SLRNet) for single-stage weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS). SLRNet uses cross-view self-supervision, that is, it simultaneously predicts several attentive LR representations from different views of an image to learn precise pseudo-labels. Experiments on the Pascal VOC 2012, COCO, and L2ID datasets demonstrate that our SLRNet outperforms both state-of-the-art WSSS and SSSS methods with a variety of different settings.
arXiv Detail & Related papers (2022-03-19T09:19:55Z)
Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search [90.30623718137244]
We propose a novel deep hashing method for scalable multi-label image search. A new rank-consistency objective is applied to align the similarity orders from two spaces. A powerful loss function is designed to penalize the samples whose semantic similarity and hamming distance are mismatched.
arXiv Detail & Related papers (2021-02-02T13:46:58Z)
Learning to Focus: Cascaded Feature Matching Network for Few-shot Image Recognition [38.49419948988415]
Deep networks can learn to accurately recognize objects of a category by training on a large number of images. A meta-learning challenge known as a low-shot image recognition task comes when only a few images with annotations are available for learning a recognition model for one category. Our method, called Cascaded Feature Matching Network (CFMN), is proposed to solve this problem. Experiments for few-shot learning on two standard datasets, miniImageNet and Omniglot, have confirmed the effectiveness of our method.
arXiv Detail & Related papers (2021-01-13T11:37:28Z)
MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate. We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network. Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.