Task-adaptive Asymmetric Deep Cross-modal Hashing
- URL: http://arxiv.org/abs/2004.00197v2
- Date: Mon, 21 Mar 2022 08:29:35 GMT
- Title: Task-adaptive Asymmetric Deep Cross-modal Hashing
- Authors: Fengling Li, Tong Wang, Lei Zhu, Zheng Zhang, Xinhua Wang
- Abstract summary: Cross-modal hashing aims to embed the semantic correlations of heterogeneous modality data into binary hash codes with discriminative semantic labels.
We present a Task-adaptive Asymmetric Deep Cross-modal Hashing (TA-ADCMH) method in this paper.
It can learn task-adaptive hash functions for two sub-retrieval tasks via simultaneous modality representation and asymmetric hash learning.
- Score: 20.399984971442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Supervised cross-modal hashing aims to embed the semantic correlations of
heterogeneous modality data into binary hash codes with discriminative
semantic labels. Because of its advantages in retrieval and storage efficiency,
it is widely used for efficient cross-modal retrieval. However,
existing research treats the different cross-modal retrieval tasks equally
and simply learns the same pair of hash functions for them in a symmetric
way. Under such circumstances, the uniqueness of each cross-modal
retrieval task is ignored, which may lead to sub-optimal performance.
Motivated by this, we present a Task-adaptive Asymmetric Deep Cross-modal
Hashing (TA-ADCMH) method in this paper. It learns task-adaptive hash
functions for the two sub-retrieval tasks via simultaneous modality representation
and asymmetric hash learning. Unlike previous cross-modal hashing approaches,
our learning framework jointly optimizes semantic preserving, which transforms
deep features of multimedia data into binary hash codes, and semantic
regression, which directly regresses the query modality representation to explicit
labels. With our model, the binary codes can effectively preserve semantic
correlations across different modalities while adaptively capturing the
query semantics. The superiority of TA-ADCMH is demonstrated on two standard datasets
from multiple perspectives.
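
To make the two coupled objectives concrete, below is a minimal PyTorch sketch of the image-to-text sub-task the abstract describes: a semantic-preserving pairwise term plus a semantic regression of the query modality onto explicit labels. All layer shapes, the softplus likelihood form, and the 0.5 trade-off weight are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of TA-ADCMH's two joint objectives for the image->text
# sub-retrieval task. Shapes, layers, and weights are assumptions.
import torch
import torch.nn.functional as F

code_len, n_classes, batch = 32, 24, 8
img_net = torch.nn.Linear(4096, code_len)          # stand-in image branch
txt_net = torch.nn.Linear(1386, code_len)          # stand-in text branch
regressor = torch.nn.Linear(code_len, n_classes)   # query-semantic regression

img_feat = torch.randn(batch, 4096)
txt_feat = torch.randn(batch, 1386)
labels = (torch.rand(batch, n_classes) > 0.8).float()
S = (labels @ labels.t() > 0).float()              # pairwise semantic similarity

f_img = torch.tanh(img_net(img_feat))              # relaxed binary codes
f_txt = torch.tanh(txt_net(txt_feat))

# 1) Semantic preserving: inner products of relaxed codes should match S
#    (negative log-likelihood of the pairwise similarity).
theta = f_img @ f_txt.t() / 2
loss_preserve = (F.softplus(theta) - S * theta).mean()

# 2) Semantic regression: the *query* modality (images here) is regressed
#    directly onto explicit labels, which is what makes the pair of hash
#    functions asymmetric and task-adaptive.
loss_regress = F.binary_cross_entropy_with_logits(regressor(f_img), labels)

loss = loss_preserve + 0.5 * loss_regress          # assumed trade-off weight
loss.backward()

# At retrieval time the relaxed codes are binarized with sign().
b_img, b_txt = f_img.sign(), f_txt.sign()
```

For the text-to-image sub-task, the same scheme would regress the text branch onto the labels instead, yielding a second, differently specialized pair of hash functions.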
Related papers
- GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse Function to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric encapsulates two formats, diagonal and block-diagonal terms, as sketched after this entry.
Experiments on cross-modal and two additional uni-modal retrieval tasks validate its superiority and flexibility.
arXiv Detail & Related papers (2024-10-20T03:45:50Z)
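
A toy sketch of what a distance metric with diagonal and block-diagonal terms can look like; the block size, the PSD parameterization, and all names are assumptions for illustration, not the GSSF implementation.

```python
# Toy structured metric with diagonal plus block-diagonal terms
# (assumed parameterization; block size is illustrative).
import torch

dim, block = 16, 4
diag_w = torch.rand(dim)                                  # per-dimension weights
blocks = torch.randn(dim // block, block, block)
M_block = torch.block_diag(*[b @ b.t() for b in blocks])  # PSD block-diagonal part
M = torch.diag(diag_w) + M_block                          # combined metric matrix

x, y = torch.randn(dim), torch.randn(dim)
diff = x - y
dist2 = diff @ M @ diff                                   # structured squared distance
print(float(dist2))
```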
- RREH: Reconstruction Relations Embedded Hashing for Semi-Paired Cross-Modal Retrieval [32.06421737874828]
Reconstruction Relations Embedded Hashing (RREH) is designed for semi-paired cross-modal retrieval tasks.
RREH assumes that multi-modal data share a common subspace.
Anchors are sampled from the paired data, which improves the efficiency of hash learning (see the sketch below).
arXiv Detail & Related papers (2024-05-28T03:12:54Z)
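
A loose sketch of the anchor idea under the stated assumptions (anchors drawn only from paired data, a common subspace); the RBF kernel and the naive binarization are illustrative choices, not the paper's algorithm.

```python
# Illustrative anchor-based embedding in the spirit of RREH.
import torch

n_paired, n_anchor, d_img, d_txt = 100, 16, 64, 32
img_paired = torch.randn(n_paired, d_img)
txt_paired = torch.randn(n_paired, d_txt)

idx = torch.randperm(n_paired)[:n_anchor]        # anchors from paired data only
anchor_img, anchor_txt = img_paired[idx], txt_paired[idx]

def anchor_embed(x, anchors, gamma=0.1):
    """RBF similarities to the anchors serve as the shared representation."""
    return torch.exp(-gamma * torch.cdist(x, anchors) ** 2)

# Even an unpaired image can be embedded against the image-side anchors,
# which keeps semi-paired hash learning efficient.
unpaired_img = torch.randn(5, d_img)
z = anchor_embed(unpaired_img, anchor_img)
codes = (z - z.mean(dim=0)).sign()               # naive zero-centered binarization
```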
- Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful method for solving the large-scale multimedia retrieval problem.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) method to address the limitations of existing approaches.
Our ASCMH outperforms state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z)
- Unsupervised Contrastive Hashing for Cross-Modal Retrieval in Remote Sensing [1.6758573326215689]
Cross-modal text-image retrieval has attracted great attention in remote sensing.
We introduce a novel deep unsupervised contrastive hashing (DUCH) method for text-image retrieval in RS.
Experimental results show that the proposed DUCH outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-19T07:25:25Z)
- Efficient Cross-Modal Retrieval via Deep Binary Hashing and Quantization [5.799838997511804]
Cross-modal retrieval aims to search for data with similar semantic meanings across different content modalities.
We propose a jointly learned deep hashing and quantization network (HQ) for cross-modal retrieval; a toy sketch of the hash-plus-quantization idea follows this entry.
Experimental results on the NUS-WIDE, MIR-Flickr, and Amazon datasets demonstrate that HQ achieves precision gains of more than 7%.
arXiv Detail & Related papers (2022-02-15T22:00:04Z)
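
A toy sketch of pairing a binary hash code (coarse candidate filtering) with product quantization (finer reconstruction), which is one plausible reading of a joint hashing-and-quantization design; codebook sizes and the assignment rule are assumptions.

```python
# Toy combination of a binary hash with product quantization.
import torch

d, n_books, book_size = 64, 4, 16
sub = d // n_books
codebooks = torch.randn(n_books, book_size, sub)   # one codebook per sub-vector

feat = torch.randn(d)
hash_code = feat.sign()                            # coarse binary hash

# Quantization: pick the nearest codeword for each sub-vector.
pq_ids = [
    int(torch.argmin(((feat[i * sub:(i + 1) * sub] - codebooks[i]) ** 2).sum(-1)))
    for i in range(n_books)
]
recon = torch.cat([codebooks[i][pq_ids[i]] for i in range(n_books)])
print(pq_ids, float((feat - recon).norm()))        # reconstruction error
```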
- Multi-Modal Mutual Information Maximization: A Novel Approach for Unsupervised Deep Cross-Modal Hashing [73.29587731448345]
We propose a novel method, dubbed Cross-Modal Info-Max Hashing (CMIMH).
We learn informative representations that preserve both intra- and inter-modal similarities; an InfoNCE-style sketch of the inter-modal term follows this entry.
The proposed method consistently outperforms other state-of-the-art cross-modal retrieval methods.
arXiv Detail & Related papers (2021-12-13T08:58:03Z)
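
One standard way to maximize inter-modal mutual information is an InfoNCE lower bound; the sketch below uses that estimator as a stand-in, with an assumed temperature, and is not the paper's exact objective.

```python
# Minimal InfoNCE-style estimator of inter-modal mutual information.
import torch
import torch.nn.functional as F

batch, dim, tau = 8, 32, 0.1
z_img = F.normalize(torch.randn(batch, dim), dim=1)   # image codes (stand-in)
z_txt = F.normalize(torch.randn(batch, dim), dim=1)   # text codes (stand-in)

logits = z_img @ z_txt.t() / tau          # inter-modal similarities
targets = torch.arange(batch)             # the paired sample is the positive
loss = F.cross_entropy(logits, targets)   # minimizing this tightens the MI bound
print(float(loss))
```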
- FDDH: Fast Discriminative Discrete Hashing for Large-Scale Cross-Modal Retrieval [41.125141897096874]
Cross-modal hashing is favored for its effectiveness and efficiency.
Most existing methods do not sufficiently exploit the discriminative power of semantic information when learning the hash codes.
We propose a Fast Discriminative Discrete Hashing (FDDH) approach for large-scale cross-modal retrieval.
arXiv Detail & Related papers (2021-05-15T03:53:48Z)
- CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON).
First, we use global refinement and the similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive disturb-invariant and discriminative hash codes; a toy consistency sketch follows this entry.
arXiv Detail & Related papers (2020-10-15T14:47:14Z)
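
A minimal sketch of the consistency idea: two perturbed views of a sample should map to the same relaxed code. The linear encoder and Gaussian perturbations are stand-ins, not CIMON's actual augmentations.

```python
# Toy consistency-learning step yielding disturb-invariant codes.
import torch
import torch.nn.functional as F

encoder = torch.nn.Linear(128, 32)              # stand-in hashing network
x = torch.randn(4, 128)
view1 = x + 0.1 * torch.randn_like(x)           # perturbed view 1
view2 = x + 0.1 * torch.randn_like(x)           # perturbed view 2

h1 = torch.tanh(encoder(view1))                 # relaxed codes
h2 = torch.tanh(encoder(view2))
loss_consistency = F.mse_loss(h1, h2)           # both views share one code
loss_consistency.backward()
```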
- Unsupervised Deep Cross-modality Spectral Hashing [65.3842441716661]
The framework is a two-step hashing approach that decouples the optimization into binary optimization and hashing-function learning; a toy two-step sketch follows this entry.
We propose a novel spectral embedding-based algorithm to simultaneously learn single-modality and binary cross-modality representations.
We leverage a powerful CNN for images and propose a CNN-based deep architecture to learn the text modality.
arXiv Detail & Related papers (2020-08-01T09:20:11Z)
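
A toy version of the two-step recipe the summary describes: a spectral binary embedding is solved first, then hash functions are fit to reproduce it for out-of-sample data. The Gaussian affinity and least-squares projection are assumptions.

```python
# Toy two-step spectral hashing: (1) binary spectral embedding,
# (2) out-of-sample hash-function fitting.
import torch

n, d, bits = 50, 32, 8
X = torch.randn(n, d)

# Step 1: binary codes from the spectral embedding of the graph Laplacian.
W = torch.exp(-torch.cdist(X, X) ** 2)           # Gaussian affinity graph
L = torch.diag(W.sum(dim=1)) - W                 # graph Laplacian
eigvals, eigvecs = torch.linalg.eigh(L)
B = eigvecs[:, 1:bits + 1].sign()                # skip the trivial eigenvector

# Step 2: learn a hash function mapping features onto the fixed codes.
Wh = torch.linalg.lstsq(X, B).solution           # least-squares hash projection
codes = (X @ Wh).sign()                          # out-of-sample binarization
```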
- Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art precision for document retrieval.
We propose a pairwise loss function with a discrete-latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing; a minimal pairwise-loss sketch follows this entry.
This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
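
A minimal sketch of a pairwise loss on sampled Bernoulli codes. The straight-through estimator below merely stands in for the paper's self-control gradient estimator; labels and shapes are illustrative.

```python
# Pairwise loss on discrete Bernoulli codes: reward within-class
# similarity, penalize between-class similarity.
import torch

batch, bits = 6, 16
logits = torch.randn(batch, bits, requires_grad=True)
probs = torch.sigmoid(logits)
sample = (torch.bernoulli(probs) - probs).detach() + probs  # straight-through
b = 2 * sample - 1                                          # {0,1} -> {-1,+1}

labels = torch.tensor([0, 0, 1, 1, 2, 2])
same = (labels[:, None] == labels[None, :]).float()
sim = b @ b.t() / bits                                      # normalized similarity

# Pull same-class codes together, push different-class codes apart.
loss = ((1 - sim) * same + (1 + sim) * (1 - same)).mean()
loss.backward()
```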
- Deep Robust Multilevel Semantic Cross-Modal Hashing [25.895586911858857]
Hashing-based cross-modal retrieval has recently made significant progress.
However, straightforwardly embedding data from different modalities into a joint Hamming space inevitably produces false codes.
We present a novel Robust Multilevel Semantic Hashing (RMSH) method for more accurate cross-modal retrieval.
arXiv Detail & Related papers (2020-02-07T10:08:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.