Asymmetric Scalable Cross-modal Hashing
- URL: http://arxiv.org/abs/2207.12650v1
- Date: Tue, 26 Jul 2022 04:38:47 GMT
- Title: Asymmetric Scalable Cross-modal Hashing
- Authors: Wenyun Li, Chi-Man Pun
- Abstract summary: Cross-modal hashing is a successful approach to the large-scale multimedia retrieval problem.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) method to address the efficiency and scalability issues of existing methods.
Our ASCMH outperforms state-of-the-art cross-modal hashing methods in both accuracy and efficiency.
- Score: 51.309905690367835
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Cross-modal hashing is a successful approach to the large-scale
multimedia retrieval problem, and many matrix factorization-based hashing
methods have been proposed. However, existing methods still struggle with
several problems, such as how to generate binary codes efficiently rather than
relaxing them to continuous values. In addition, most existing methods optimize
over an $n\times n$ similarity matrix, which makes the memory and computation
costs unaffordable. In this paper we propose a novel Asymmetric Scalable
Cross-Modal Hashing (ASCMH) method to address these issues. It first introduces
a collective matrix factorization to learn a common latent space from the
kernelized features of the different modalities, and then transforms the
similarity-matrix optimization into a distance-distance difference minimization
problem with the help of semantic labels and the common latent space. Hence,
the computational complexity of the $n\times n$ asymmetric optimization is
relieved. In the generation of hash codes we also impose an orthogonal
constraint on the label information, which is indispensable for search
accuracy and greatly reduces redundant computation. For efficient optimization
and scalability to large-scale datasets, we adopt a two-step approach rather
than optimizing all variables simultaneously. Extensive experiments on three
benchmark datasets: Wiki, MIRFlickr-25K, and NUS-WIDE, demonstrate that our
ASCMH outperforms state-of-the-art cross-modal hashing methods in both
accuracy and efficiency.
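The two-step pipeline described in the abstract — first learn a common latent space via collective matrix factorization over the modalities' kernelized features, then derive binary codes from that space — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the dimensions, the alternating least-squares updates, and the final sign-of-projection coding step are all simplifying assumptions (the paper's second step instead solves an asymmetric, label-constrained optimization with an orthogonality condition).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "kernelized" features for two modalities (n samples each).
# All names and sizes here are illustrative, not from the paper.
n, d1, d2, r, bits = 100, 32, 24, 16, 16
X1 = rng.standard_normal((n, d1))   # e.g. image modality
X2 = rng.standard_normal((n, d2))   # e.g. text modality

# Step 1: collective matrix factorization -- learn one shared latent
# representation V such that X_m is approximately V @ U_m for each modality m.
V = rng.standard_normal((n, r))
U1 = rng.standard_normal((r, d1))
U2 = rng.standard_normal((r, d2))
lam = 1e-2  # small ridge term for numerical stability

for _ in range(50):  # alternating least squares
    # Update the modality-specific bases with V fixed.
    G = V.T @ V + lam * np.eye(r)
    U1 = np.linalg.solve(G, V.T @ X1)
    U2 = np.linalg.solve(G, V.T @ X2)
    # Update the common latent space with U1, U2 fixed.
    H = U1 @ U1.T + U2 @ U2.T + lam * np.eye(r)
    V = np.linalg.solve(H, U1 @ X1.T + U2 @ X2.T).T

# Step 2: generate binary codes from the learned latent space.
# Here: sign of a fixed random projection, as a stand-in for the paper's
# label-guided asymmetric optimization.
P = rng.standard_normal((r, bits))
B = np.sign(V @ P)
B[B == 0] = 1  # break ties so codes are strictly in {-1, +1}
print(B.shape)  # (100, 16)
```

The key scalability point the abstract makes is visible in the structure: every update above touches matrices of size at most $n\times r$ or $r\times r$, so nothing of size $n\times n$ is ever formed.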
Related papers
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - Fast Differentiable Matrix Square Root and Inverse Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root and the inverse square root.
For the forward propagation, one method uses Matrix Taylor Polynomial (MTP) and the other uses Matrix Padé Approximants (MPA).
A series of numerical tests show that both methods yield considerable speed-up compared with the SVD or the NS iteration.
arXiv Detail & Related papers (2022-01-29T10:00:35Z) - FDDH: Fast Discriminative Discrete Hashing for Large-Scale Cross-Modal Retrieval [41.125141897096874]
Cross-modal hashing is favored for its effectiveness and efficiency.
Most existing methods do not sufficiently exploit the discriminative power of semantic information when learning the hash codes.
We propose Fast Discriminative Discrete Hashing (FDDH) approach for large-scale cross-modal retrieval.
arXiv Detail & Related papers (2021-05-15T03:53:48Z) - A non-alternating graph hashing algorithm for large scale image search [5.221613241320111]
We propose a novel relaxed formulation for spectral hashing that adds no additional variables to the problem.
Instead of solving the problem in original space where number of variables is equal to the data points, we solve the problem in a much smaller space.
This trick reduces both the memory and computational complexity at the same time.
arXiv Detail & Related papers (2020-12-24T06:41:54Z) - CIMON: Towards High-quality Hash Codes [63.37321228830102]
We propose a new method named Comprehensive sImilarity Mining and cOnsistency learNing (CIMON).
First, we use global refinement and similarity statistical distribution to obtain reliable and smooth guidance. Second, both semantic and contrastive consistency learning are introduced to derive both disturb-invariant and discriminative hash codes.
arXiv Detail & Related papers (2020-10-15T14:47:14Z) - Revisiting Co-Occurring Directions: Sharper Analysis and Efficient Algorithm for Sparse Matrices [23.22254890452548]
We study the streaming model for approximate matrix multiplication (AMM).
We are interested in the scenario where the algorithm can only take one pass over the data with limited memory.
The state-of-the-art deterministic sketching algorithm for streaming AMM is co-occurring directions (COD).
arXiv Detail & Related papers (2020-09-05T15:35:59Z) - Task-adaptive Asymmetric Deep Cross-modal Hashing [20.399984971442]
Cross-modal hashing aims to embed semantic correlations of heterogeneous modality data into the binary hash codes with discriminative semantic labels.
We present a Task-adaptive Asymmetric Deep Cross-modal Hashing (TA-ADCMH) method in this paper.
It can learn task-adaptive hash functions for two sub-retrieval tasks via simultaneous modality representation and asymmetric hash learning.
arXiv Detail & Related papers (2020-04-01T02:09:20Z) - Multi-Objective Matrix Normalization for Fine-grained Visual Recognition [153.49014114484424]
Bilinear pooling achieves great success in fine-grained visual recognition (FGVC).
Recent methods have shown that the matrix power normalization can stabilize the second-order information in bilinear features.
We propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation.
arXiv Detail & Related papers (2020-03-30T08:40:35Z) - Image Hashing by Minimizing Discrete Component-wise Wasserstein Distance [12.968141477410597]
Adversarial autoencoders are shown to be able to implicitly learn a robust, locality-preserving hash function that generates balanced and high-quality hash codes.
The existing adversarial hashing methods are inefficient to be employed for large-scale image retrieval applications.
We propose a new adversarial-autoencoder hashing approach that has a much lower sample requirement and computational cost.
arXiv Detail & Related papers (2020-02-29T00:22:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.