Discriminative Supervised Subspace Learning for Cross-modal Retrieval
- URL: http://arxiv.org/abs/2201.11843v1
- Date: Wed, 26 Jan 2022 14:27:39 GMT
- Title: Discriminative Supervised Subspace Learning for Cross-modal Retrieval
- Authors: Haoming Zhang, Xiao-Jun Wu, Tianyang Xu and Donglin Zhang
- Abstract summary: We propose a discriminative supervised subspace learning method for cross-modal retrieval (DS2L).
Specifically, we first construct a shared semantic graph to preserve the semantic structure within each modality.
We then introduce the Hilbert-Schmidt Independence Criterion (HSIC) to preserve the consistency between the feature similarity and the semantic similarity of samples.
- Score: 16.035973055257642
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Measuring the similarity between heterogeneous data remains an open
problem for cross-modal retrieval, and it is the core of the task. Many
approaches have been developed to address it. One mainstream family, subspace
learning, learns a common subspace in which the similarity among multi-modal
data can be measured directly. However, many existing approaches focus only on
learning a latent subspace. They do not make full use of discriminative
information, so the semantic structure of the data is not well preserved and
retrieval results fall short of expectations. In this paper, we propose
discriminative supervised subspace learning for cross-modal retrieval (DS2L),
which makes full use of discriminative information and better preserves the
semantic structure. Specifically, we first construct a shared semantic graph to
preserve the semantic structure within each modality. Second, we introduce the
Hilbert-Schmidt Independence Criterion (HSIC) to preserve the consistency
between the feature similarity and the semantic similarity of samples. Third,
we introduce a similarity preservation term, so that the model compensates for
the insufficient use of discriminative data and better preserves the semantic
structure within each modality. Experimental results on three well-known
benchmark datasets demonstrate the effectiveness and competitiveness of the
proposed method against the compared classic subspace learning approaches.
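The abstract does not say how HSIC is computed. As a rough illustration only, the sketch below implements the standard biased empirical estimator, tr(K H L H) / (n - 1)^2, where K is a kernel matrix over features, L is a kernel matrix over labels, and H is the centering matrix. The linear kernels, the function names (`linear_kernel`, `center`, `hsic`) and the toy data are assumptions made for this example and are not taken from the paper.

```python
# Minimal sketch of the biased empirical HSIC estimator, tr(K H L H) / (n - 1)^2.
# Linear kernels and the toy data are illustrative assumptions, not the paper's setup.

def linear_kernel(rows):
    """Gram matrix of inner products between all pairs of row vectors."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in rows] for r1 in rows]

def center(k):
    """Double-center a square matrix: returns H K H with H = I - (1/n) 1 1^T."""
    n = len(k)
    row_means = [sum(row) / n for row in k]
    col_means = [sum(k[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [k[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]

def hsic(k, l):
    """Biased empirical HSIC between two n x n kernel matrices."""
    n = len(k)
    kc, lc = center(k), center(l)
    # H is idempotent, so tr(K H L H) equals tr((H K H)(H L H)).
    trace = sum(kc[i][j] * lc[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2

# Toy data: four samples, two classes, one-hot labels.
labels = [[1, 0], [1, 0], [0, 1], [0, 1]]
# Features whose similarity follows the class structure.
features_aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# The same feature vectors assigned so that they ignore the class structure.
features_mismatched = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]

label_kernel = linear_kernel(labels)
hsic_aligned = hsic(linear_kernel(features_aligned), label_kernel)
hsic_mismatched = hsic(linear_kernel(features_mismatched), label_kernel)

print("aligned:", hsic_aligned)
print("mismatched:", hsic_mismatched)
```

In this toy setting the aligned features give the larger HSIC value, which is the sense in which maximizing HSIC encourages feature similarity to agree with semantic similarity.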
Related papers
- Comparing the information content of probabilistic representation spaces [3.7277730514654555]
Probabilistic representation spaces convey information about a dataset, and to understand the effects of factors such as training loss and network architecture, we seek to compare the information content of such spaces.
Here, instead of building upon point-based measures of comparison, we build upon classic methods from literature on hard clustering.
We propose a practical method of estimation that is based on fingerprinting a representation space with a sample of the dataset and is applicable when the communicated information is only a handful of bits.
arXiv Detail & Related papers (2024-05-31T17:33:07Z)
- Separating common from salient patterns with Contrastive Representation Learning [2.250968907999846]
Contrastive Analysis aims at separating common factors of variation between two datasets.
Current models based on Variational Auto-Encoders have shown poor performance in learning semantically-expressive representations.
We propose to leverage the ability of Contrastive Learning to learn semantically expressive representations well adapted for Contrastive Analysis.
arXiv Detail & Related papers (2024-02-19T08:17:13Z)
- Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space.
However, the predictions are often unreliable due to aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts.
We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arising from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
- Generalizable Heterogeneous Federated Cross-Correlation and Instance Similarity Learning [60.058083574671834]
This paper presents FCCL+, a novel federated correlation and similarity learning method with non-target distillation.
To address heterogeneity, we leverage irrelevant unlabeled public data for communication.
To address catastrophic forgetting in the local updating stage, FCCL+ introduces Federated Non Target Distillation.
arXiv Detail & Related papers (2023-09-28T09:32:27Z)
- DCID: Deep Canonical Information Decomposition [84.59396326810085]
We consider the problem of identifying the signal shared between two one-dimensional target variables.
We propose ICM, an evaluation metric which can be used in the presence of ground-truth labels.
We also propose Deep Canonical Information Decomposition (DCID) - a simple, yet effective approach for learning the shared variables.
arXiv Detail & Related papers (2023-06-27T16:59:06Z)
- Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment.
Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
arXiv Detail & Related papers (2023-03-10T14:38:49Z)
- Voxel-wise Adversarial Semi-supervised Learning for Medical Image Segmentation [4.489713477369384]
We introduce a novel adversarial learning-based semi-supervised segmentation method for medical image segmentation.
Our method embeds both local and global features from multiple hidden layers and learns context relations between multiple classes.
Our method outperforms current state-of-the-art semi-supervised learning approaches on image segmentation of the left atrium (single class) and multiorgan datasets (multiclass).
arXiv Detail & Related papers (2022-05-14T06:57:19Z)
- Learning Conditional Invariance through Cycle Consistency [60.85059977904014]
We propose a novel approach to identify meaningful and independent factors of variation in a dataset.
Our method involves two separate latent subspaces for the target property and the remaining input information.
We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors which lead to sparser and more interpretable models.
arXiv Detail & Related papers (2021-11-25T17:33:12Z)
- Generalized One-Class Learning Using Pairs of Complementary Classifiers [41.64645294104883]
One-class learning is the classic problem of fitting a model to the data for which annotations are available only for a single class.
In this paper, we explore novel objectives for one-class learning, which we collectively refer to as Generalized One-class Discriminative Subspaces (GODS).
arXiv Detail & Related papers (2021-06-24T18:52:05Z)
- Integrating Information Theory and Adversarial Learning for Cross-modal Retrieval [19.600581093189362]
Accurately matching visual and textual data in cross-modal retrieval has been widely studied in the multimedia community.
We propose integrating Shannon information theory and adversarial learning.
In terms of the gap, we integrate modality classification and information entropy adversarially.
arXiv Detail & Related papers (2021-04-11T11:04:55Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.