Instance-Variant Loss with Gaussian RBF Kernel for 3D Cross-modal Retrieval
- URL: http://arxiv.org/abs/2305.04239v1
- Date: Sun, 7 May 2023 10:12:14 GMT
- Title: Instance-Variant Loss with Gaussian RBF Kernel for 3D Cross-modal Retrieval
- Authors: Zhitao Liu, Zengyu Liu, Jiwei Wei, Guan Wang, Zhenjiang Du, Ning Xie,
Heng Tao Shen
- Abstract summary: Existing methods treat all instances equally, applying the same penalty strength to instances with varying degrees of difficulty.
This can result in ambiguous convergence or local optima, severely compromising the separability of the feature space.
We propose an Instance-Variant loss to assign different penalty strengths to different instances, improving the space separability.
- Score: 52.41252219453429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D cross-modal retrieval is gaining attention in the multimedia community.
Central to this topic is learning a joint embedding space to represent data
from different modalities, such as images, 3D point clouds, and polygon meshes,
to extract modality-invariant and discriminative features. Hence, the
performance of cross-modal retrieval methods heavily depends on the
representational capacity of this embedding space. Existing methods treat all
instances equally, applying the same penalty strength to instances with varying
degrees of difficulty, ignoring the differences between instances. This can
result in ambiguous convergence or local optima, severely compromising the
separability of the feature space. To address this limitation, we propose an
Instance-Variant loss to assign different penalty strengths to different
instances, improving the space separability. Specifically, we assign different
penalty weights to instances positively related to their intra-class distance.
Simultaneously, we reduce the cross-modal discrepancy between features by
learning a shared weight vector for the same class data from different
modalities. By leveraging the Gaussian RBF kernel to evaluate sample
similarity, we further propose an Intra-Class loss function that minimizes the
intra-class distance among same-class instances. Extensive experiments on three
3D cross-modal datasets show that our proposed method surpasses recent
state-of-the-art approaches.
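The two loss ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the function names, the `gamma` bandwidth, and the center-distance weighting rule are assumptions made for illustration, matching only the abstract's description (penalty weights positively related to intra-class distance; an intra-class loss built on Gaussian RBF similarity).

```python
import numpy as np

def gaussian_rbf(x, y, gamma=1.0):
    # RBF kernel similarity: k(x, y) = exp(-gamma * ||x - y||^2)
    return np.exp(-gamma * np.sum((x - y) ** 2, axis=-1))

def intra_class_loss(features, labels, gamma=1.0):
    # Minimize intra-class distance by maximizing mean pairwise RBF
    # similarity among same-class instances (loss = 1 - mean similarity).
    loss, pairs = 0.0, 0
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                loss += 1.0 - gaussian_rbf(features[i], features[j], gamma)
                pairs += 1
    return loss / max(pairs, 1)

def instance_variant_weights(features, class_centers, labels):
    # Hypothetical weighting rule: the penalty weight grows with an
    # instance's distance from its class center, so harder instances
    # receive a stronger penalty, as the abstract describes.
    d = np.linalg.norm(features - class_centers[labels], axis=1)
    return d / (d.sum() + 1e-12) * len(d)
```

A shared class center (the abstract's "shared weight vector") across modalities would make same-class features from images, point clouds, and meshes pull toward the same point, reducing cross-modal discrepancy.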
Related papers
- Robust Multimodal Learning via Representation Decoupling [6.7678581401558295]
Multimodal learning has attracted increasing attention due to its practicality.
Existing methods tend to address it by learning a common subspace representation for different modality combinations.
We propose a novel Decoupled Multimodal Representation Network (DMRNet) to assist robust multimodal learning.
arXiv Detail & Related papers (2024-07-05T12:09:33Z)
- 3D Adversarial Augmentations for Robust Out-of-Domain Predictions [115.74319739738571]
We focus on improving the generalization to out-of-domain data.
We learn a set of vectors that deform the objects in an adversarial fashion.
We perform adversarial augmentation by applying the learned sample-independent vectors to the available objects when training a model.
arXiv Detail & Related papers (2023-08-29T17:58:55Z)
- Deep Metric Learning Assisted by Intra-variance in A Semi-supervised View of Learning [0.0]
Deep metric learning aims to construct an embedding space where samples of the same class are close to each other, while samples of different classes are far away from each other.
This paper designs a self-supervised generative assisted ranking framework that provides a semi-supervised view of intra-class variance learning scheme for typical supervised deep metric learning.
arXiv Detail & Related papers (2023-04-21T13:30:32Z)
- Exploring Modality-shared Appearance Features and Modality-invariant Relation Features for Cross-modality Person Re-Identification [72.95858515157603]
Cross-modality person re-identification works rely on discriminative modality-shared features.
Despite some initial success, such modality-shared appearance features cannot capture enough modality-invariant information.
A novel cross-modality quadruplet loss is proposed to further reduce the cross-modality variations.
arXiv Detail & Related papers (2021-04-23T11:14:07Z)
- Exploring Data Augmentation for Multi-Modality 3D Object Detection [82.9988604088494]
It is counter-intuitive that multi-modality methods based on point cloud and images perform only marginally better or sometimes worse than approaches that solely use point cloud.
We propose a pipeline, named transformation flow, to bridge the gap between single and multi-modality data augmentation with transformation reversing and replaying.
Our method also wins the best PKL award in the 3rd nuScenes detection challenge.
arXiv Detail & Related papers (2020-12-23T15:23:16Z)
- Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination [68.83098015578874]
We integrate between-instance similarity into contrastive learning, not directly by instance grouping, but by cross-level discrimination.
CLD effectively brings unsupervised learning closer to natural data and real-world applications.
New state-of-the-art on self-supervision, semi-supervision, and transfer learning benchmarks, beating MoCo v2 and SimCLR on every reported benchmark.
arXiv Detail & Related papers (2020-08-09T21:13:13Z)
- Cross-modal Center Loss [28.509817129759014]
Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities.
We propose an approach to jointly train the components of cross-modal retrieval framework with metadata.
The proposed framework significantly outperforms the state-of-the-art methods on the ModelNet40 dataset.
arXiv Detail & Related papers (2020-08-08T17:26:35Z)
- The Bures Metric for Generative Adversarial Networks [10.69910379275607]
Generative Adversarial Networks (GANs) are performant generative methods yielding high-quality samples.
We propose to match the real batch diversity to the fake batch diversity.
We observe that diversity matching reduces mode collapse substantially and has a positive effect on the sample quality.
arXiv Detail & Related papers (2020-06-16T12:04:41Z)
- Rethinking preventing class-collapsing in metric learning with margin-based losses [81.22825616879936]
Metric learning seeks embeddings where visually similar instances are close and dissimilar instances are apart.
Margin-based losses tend to project all samples of a class onto a single point in the embedding space.
We propose a simple modification to the embedding losses such that each sample selects its nearest same-class counterpart in a batch.
arXiv Detail & Related papers (2020-06-09T09:59:25Z)
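The batch rule summarized above, where each sample selects its nearest same-class counterpart, can be sketched as follows. This is an illustrative reading of that one-sentence summary, not the paper's code; the function name and the `-1` sentinel for classes with a single member are assumptions.

```python
import numpy as np

def nearest_same_class_positive(embeddings, labels):
    # For each sample, return the index of its nearest same-class
    # neighbor in the batch (excluding itself), or -1 if its class
    # has no other member in the batch.
    n = len(embeddings)
    # Pairwise Euclidean distances via broadcasting.
    dists = np.linalg.norm(embeddings[:, None] - embeddings[None, :], axis=-1)
    nearest = np.full(n, -1)
    for i in range(n):
        mask = labels == labels[i]
        mask[i] = False  # a sample is not its own counterpart
        if mask.any():
            candidates = np.where(mask)[0]
            nearest[i] = candidates[np.argmin(dists[i, candidates])]
    return nearest
```

Pulling each sample only toward its nearest positive, rather than toward all positives, preserves intra-class variance and avoids collapsing a class onto a single point.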
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.