Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification
- URL: http://arxiv.org/abs/2104.02862v1
- Date: Wed, 7 Apr 2021 02:19:41 GMT
- Title: Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification
- Authors: Xudong Tian, Zhizhong Zhang, Shaohui Lin, Yanyun Qu, Yuan Xie, Lizhuang Ma
- Abstract summary: The Information Bottleneck (IB) provides an information theoretic principle for representation learning.
We present a new strategy, Variational Self-Distillation (VSD), which provides a scalable, flexible and analytic solution.
We also introduce two other strategies, Variational Cross-Distillation (VCD) and Variational Mutual-Learning (VML).
- Score: 41.02729491273057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Information Bottleneck (IB) provides an information-theoretic principle
for representation learning: retain all information relevant for predicting the label
while minimizing redundancy. Though the IB principle has been applied to a wide range of
applications, its optimization remains a challenging problem that relies heavily on
accurate estimation of mutual information. In this paper, we present a new strategy,
Variational Self-Distillation (VSD), which provides a scalable, flexible, and analytic
solution that essentially fits the mutual information without explicitly estimating it.
Under a rigorous theoretical guarantee, VSD enables the IB to grasp the intrinsic
correlation between representation and label for supervised training. Furthermore, by
extending VSD to multi-view learning, we introduce two other strategies, Variational
Cross-Distillation (VCD) and Variational Mutual-Learning (VML), which significantly
improve the robustness of representations to view changes by eliminating view-specific
and task-irrelevant information. To verify our theoretically grounded strategies, we
apply our approaches to cross-modal person Re-ID and conduct extensive experiments,
where superior performance against state-of-the-art methods is demonstrated. Our
intriguing findings highlight the need to rethink the way to estimate mutual information.
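For context, the IB trade-off the abstract refers to is commonly written as the Lagrangian below. This is a minimal sketch using a standard textbook formulation (input x, label y, learned representation z, trade-off weight β), not the paper's exact objective; the abstract's point is that VSD optimizes this trade-off analytically, without an explicit mutual-information estimator.

\[
  \min_{p(z \mid x)} \; I(x; z) \;-\; \beta\, I(z; y), \qquad \beta > 0,
\]

where I(·;·) denotes mutual information: the first term compresses the representation and the second retains label-relevant information. In the same standard notation, a sufficient representation satisfies I(z; y) = I(x; y); this is the kind of intrinsic correlation between representation and label that VSD is claimed to preserve, and that VCD and VML extend to the multi-view (cross-modal) setting by discarding view-specific, task-irrelevant information.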
Related papers
- Constrained Multiview Representation for Self-supervised Contrastive Learning [4.817827522417457]
We introduce a novel approach predicated on representation distance-based mutual information (MI) for measuring the significance of different views.
We harness multi-view representations extracted from the frequency domain, re-evaluating their significance based on mutual information.
arXiv Detail & Related papers (2024-02-05T19:09:33Z)
- Elastic Information Bottleneck [34.90040361806197]
Information bottleneck is an information-theoretic principle of representation learning.
We propose an elastic information bottleneck (EIB) to interpolate between the IB and DIB regularizers.
Simulations and real-data experiments show that EIB can achieve better domain adaptation results than IB and DIB.
arXiv Detail & Related papers (2023-11-07T12:53:55Z)
- RényiCL: Contrastive Representation Learning with Skew Rényi Divergence [78.15455360335925]
We present a new robust contrastive learning scheme, coined RényiCL, which can effectively manage harder augmentations.
Our method is built upon the variational lower bound of Rényi divergence.
We show that Rényi contrastive learning objectives perform innate hard negative sampling and easy positive sampling simultaneously.
arXiv Detail & Related papers (2022-08-12T13:37:05Z)
- Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
Under a rigorous theoretical guarantee, our approach enables IB to grasp the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
- Multi-Modal Mutual Information Maximization: A Novel Approach for Unsupervised Deep Cross-Modal Hashing [73.29587731448345]
We propose a novel method, dubbed Cross-Modal Info-Max Hashing (CMIMH).
We learn informative representations that can preserve both intra- and inter-modal similarities.
The proposed method consistently outperforms other state-of-the-art cross-modal retrieval methods.
arXiv Detail & Related papers (2021-12-13T08:58:03Z)
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control? [80.2534918595143]
Mutual information provides an appealing formalism for learning representations of data.
This paper formalizes the sufficiency of a state representation for learning and representing the optimal policy.
Surprisingly, we find that two of these objectives can yield insufficient representations given mild and common assumptions on the structure of the MDP.
arXiv Detail & Related papers (2021-06-14T10:12:34Z)
- Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification [208.1227090864602]
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)