Rethinking Minimal Sufficient Representation in Contrastive Learning
- URL: http://arxiv.org/abs/2203.07004v1
- Date: Mon, 14 Mar 2022 11:17:48 GMT
- Title: Rethinking Minimal Sufficient Representation in Contrastive Learning
- Authors: Haoqing Wang, Xun Guo, Zhi-Hong Deng, Yan Lu
- Abstract summary: We show that contrastive learning models risk over-fitting to the shared information between views.
We propose increasing the mutual information between the representation and the input as regularization, to approximately introduce more task-relevant information.
This regularization significantly improves the performance of several classic contrastive learning models on downstream tasks.
- Score: 28.83450836832452
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning between different views of the data has achieved
outstanding success in self-supervised representation learning, and the learned
representations are useful for a broad range of downstream tasks. Since all
supervision information for one view comes from the other view, contrastive
learning approximately obtains the minimal sufficient representation, which
contains the shared information between views and eliminates the non-shared
information. Given the diversity of downstream tasks, it cannot be guaranteed
that all task-relevant information is shared between views. We therefore assume
that the non-shared task-relevant information cannot be ignored and
theoretically prove that the minimal sufficient representation in contrastive
learning is not sufficient for the downstream tasks, which causes performance
degradation. This reveals a new problem: contrastive learning models risk
over-fitting to the shared information between views. To alleviate this
problem, we propose to increase the mutual information between the
representation and the input as regularization, which approximately introduces
more task-relevant information, since we cannot utilize any downstream task
information during training. Extensive experiments verify the rationality of
our analysis and the effectiveness of our method, which significantly improves
the performance of several classic contrastive learning models on downstream
tasks.
Our code is available at \url{https://github.com/Haoqing-Wang/InfoCL}.
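For readers who want the information-theoretic statement behind the sufficiency argument, a compact formulation is sketched below. It is a paraphrase under standard conventions, not a verbatim copy of the paper's definitions: v_1, v_2 are the two views, z_1 = f(v_1) a learned representation, and T a downstream task.

```latex
% Sufficiency: z_1 keeps everything v_1 knows about the other view v_2.
I(z_1; v_2) = I(v_1; v_2)

% Minimal sufficiency: among sufficient representations, keep the least
% information about v_1 itself, i.e., discard all non-shared information.
z_1^{\mathrm{min}} = \operatorname*{arg\,min}_{z_1 \,:\, I(z_1; v_2) = I(v_1; v_2)} I(z_1; v_1)

% Task-relevant information in v_1 splits into a view-shared part and a
% non-shared part; the minimal sufficient representation can only keep the
% shared part, hence the potential insufficiency for downstream task T.
I(v_1; T) = \underbrace{I(v_1; T; v_2)}_{\text{shared}}
          + \underbrace{I(v_1; T \mid v_2)}_{\text{non-shared}}
```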
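As a rough illustration of the proposed regularization, the sketch below augments a SimCLR-style InfoNCE loss with an input-reconstruction term as a simple proxy for increasing the mutual information I(z; x) between the representation and the input. The encoder/decoder modules, the weight `lam`, and the use of MSE reconstruction as the mutual-information surrogate are illustrative assumptions, not the paper's exact InfoCL implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InfoRegContrastive(nn.Module):
    """SimCLR-style contrastive loss plus a reconstruction regularizer.

    The reconstruction term is used here as a simple surrogate for increasing
    I(z; x); the exact regularizer in the paper may differ.
    """

    def __init__(self, encoder: nn.Module, decoder: nn.Module,
                 temperature: float = 0.5, lam: float = 0.1):
        super().__init__()
        self.encoder = encoder        # maps an input batch x to representations z
        self.decoder = decoder        # maps z back to the input space (illustrative)
        self.temperature = temperature
        self.lam = lam                # weight of the mutual-information regularizer

    def info_nce(self, z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
        # Standard NT-Xent over a batch of positive pairs (z1[i], z2[i]).
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        logits = z1 @ z2.t() / self.temperature           # (N, N) cosine similarities
        labels = torch.arange(z1.size(0), device=z1.device)
        return 0.5 * (F.cross_entropy(logits, labels)
                      + F.cross_entropy(logits.t(), labels))

    def forward(self, x1: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        # x1, x2: two augmented views of the same batch of inputs.
        z1, z2 = self.encoder(x1), self.encoder(x2)
        contrastive = self.info_nce(z1, z2)
        # A representation that can reconstruct its input must retain more
        # information about it, including non-shared (view-specific) content.
        recon = F.mse_loss(self.decoder(z1), x1) + F.mse_loss(self.decoder(z2), x2)
        return contrastive + self.lam * recon
```

In this sketch the decoder only serves to force z to retain information about x during pre-training and can be discarded afterwards; the reported results should be reproduced from the authors' repository rather than from this placeholder implementation.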
Related papers
- Multimodal Information Bottleneck for Deep Reinforcement Learning with Multiple Sensors [10.454194186065195]
Reinforcement learning has achieved promising results on robotic control tasks but struggles to leverage information effectively.
Recent works construct auxiliary losses based on reconstruction or mutual information to extract joint representations from multiple sensory inputs.
We argue that compressing the information about the raw multimodal observations that is retained in the learned joint representations is helpful.
arXiv Detail & Related papers (2024-10-23T04:32:37Z)
- Leveraging Superfluous Information in Contrastive Representation Learning [0.0]
We show that superfluous information does exist in the conventional contrastive learning framework.
We design a new objective, namely SuperInfo, to learn robust representations by a linear combination of both predictive and superfluous information.
We demonstrate that learning with our loss can often outperform the traditional contrastive learning approaches on image classification, object detection and instance segmentation tasks.
arXiv Detail & Related papers (2024-08-19T16:21:08Z)
- MVEB: Self-Supervised Learning with Multi-View Entropy Bottleneck [53.44358636312935]
Self-supervised approaches regard two views of an image as both the input and the self-supervised signals.
Recent studies show that discarding superfluous information not shared between the views can improve generalization.
We propose a new objective, the multi-view entropy bottleneck (MVEB), to learn the minimal sufficient representation effectively.
arXiv Detail & Related papers (2024-03-28T00:50:02Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task and then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks that have few, or even non-overlapping, annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Factorized Contrastive Learning: Going Beyond Multi-view Redundancy [116.25342513407173]
This paper proposes FactorCL, a new multimodal representation learning method to go beyond multi-view redundancy.
On large-scale real-world datasets, FactorCL captures both shared and unique information and achieves state-of-the-art results.
arXiv Detail & Related papers (2023-06-08T15:17:04Z)
- Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real-world distribution shift benchmarks and on different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z)
- An Information Minimization Based Contrastive Learning Model for Unsupervised Sentence Embeddings Learning [19.270283247740664]
We present an information minimization based contrastive learning (InforMin-CL) model for unsupervised sentence representation learning.
We find that information minimization can be achieved by simple contrast and reconstruction objectives.
arXiv Detail & Related papers (2022-09-22T12:07:35Z)
- Conditional Contrastive Learning: Removing Undesirable Information in Self-Supervised Representations [108.29288034509305]
We develop conditional contrastive learning to remove undesirable information in self-supervised representations.
We demonstrate empirically that our methods can successfully learn self-supervised representations for downstream tasks.
arXiv Detail & Related papers (2021-06-05T10:51:26Z)