Self-Supervised Representation Learning with Meta Comprehensive
Regularization
- URL: http://arxiv.org/abs/2403.01549v1
- Date: Sun, 3 Mar 2024 15:53:48 GMT
- Title: Self-Supervised Representation Learning with Meta Comprehensive
Regularization
- Authors: Huijie Guo, Ying Ba, Jie Hu, Lingyu Si, Wenwen Qiang, Lei Shi
- Abstract summary: We introduce a module called CompMod with Meta Comprehensive Regularization (MCR), embedded into existing self-supervised frameworks.
We update our proposed model through a bi-level optimization mechanism, enabling it to capture comprehensive features.
We provide theoretical support for our proposed method from information-theoretic and causal counterfactual perspectives.
- Score: 11.387994024747842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-Supervised Learning (SSL) methods harness the concept of semantic
invariance by utilizing data augmentation strategies to produce similar
representations for different deformations of the same input. Essentially, the
model captures the shared information among multiple augmented views of
samples, while disregarding the non-shared information that may be beneficial
for downstream tasks. To address this issue, we introduce a module called
CompMod with Meta Comprehensive Regularization (MCR), embedded into existing
self-supervised frameworks, to make the learned representations more
comprehensive. Specifically, we update our proposed model through a bi-level
optimization mechanism, enabling it to capture comprehensive features.
Additionally, guided by the constrained extraction of features using maximum
entropy coding, the self-supervised learning model learns more comprehensive
features on top of learning consistent features. In addition, we provide
theoretical support for our proposed method from information-theoretic and causal
counterfactual perspectives. Experimental results show that our method achieves
significant improvement in classification, object detection and instance
segmentation tasks on multiple benchmark datasets.
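The abstract does not spell out the maximum entropy coding term. As a rough illustration only, the Python sketch below computes the log-determinant coding-length estimate that is commonly used as an entropy surrogate for a batch of representations. The function name `coding_length`, the distortion parameter `eps`, and the row-wise L2 normalisation are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def coding_length(z, eps=0.5):
    """Log-det coding-length estimate for a batch of representations.

    z   : array of shape (m, d), one row per sample (hypothetical input).
    eps : allowed distortion; the default value is an assumption.
    """
    m, d = z.shape
    # Project every representation onto the unit sphere.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    scale = d / (m * eps ** 2)
    # d x d matrix I + scale * Z^T Z; slogdet is numerically stable.
    _, logdet = np.linalg.slogdet(np.eye(d) + scale * (z.T @ z))
    return 0.5 * (m + d) * logdet


rng = np.random.default_rng(0)
spread = rng.normal(size=(64, 8))          # varied representations
collapsed = np.tile(spread[:1], (64, 1))   # every sample mapped to one point
print(coding_length(spread) > coding_length(collapsed))
```

Under this estimate, a batch whose representations collapse to a single direction gets a smaller value than a batch that spreads over many directions, which is the sense in which maximising such a term can encourage more comprehensive features.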
Related papers
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- R-Cut: Enhancing Explainability in Vision Transformers with Relationship Weighted Out and Cut [14.382326829600283]
We introduce two modules: the "Relationship Weighted Out" and the "Cut" modules.
The "Cut" module performs fine-grained feature decomposition, taking into account factors such as position, texture, and color.
We validate our method with extensive qualitative and quantitative experiments on the ImageNet dataset.
arXiv Detail & Related papers (2023-07-18T08:03:51Z)
- RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation [53.4319652364256]
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation.
Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight Cross-Modal MLP.
We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z)
- Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z)
- Mutual Information Regularization for Weakly-supervised RGB-D Salient Object Detection [33.210575826086654]
We present a weakly-supervised RGB-D salient object detection model via scribble supervision.
We focus on effective multimodal representation learning via inter-modal mutual information regularization.
arXiv Detail & Related papers (2023-06-06T12:36:57Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Multi-view Information Bottleneck Without Variational Approximation [34.877573432746246]
We extend the information bottleneck principle to a supervised multi-view learning scenario.
We use the recently proposed matrix-based Rényi's $\alpha$-order entropy functional to optimize the resulting objective.
Empirical results in both synthetic and real-world datasets suggest that our method enjoys improved robustness to noise and redundant information in each view.
arXiv Detail & Related papers (2022-04-22T06:48:04Z)
- CAD: Co-Adapting Discriminative Features for Improved Few-Shot Classification [11.894289991529496]
Few-shot classification is a challenging problem that aims to learn a model that can adapt to unseen classes given a few labeled samples.
Recent approaches pre-train a feature extractor, and then fine-tune for episodic meta-learning.
We propose a strategy to cross-attend and re-weight discriminative features for few-shot classification.
arXiv Detail & Related papers (2022-03-25T06:14:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.