Functional Regularization for Representation Learning: A Unified
Theoretical Perspective
- URL: http://arxiv.org/abs/2008.02447v3
- Date: Thu, 22 Oct 2020 00:54:57 GMT
- Title: Functional Regularization for Representation Learning: A Unified
Theoretical Perspective
- Authors: Siddhant Garg, Yingyu Liang
- Abstract summary: Unsupervised and self-supervised learning approaches have become a crucial tool to learn representations for downstream prediction tasks.
We present a unifying perspective where several such approaches can be viewed as imposing a regularization on the representation via a learnable function using unlabeled data.
We propose a discriminative theoretical framework for analyzing the sample complexity of these approaches, which generalizes the framework of (Balcan and Blum, 2010) to allow learnable regularization functions.
- Score: 27.93916012334704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unsupervised and self-supervised learning approaches have become a crucial
tool to learn representations for downstream prediction tasks. While these
approaches are widely used in practice and achieve impressive empirical gains,
their theoretical understanding largely lags behind. Towards bridging this gap,
we present a unifying perspective where several such approaches can be viewed
as imposing a regularization on the representation via a learnable function
using unlabeled data. We propose a discriminative theoretical framework for
analyzing the sample complexity of these approaches, which generalizes the
framework of (Balcan and Blum, 2010) to allow learnable regularization
functions. Our sample complexity bounds show that, with carefully chosen
hypothesis classes to exploit the structure in the data, these learnable
regularization functions can prune the hypothesis space, and help reduce the
amount of labeled data needed. We then provide two concrete examples of
functional regularization, one using auto-encoders and the other using masked
self-supervision, and apply our framework to quantify the reduction in the
sample complexity bound of labeled data. We also provide complementary
empirical results to support our analysis.
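As a rough illustration of the functional regularization idea described in the abstract, the sketch below combines a supervised loss on labeled data with an auto-encoder reconstruction penalty computed on unlabeled data, where the decoder plays the role of the learnable regularization function. This is a minimal sketch under assumed names (Encoder, Decoder, functional_reg_loss, lam, etc.), not the authors' implementation.

```python
# Minimal sketch of functional regularization with an auto-encoder (illustrative only).
# The encoder phi is trained for the downstream task on labeled data, while a decoder
# fit on unlabeled data imposes a reconstruction penalty on the representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):          # representation phi: x -> z
    def __init__(self, d_in, d_rep):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d_rep))
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):          # learnable regularization function: z -> reconstructed x
    def __init__(self, d_rep, d_in):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_rep, 128), nn.ReLU(), nn.Linear(128, d_in))
    def forward(self, z):
        return self.net(z)

def functional_reg_loss(encoder, decoder, classifier, x_lab, y_lab, x_unlab, lam=1.0):
    """Supervised loss on labeled data plus a reconstruction regularizer on unlabeled data."""
    sup = F.cross_entropy(classifier(encoder(x_lab)), y_lab)   # downstream prediction loss
    rec = F.mse_loss(decoder(encoder(x_unlab)), x_unlab)       # functional regularization term
    return sup + lam * rec

# Usage sketch: jointly optimize encoder, decoder, and a linear head.
d_in, d_rep, n_classes = 784, 32, 10
encoder, decoder = Encoder(d_in, d_rep), Decoder(d_rep, d_in)
classifier = nn.Linear(d_rep, n_classes)
opt = torch.optim.Adam(
    [*encoder.parameters(), *decoder.parameters(), *classifier.parameters()], lr=1e-3)

x_lab, y_lab = torch.randn(16, d_in), torch.randint(0, n_classes, (16,))
x_unlab = torch.randn(64, d_in)                                 # plentiful unlabeled data
loss = functional_reg_loss(encoder, decoder, classifier, x_lab, y_lab, x_unlab)
opt.zero_grad(); loss.backward(); opt.step()
```

Replacing the reconstruction term with a prediction loss on randomly masked input coordinates would correspond to the paper's second concrete example, masked self-supervision.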
Related papers
- Disentangled Representation Learning with Transmitted Information Bottleneck [57.22757813140418]
We present DisTIB (Transmitted Information Bottleneck for Disentangled representation learning), a novel objective that navigates the balance between information compression and preservation.
arXiv Detail & Related papers (2023-11-03T03:18:40Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
arXiv Detail & Related papers (2022-11-26T21:02:09Z)
- Spectral Regularization Allows Data-frugal Learning over Combinatorial Spaces [13.36217184117654]
We show that regularizing the spectral representation of machine learning models improves their generalization power when labeled data is scarce.
Running gradient descent on the regularized loss yields better generalization performance than baseline algorithms on several data-scarce real-world problems.
arXiv Detail & Related papers (2022-10-05T23:31:54Z)
- Self-Supervised Consistent Quantization for Fully Unsupervised Image Retrieval [17.422973861218182]
Unsupervised image retrieval aims to learn an efficient retrieval system without expensive data annotations.
Recent work proposes deep fully unsupervised image retrieval, which trains a deep model from scratch to jointly optimize visual features and quantization codes.
We propose a novel self-supervised consistent quantization approach to deep fully unsupervised image retrieval, which consists of part consistent quantization and global consistent quantization.
arXiv Detail & Related papers (2022-06-20T14:39:59Z)
- Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning.
Under rigorous theoretical guarantees, our approach enables IB to grasp the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
- Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, based on which we deduce a theoretical guarantee that the causality-inspired learning is with reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z)
- Towards the Generalization of Contrastive Self-Supervised Learning [11.889992921445849]
We present a theoretical explanation of how contrastive self-supervised pre-trained models generalize to downstream tasks.
We further explore SimCLR and Barlow Twins, which are two canonical contrastive self-supervised methods.
arXiv Detail & Related papers (2021-11-01T07:39:38Z)
- Fair Representation Learning using Interpolation Enabled Disentanglement [9.043741281011304]
We propose a novel method to address two key issues: (a) Can we simultaneously learn fair disentangled representations while ensuring the utility of the learned representation for downstream tasks, and (b) Can we provide theoretical insights into when the proposed approach will be both fair and accurate?
To address the former, we propose the method FRIED, Fair Representation learning using Interpolation Enabled Disentanglement.
arXiv Detail & Related papers (2021-07-31T17:32:12Z)
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control? [80.2534918595143]
Mutual information provides an appealing formalism for learning representations of data.
This paper formalizes the sufficiency of a state representation for learning and representing the optimal policy.
Surprisingly, we find that two of these objectives can yield insufficient representations given mild and common assumptions on the structure of the MDP.
arXiv Detail & Related papers (2021-06-14T10:12:34Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.