Understanding Augmentation-based Self-Supervised Representation Learning
via RKHS Approximation and Regression
- URL: http://arxiv.org/abs/2306.00788v3
- Date: Thu, 18 Jan 2024 16:00:30 GMT
- Title: Understanding Augmentation-based Self-Supervised Representation Learning
via RKHS Approximation and Regression
- Authors: Runtian Zhai, Bingbin Liu, Andrej Risteski, Zico Kolter, Pradeep
Ravikumar
- Abstract summary: Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
- Score: 53.15502562048627
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data augmentation is critical to the empirical success of modern
self-supervised representation learning, such as contrastive learning and
masked language modeling. However, a theoretical understanding of the exact
role of augmentation remains limited. Recent work has built the connection
between self-supervised learning and the approximation of the top eigenspace of
a graph Laplacian operator, suggesting that learning a linear probe atop such a
representation can be connected to RKHS regression. Building on this insight,
this work delves into a statistical analysis of augmentation-based pretraining.
Starting from the isometry property, a geometric characterization of the target
function given by the augmentation, we disentangle the effects of the model and
the augmentation, and prove two generalization bounds that are free of model
complexity. Our first bound works for an arbitrary encoder, where the
prediction error is decomposed as the sum of an estimation error incurred by
fitting a linear probe with RKHS regression, and an approximation error
entailed by RKHS approximation. Our second bound specifically addresses the
case where the encoder is near-optimal, that is, it approximates the top-d
eigenspace of the RKHS induced by the augmentation. A key ingredient in our
analysis is the augmentation complexity, which we use to quantitatively compare
different augmentations and analyze their impact on downstream performance.
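To make the decomposition above concrete, here is a minimal numerical sketch. Everything in it (the toy Gaussian "augmentation", the Gaussian similarity averaged over views standing in for the augmentation-induced kernel, the synthetic labels, the dimensions) is an assumption for illustration, not the paper's construction: it takes the top-d eigenspace of an empirical augmentation kernel as a stand-in for a near-optimal encoder, then fits a linear probe by ridge (RKHS) regression, the two components whose errors the bounds separate.

```python
# Illustrative sketch (assumed setup, not the paper's estimator): approximate
# the top-d eigenspace of an augmentation-induced kernel, then fit a linear
# probe on top of it with ridge (RKHS) regression.
import numpy as np

rng = np.random.default_rng(0)

def augment(x, n_views=5, noise=0.1):
    # Toy "augmentation": Gaussian jitter around x (a stand-in for crops, masks, etc.).
    return x + noise * rng.standard_normal((n_views, x.shape[-1]))

# Unlabeled data and an empirical augmentation kernel: K[i, j] is large when the
# augmentation distributions of x_i and x_j overlap (mean-embedding Gram matrix).
X = rng.standard_normal((100, 5))
views = np.stack([augment(x) for x in X])                      # (n, n_views, dim)
diffs = views[:, None, :, None, :] - views[None, :, None, :, :]
K = np.exp(-np.sum(diffs ** 2, axis=-1) / (2 * 0.5 ** 2)).mean(axis=(2, 3))

# "Near-optimal encoder": the top-d eigenspace of the augmentation kernel.
d = 8
eigvals, eigvecs = np.linalg.eigh(K)                           # ascending order
features = eigvecs[:, -d:] * np.sqrt(np.maximum(eigvals[-d:], 0))

# Downstream task: a linear probe fit by ridge regression on the learned features.
y = X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(100)  # synthetic labels
lam = 1e-2
W = np.linalg.solve(features.T @ features + lam * np.eye(d), features.T @ y)
print("train MSE of linear probe:", np.mean((features @ W - y) ** 2))
```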
Related papers
- Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z)
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z)
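A self-contained toy illustration of the interpolation idea in the entry above (my example, not the paper's scheme or experiments): on the bilinear game min_x max_y x*y, plain gradient descent-ascent spirals away from the equilibrium, while periodically interpolating a slow copy of the iterate toward the fast one contracts back to it.

```python
# Toy illustration (assumed setup): plain gradient descent-ascent (GDA) on the
# bilinear game f(x, y) = x * y diverges from the origin, while a periodic
# linear interpolation step toward a slow copy of the iterate converges.
import numpy as np

def gda_step(z, lr=0.1):
    x, y = z
    return np.array([x - lr * y, y + lr * x])  # descend in x, ascend in y

z_plain = np.array([1.0, 1.0])
for _ in range(300):
    z_plain = gda_step(z_plain)

k, alpha = 30, 0.5                 # inner GDA steps per round, interpolation weight
z_slow = np.array([1.0, 1.0])
for _ in range(10):
    z_fast = z_slow.copy()
    for _ in range(k):
        z_fast = gda_step(z_fast)
    z_slow = z_slow + alpha * (z_fast - z_slow)   # linear interpolation step

print("plain GDA distance from equilibrium:       ", np.linalg.norm(z_plain))
print("interpolated GDA distance from equilibrium:", np.linalg.norm(z_slow))
```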
- Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL [86.0987896274354]
We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.
We then propose a novel Self-Excite Eigenvalue Measure (SEEM) metric to measure the evolving property of the Q-network during training.
For the first time, our theory can reliably decide whether the training will diverge at an early stage.
arXiv Detail & Related papers (2023-10-06T17:57:44Z)
- Koopman Kernel Regression [6.116741319526748]
We show that Koopman operator theory offers a beneficial paradigm for characterizing forecasts via linear time-invariant (LTI) ODEs.
We derive a universal Koopman-invariant reproducing kernel Hilbert space (RKHS) that solely spans transformations into LTI dynamical systems.
Our experiments demonstrate superior forecasting performance compared to Koopman operator and sequential data predictors.
arXiv Detail & Related papers (2023-05-25T16:22:22Z)
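As a rough stand-in for the idea in the Koopman Kernel Regression entry above (this uses a plain Gaussian kernel rather than the Koopman-invariant RKHS the paper derives, and the dynamical system and hyperparameters are assumptions for illustration): learn a one-step forecast map by kernel ridge regression, i.e. regression in an RKHS, then roll it out for multi-step prediction.

```python
# Illustrative stand-in (assumed setup, not the paper's construction): one-step
# forecasting of a nonlinear system via kernel ridge regression with an RBF kernel.
import numpy as np

def dynamics(z, dt=0.1):
    # Damped pendulum: angle theta, angular velocity omega.
    theta, omega = z
    return np.array([theta + dt * omega,
                     omega + dt * (-np.sin(theta) - 0.1 * omega)])

rng = np.random.default_rng(0)

# Training pairs (z_t, z_{t+1}) sampled from the state space.
Z = rng.uniform(-2, 2, size=(300, 2))
Znext = np.array([dynamics(z) for z in Z])

def rbf(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

lam = 1e-6
K = rbf(Z, Z)
alpha = np.linalg.solve(K + lam * np.eye(len(Z)), Znext)   # kernel ridge weights

def predict(z):
    return (rbf(z[None, :], Z) @ alpha)[0]

# Multi-step forecast from a held-out initial condition vs. the true trajectory.
z_true = z_pred = np.array([1.0, 0.0])
for _ in range(50):
    z_true = dynamics(z_true)
    z_pred = predict(z_pred)
print("forecast error after 50 steps:", np.linalg.norm(z_true - z_pred))
```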
- Temporal Difference Learning with Compressed Updates: Error-Feedback meets Reinforcement Learning [47.904127007515925]
We study a variant of the classical temporal difference (TD) learning algorithm with a perturbed update direction.
We prove that compressed TD algorithms, coupled with an error-feedback mechanism used widely in optimization, exhibit the same non-asymptotic approximation guarantees as their counterparts.
Notably, these are the first finite-time results in RL that account for general compression operators and error-feedback in tandem with linear function approximation and Markovian sampling.
arXiv Detail & Related papers (2023-01-03T04:09:38Z)
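A minimal sketch of compressed TD with error feedback as described in the entry above (the random Markov reward process, top-k compressor, and step sizes are assumptions for illustration, not the paper's exact algorithm or guarantees): each TD(0) update direction under linear function approximation is sparsified, and the discarded residual is stored in a buffer and added back into the next update.

```python
# Minimal sketch (assumed environment and compressor): TD(0) with linear function
# approximation, top-k compression of the update direction, and error feedback.
import numpy as np

rng = np.random.default_rng(0)

n_states, k_features, gamma, lr = 20, 8, 0.9, 0.05
phi = rng.standard_normal((n_states, k_features)) / np.sqrt(k_features)  # state features
P = rng.dirichlet(np.ones(n_states), size=n_states)                      # random transition matrix
r = rng.uniform(size=n_states)                                           # rewards

def top_k(v, k=2):
    # Keep only the k largest-magnitude entries (a simple compression operator).
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

w = np.zeros(k_features)
err = np.zeros(k_features)       # error-feedback buffer
s = 0
for _ in range(20000):
    s_next = rng.choice(n_states, p=P[s])
    td = r[s] + gamma * phi[s_next] @ w - phi[s] @ w   # TD error
    g = lr * td * phi[s] + err                         # add back past residual
    g_hat = top_k(g)                                   # compressed direction
    err = g - g_hat                                    # store what was dropped
    w = w + g_hat
    s = s_next

# Diagnostic: compare against the true value function of the Markov reward process
# (residual error also reflects the limits of linear function approximation).
v_true = np.linalg.solve(np.eye(n_states) - gamma * P, r)
print("RMS value error:", np.sqrt(np.mean((phi @ w - v_true) ** 2)))
```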
- ER: Equivariance Regularizer for Knowledge Graph Completion [107.51609402963072]
We propose a new regularizer, the Equivariance Regularizer (ER).
ER can enhance the generalization ability of the model by employing the semantic equivariance between the head and tail entities.
The experimental results indicate a clear and substantial improvement over the state-of-the-art relation prediction methods.
arXiv Detail & Related papers (2022-06-24T08:18:05Z)
- Parameterized Hypercomplex Graph Neural Networks for Graph Classification [1.1852406625172216]
We develop graph neural networks that leverage the properties of hypercomplex feature transformation.
In particular, in our proposed class of models, the multiplication rule specifying the algebra itself is inferred from the data during training.
We test our proposed hypercomplex GNN on several open graph benchmark datasets and show that our models reach state-of-the-art performance.
arXiv Detail & Related papers (2021-03-30T18:01:06Z)
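A small sketch of the idea behind a learned multiplication rule in the entry above (my illustration, not the paper's graph neural network): a parameterized hypercomplex linear map assembles its weight matrix as a sum of Kronecker products W = sum_i kron(A_i, S_i), where the small matrices A_i play the role of the algebra's multiplication rule and would be trained from data rather than fixed as in, say, quaternion networks.

```python
# Sketch of a parameterized hypercomplex (PHM-style) linear map with a learnable
# algebra; names and dimensions here are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

n = 4                    # dimension of the hypercomplex algebra (4 = quaternion-like)
d_in, d_out = 8, 12      # both must be divisible by n

A = rng.standard_normal((n, n, n))                    # algebra / "multiplication rule" (learned)
S = rng.standard_normal((n, d_out // n, d_in // n))   # parameter blocks (learned)

def phm_linear(x):
    # W is (d_out, d_in) but is parameterized by far fewer free weights.
    W = sum(np.kron(A[i], S[i]) for i in range(n))
    return W @ x

x = rng.standard_normal(d_in)
print("output shape:", phm_linear(x).shape)   # (12,)
```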
- A Locally Adaptive Interpretable Regression [7.4267694612331905]
Linear regression is one of the most interpretable prediction models.
In this work, we introduce a locally adaptive interpretable regression (LoAIR).
Our model achieves comparable or better predictive performance than the other state-of-the-art baselines.
arXiv Detail & Related papers (2020-05-07T09:26:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.