Non-Parametric Representation Learning with Kernels
- URL: http://arxiv.org/abs/2309.02028v1
- Date: Tue, 5 Sep 2023 08:14:25 GMT
- Title: Non-Parametric Representation Learning with Kernels
- Authors: Pascal Esser, Maximilian Fleissner, Debarghya Ghoshdastidar
- Abstract summary: We introduce and analyze several kernel-based representation learning approaches.
We argue that the classical representer theorems for supervised kernel machines are not always applicable for (self-supervised) representation learning.
We empirically evaluate the performance of these methods in both small data regimes as well as in comparison with neural network based models.
- Score: 6.944372188747803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised and self-supervised representation learning has become popular
in recent years for learning useful features from unlabelled data.
Representation learning has been mostly developed in the neural network
literature, and other models for representation learning are surprisingly
unexplored. In this work, we introduce and analyze several kernel-based
representation learning approaches: Firstly, we define two kernel
Self-Supervised Learning (SSL) models using contrastive loss functions and
secondly, a Kernel Autoencoder (AE) model based on the idea of embedding and
reconstructing data. We argue that the classical representer theorems for
supervised kernel machines are not always applicable for (self-supervised)
representation learning, and present new representer theorems, which show that
the representations learned by our kernel models can be expressed in terms of
kernel matrices. We further derive generalisation error bounds for
representation learning with kernel SSL and AE, and empirically evaluate the
performance of these methods in both small data regimes as well as in
comparison with neural network based models.
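The representer-theorem result described in the abstract has a concrete computational reading: each coordinate of the learned representation can be written as a linear combination of kernel evaluations against the training points, f(x) = sum_i alpha_i k(x_i, x), so learning and inference only require kernel matrices. The sketch below (in JAX) illustrates this form for the contrastive SSL case; the RBF kernel, the InfoNCE-style loss, and all function and variable names are illustrative assumptions rather than the authors' exact objective, and a Kernel AE variant would analogously parameterize both the embedding and the reconstruction maps by kernel expansions.

```python
# Minimal, hypothetical sketch of kernel self-supervised representation learning:
# embeddings take the representer form Z = K @ alpha, and only the coefficients
# alpha are optimized. Loss and kernel choices here are assumptions for illustration.
import jax
import jax.numpy as jnp

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and Y."""
    sq = jnp.sum(X**2, 1)[:, None] + jnp.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return jnp.exp(-gamma * sq)

def contrastive_loss(alpha, K, K_pos, temp=0.5):
    """InfoNCE-style loss on embeddings Z = K^T alpha and their positive views."""
    Z = K.T @ alpha                      # (n, dim) anchor embeddings
    Zp = K_pos.T @ alpha                 # (n, dim) embeddings of augmented views
    Z = Z / jnp.linalg.norm(Z, axis=1, keepdims=True)
    Zp = Zp / jnp.linalg.norm(Zp, axis=1, keepdims=True)
    logits = Z @ Zp.T / temp             # (n, n); diagonal entries are positive pairs
    logp = jax.nn.log_softmax(logits, axis=1)
    idx = jnp.arange(K.shape[0])
    return -jnp.mean(logp[idx, idx])

def fit_kernel_ssl(X, X_aug, dim=16, steps=300, lr=0.1, seed=0):
    """Learn coefficients alpha so that f(x) = alpha^T k(X, x) aligns augmented views."""
    K = rbf_kernel(X, X)                 # kernel matrix over the training points
    K_pos = rbf_kernel(X, X_aug)         # kernel between anchors and their positives
    key = jax.random.PRNGKey(seed)
    alpha = 0.1 * jax.random.normal(key, (X.shape[0], dim))
    grad_fn = jax.jit(jax.grad(contrastive_loss))
    for _ in range(steps):
        alpha = alpha - lr * grad_fn(alpha, K, K_pos)
    return alpha

# Embedding new points needs only kernel evaluations against the training set:
#   Z_new = rbf_kernel(X, X_new).T @ alpha
```

Note that, as in the paper's representer theorems, the learned representation is fully determined by kernel matrices: nothing beyond k(x_i, x) is evaluated at training or test time.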
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Specify Robust Causal Representation from Mixed Observations [35.387451486213344]
Learning representations purely from observations concerns the problem of learning a low-dimensional, compact representation which is beneficial to prediction models.
We develop a learning method to learn such representation from observational data by regularizing the learning procedure with mutual information measures.
We theoretically and empirically show that the models trained with the learned causal representations are more robust under adversarial attacks and distribution shifts.
arXiv Detail & Related papers (2023-10-21T02:18:35Z)
- An Exact Kernel Equivalence for Finite Classification Models [1.4777718769290527]
We compare our exact representation to the well-known Neural Tangent Kernel (NTK) and discuss approximation error relative to the NTK.
We use this exact kernel to show that our theoretical contribution can provide useful insights into the predictions made by neural networks.
arXiv Detail & Related papers (2023-08-01T20:22:53Z)
- Joint Embedding Self-Supervised Learning in the Kernel Regime [21.80241600638596]
Self-supervised learning (SSL) produces useful representations of data without access to any labels for classifying the data.
We extend this framework to incorporate algorithms based on kernel methods where embeddings are constructed by linear maps acting on the feature space of a kernel.
We analyze our kernel model on small datasets to identify common features of self-supervised learning algorithms and gain theoretical insights into their performance on downstream tasks.
arXiv Detail & Related papers (2022-09-29T15:53:19Z)
- Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval [51.53892300802014]
We show that supervised neural information retrieval models are prone to learning sparse attention patterns over passage tokens.
Using a novel targeted synthetic data generation method, we teach neural IR to attend more uniformly and robustly to all entities in a given passage.
arXiv Detail & Related papers (2022-04-24T22:36:48Z)
- Multiple Kernel Representation Learning on Networks [12.106994960669924]
We propose a weighted matrix factorization model that encodes random walk-based information about nodes of the network.
We extend the approach with a multiple kernel learning formulation that provides the flexibility of learning the kernel as the linear combination of a dictionary of kernels.
arXiv Detail & Related papers (2021-06-09T13:22:26Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- Self-organizing Democratized Learning: Towards Large-scale Distributed Learning Systems [71.14339738190202]
Democratized learning (Dem-AI) lays out a holistic philosophy with underlying principles for building large-scale distributed and democratized machine learning systems.
Inspired by Dem-AI philosophy, a novel distributed learning approach is proposed in this paper.
The proposed algorithms achieve better generalization performance of the learning models in agents than conventional federated learning (FL) algorithms.
arXiv Detail & Related papers (2020-07-07T08:34:48Z)
- Brain-like approaches to unsupervised learning of hidden representations -- a comparative study [0.0]
We study the brain-like Bayesian Confidence Propagating Neural Network (BCPNN) model, recently extended to extract sparse distributed high-dimensional representations.
The usefulness and class-dependent separability of the hidden representations when trained on the MNIST and Fashion-MNIST datasets are studied.
arXiv Detail & Related papers (2020-05-06T11:20:21Z)
- Embedding Graph Auto-Encoder for Graph Clustering [90.8576971748142]
Graph auto-encoder (GAE) models are based on semi-supervised graph convolutional networks (GCNs).
We design a specific GAE-based model for graph clustering that is consistent with the theory, namely the Embedding Graph Auto-Encoder (EGAE).
EGAE consists of one encoder and dual decoders.
arXiv Detail & Related papers (2020-02-20T09:53:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.