End-to-end Kernel Learning via Generative Random Fourier Features
- URL: http://arxiv.org/abs/2009.04614v5
- Date: Tue, 16 Jan 2024 02:54:58 GMT
- Title: End-to-end Kernel Learning via Generative Random Fourier Features
- Authors: Kun Fang, Fanghui Liu, Xiaolin Huang and Jie Yang
- Abstract summary: Random Fourier features (RFFs) provide a promising way for kernel learning from a spectral perspective.
In this paper, we consider a one-stage process that incorporates the kernel learning and the linear learner into a unifying framework.
- Score: 31.57596752889935
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Random Fourier features (RFFs) provide a promising way for kernel learning
from a spectral perspective. Current RFFs-based kernel learning methods usually work in
two stages. In the first stage, learning the optimal feature map is often formulated
as a target alignment problem, which aims to align the learned kernel with a
pre-defined target kernel (usually the ideal kernel). In the second stage, a linear
learner is trained on the mapped random features. Nevertheless, the pre-defined kernel
in target alignment is not necessarily optimal for the generalization of the linear
learner. Instead, in this paper, we consider a one-stage process that incorporates the
kernel learning and the linear learner into a unifying framework. To be specific, a
generative network via RFFs is devised to implicitly learn the kernel, followed by a
linear classifier parameterized as a fully-connected layer. The generative network and
the classifier are then jointly trained by solving the empirical risk minimization
(ERM) problem to reach a one-stage solution. This end-to-end scheme naturally allows
deeper features, in correspondence to a multi-layer structure, and shows superior
generalization performance over the classical two-stage, RFFs-based methods in
real-world classification tasks. Moreover, inspired by the randomized resampling
mechanism of the proposed method, its enhanced adversarial robustness is investigated
and experimentally verified.
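To make the one-stage scheme concrete, below is a minimal PyTorch sketch under stated assumptions: the generator architecture, feature count, and data shapes are illustrative choices, not the authors' implementation. A small network transforms Gaussian noise into kernel frequencies (implicitly learning the spectral density), the standard RFF map produces features, and a fully-connected classifier is trained jointly with the generator on the ERM objective.

```python
import torch
import torch.nn as nn

class GenerativeRFF(nn.Module):
    """Generator maps Gaussian noise to frequencies, implicitly defining the
    kernel's spectral density; the feature map is the standard RFF construction
    z(x) = sqrt(2/D) * cos(W x + b)."""
    def __init__(self, in_dim, n_freqs=256, hidden=128):
        super().__init__()
        self.generator = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, in_dim)
        )
        self.n_freqs, self.in_dim = n_freqs, in_dim

    def forward(self, x):
        # Resample base noise on every forward pass (randomized resampling).
        eps = torch.randn(self.n_freqs, self.in_dim, device=x.device)
        omega = self.generator(eps)                      # learned frequencies
        b = 2 * torch.pi * torch.rand(self.n_freqs, device=x.device)
        return (2.0 / self.n_freqs) ** 0.5 * torch.cos(x @ omega.T + b)

rff = GenerativeRFF(in_dim=20)
clf = nn.Linear(256, 10)                 # linear classifier (fully-connected)
opt = torch.optim.Adam([*rff.parameters(), *clf.parameters()], lr=1e-3)

x, y = torch.randn(32, 20), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(clf(rff(x)), y)      # joint ERM objective
opt.zero_grad(); loss.backward(); opt.step()
```

Because the frequencies are resampled through the generator at every forward pass, the learned object is a distribution over frequencies (i.e., a kernel) rather than a fixed feature matrix, which is also the mechanism behind the robustness claim above.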
Related papers
- Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation [46.5310645609264]
We propose a Meta-learning and Markov Chain Monte Carlo based SISR approach to learn kernel priors from organized randomness.
A lightweight network is adopted as the kernel generator and is optimized by learning from the MCMC simulation on random Gaussian distributions.
A meta-learning-based alternating optimization procedure is proposed to optimize the kernel generator and image restorer.
arXiv Detail & Related papers (2024-06-13T07:50:15Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- RFFNet: Large-Scale Interpretable Kernel Methods via Random Fourier Features [3.0079490585515347]
We introduce RFFNet, a scalable method that learns the kernel relevances on the fly via first-order optimization.
We show that our approach has a small memory footprint and run-time, low prediction error, and effectively identifies relevant features.
We supply users with an efficient, PyTorch-based library that adheres to the scikit-learn standard API, along with code for fully reproducing our results.
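The relevance-learning idea can be sketched as follows. This is an illustrative reconstruction, not the published RFFNet library; the class name, feature count, and initialization are assumptions. Scaling each input dimension by a learnable relevance before the random projection is equivalent to an automatic-relevance-determination (ARD) bandwidth per feature.

```python
import torch
import torch.nn as nn

class ARDFourierFeatures(nn.Module):
    """Random Fourier features of a Gaussian kernel with per-dimension
    relevances learned by gradient descent; frequencies stay fixed."""
    def __init__(self, in_dim, n_freqs=512):
        super().__init__()
        self.register_buffer("omega", torch.randn(n_freqs, in_dim))
        self.register_buffer("bias", 2 * torch.pi * torch.rand(n_freqs))
        self.relevances = nn.Parameter(torch.ones(in_dim))  # learned on the fly
        self.n_freqs = n_freqs

    def forward(self, x):
        z = (x * self.relevances) @ self.omega.T + self.bias
        return (2.0 / self.n_freqs) ** 0.5 * torch.cos(z)
```

After training with any first-order optimizer, dimensions whose relevances shrink toward zero are read off as irrelevant, which is what makes the method interpretable.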
arXiv Detail & Related papers (2022-11-11T18:50:34Z)
- Joint Embedding Self-Supervised Learning in the Kernel Regime [21.80241600638596]
Self-supervised learning (SSL) produces useful representations of data without access to any labels for classifying the data.
We extend this framework to incorporate algorithms based on kernel methods where embeddings are constructed by linear maps acting on the feature space of a kernel.
We analyze our kernel model on small datasets to identify common features of self-supervised learning algorithms and gain theoretical insights into their performance on downstream tasks.
arXiv Detail & Related papers (2022-09-29T15:53:19Z)
- Learning "best" kernels from data in Gaussian process regression. With application to aerodynamics [0.4588028371034406]
We introduce algorithms to select/design kernels in Gaussian process regression/kriging surrogate modeling techniques.
The first class of algorithms is kernel flow, which was introduced in the context of classification in machine learning.
The second class of algorithms is called spectral kernel ridge regression and aims at selecting a "best" kernel such that the norm of the function to be approximated is minimal.
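A minimal sketch of the second idea, under assumed simplifications (a grid of Gaussian bandwidths standing in for a general kernel family): fit kernel ridge regression for each candidate and keep the kernel whose fit has minimal RKHS norm, which for coefficients a and Gram matrix K is a^T K a.

```python
import torch

def gaussian_gram(X, Y, bw):
    # Gaussian kernel Gram matrix with bandwidth bw.
    return torch.exp(-torch.cdist(X, Y).pow(2) / (2 * bw**2))

def krr_fit(K, y, lam=1e-6):
    # Kernel ridge regression coefficients: (K + lam*n*I)^{-1} y.
    n = K.shape[0]
    return torch.linalg.solve(K + lam * n * torch.eye(n), y)

X, y = torch.randn(100, 3), torch.randn(100)
candidates = (0.5, 1.0, 2.0, 4.0)                    # assumed bandwidth grid
fits = {bw: krr_fit(gaussian_gram(X, X, bw), y) for bw in candidates}
best_bw = min(
    candidates,
    key=lambda bw: (fits[bw] @ gaussian_gram(X, X, bw) @ fits[bw]).item(),
)  # kernel giving the smoothest (minimum-norm) approximant
```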
arXiv Detail & Related papers (2022-06-03T07:50:54Z)
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefits of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
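The spectral effect can be illustrated with a toy quadratic (an assumed example, not the paper's setting): after t gradient steps from zero with step size eta, each Hessian eigenmode of the solution is shrunk by the filter 1 - (1 - eta * lambda_i)^t, so the learning rate reshapes the spectral profile of the early-stopped solution.

```python
import torch

lam = torch.tensor([10.0, 1.0, 0.1])        # Hessian eigenvalues (toy values)

def spectral_filter(eta, t):
    # Per-eigenmode shrinkage of the early-stopped gradient-descent iterate.
    return 1 - (1 - eta * lam) ** t

print(spectral_filter(0.01, 50))   # small step: only the top mode has converged
print(spectral_filter(0.19, 50))   # large step (< 2/lam_max): flatter spectrum
```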
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
- Kernel Continual Learning [117.79080100313722]
Kernel continual learning is a simple but effective variant of continual learning that tackles catastrophic forgetting.
An episodic memory unit stores a subset of samples for each task, from which task-specific classifiers are learned based on kernel ridge regression.
Variational random features are used to learn a data-driven kernel for each task.
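A hedged sketch of the per-task classifier described above: kernel ridge regression fit on a task's episodic-memory subset. The Gaussian kernel and one-hot targets are assumptions, and the variational random features are omitted for brevity.

```python
import torch

def fit_task_classifier(X_mem, Y_mem, bw=1.0, lam=1e-3):
    """Kernel ridge regression on a task's episodic memory; Y_mem is one-hot."""
    K = torch.exp(-torch.cdist(X_mem, X_mem).pow(2) / (2 * bw**2))
    n = K.shape[0]
    alpha = torch.linalg.solve(K + lam * n * torch.eye(n), Y_mem)

    def predict(X):
        Kx = torch.exp(-torch.cdist(X, X_mem).pow(2) / (2 * bw**2))
        return (Kx @ alpha).argmax(dim=1)

    return predict

# Each task keeps its own memory and classifier, so earlier classifiers are
# never overwritten, which is what mitigates catastrophic forgetting.
```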
arXiv Detail & Related papers (2021-07-12T22:09:30Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions needed to achieve comparable error bounds, both in theory and in practice.
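For context, one standard (though not the paper's optimized) NTK feature map is the empirical-NTK gradient map phi(x) = grad_theta f(x; theta_0) at a random initialization, whose inner products give NTK values; the paper's contribution is a far lower-dimensional construction. A toy sketch of the baseline, with an assumed small network:

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(5, 64), nn.ReLU(), nn.Linear(64, 1))

def ntk_features(x):
    # Gradient of the scalar output w.r.t. all parameters at initialization.
    net.zero_grad()
    net(x.unsqueeze(0)).sum().backward()
    return torch.cat([p.grad.flatten() for p in net.parameters()])

x1, x2 = torch.randn(5), torch.randn(5)
k12 = ntk_features(x1) @ ntk_features(x2)   # empirical NTK value k(x1, x2)
```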
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- Domain Adaptive Learning Based on Sample-Dependent and Learnable Kernels [2.1485350418225244]
This paper proposes a Domain Adaptive Learning method based on Sample-Dependent and Learnable Kernels (SDLK-DAL).
The first contribution of our work is to propose a sample-dependent and learnable Positive Definite Quadratic Kernel function (PDQK) framework.
In a series of experiments where the RKHS determined by PDQK replaces those used in several state-of-the-art DAL algorithms, our approach achieves better performance.
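A hedged sketch in the spirit of the PDQK framework (the sample-dependent component is omitted and all names are assumptions): a quadratic kernel k(x, y) = (x^T M y + c)^2 kept positive semi-definite by the factorization M = L L^T, so the matrix M can be learned freely by gradient descent.

```python
import torch
import torch.nn as nn

class PDQK(nn.Module):
    """Learnable quadratic kernel; M = L L^T is PSD by construction, and
    squaring a valid kernel plus a positive constant preserves validity."""
    def __init__(self, dim, c=1.0):
        super().__init__()
        self.L = nn.Parameter(torch.eye(dim))
        self.c = c

    def forward(self, X, Y):
        M = self.L @ self.L.T
        return (X @ M @ Y.T + self.c) ** 2   # Gram matrix between X and Y
```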
arXiv Detail & Related papers (2021-02-18T13:55:06Z)
- Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821]
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
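A minimal sketch of this construction: a Gaussian kernel on top of a learned representation, trained by gradient ascent on the MMD statistic. The network size, bandwidth, and the biased MMD estimator are assumptions; the paper optimizes an unbiased estimate of a test-power criterion rather than the raw statistic.

```python
import torch
import torch.nn as nn

feat = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 32))

def gram(X, Y, bw=1.0):
    # Gaussian kernel on the learned representation (a "deep kernel").
    return torch.exp(-torch.cdist(feat(X), feat(Y)).pow(2) / (2 * bw**2))

def mmd2(X, Y):
    # Biased MMD^2 estimate between the two samples.
    return gram(X, X).mean() + gram(Y, Y).mean() - 2 * gram(X, Y).mean()

X, Y = torch.randn(64, 10), torch.randn(64, 10) + 0.5
opt = torch.optim.Adam(feat.parameters(), lr=1e-3)
loss = -mmd2(X, Y)              # maximize the statistic as a power proxy
opt.zero_grad(); loss.backward(); opt.step()
```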
arXiv Detail & Related papers (2020-02-21T03:54:23Z)