Enhancing Kernel Flexibility via Learning Asymmetric Locally-Adaptive
Kernels
- URL: http://arxiv.org/abs/2310.05236v1
- Date: Sun, 8 Oct 2023 17:08:15 GMT
- Title: Enhancing Kernel Flexibility via Learning Asymmetric Locally-Adaptive
Kernels
- Authors: Fan He, Mingzhen He, Lei Shi, Xiaolin Huang and Johan A.K. Suykens
- Abstract summary: This paper introduces the concept of Locally-Adaptive-Bandwidths (LAB) as trainable parameters to enhance the Radial Basis Function (RBF) kernel.
The parameters in LAB RBF kernels are data-dependent, and their number can grow with the size of the dataset.
This paper establishes, for the first time, an asymmetric kernel ridge regression framework and introduces an iterative kernel learning algorithm.
- Score: 35.76925787221499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The lack of sufficient flexibility is the key bottleneck of kernel-based
learning that relies on manually designed, pre-given, and non-trainable
kernels. To enhance kernel flexibility, this paper introduces the concept of
Locally-Adaptive-Bandwidths (LAB) as trainable parameters to enhance the Radial
Basis Function (RBF) kernel, giving rise to the LAB RBF kernel. The parameters
in LAB RBF kernels are data-dependent, and their number can grow with the size
of the dataset, allowing for better adaptation to diverse data patterns and
enhancing the flexibility of the learned function. This newfound flexibility
also brings challenges, particularly with regard to asymmetry and the need for
an efficient learning algorithm. To address these challenges, this paper
establishes, for the first time, an asymmetric kernel ridge regression framework and
introduces an iterative kernel learning algorithm. This novel approach not only
reduces the demand for extensive support data but also significantly improves
generalization by training bandwidths on the available training data.
Experimental results on real datasets underscore the remarkable performance of
the proposed algorithm, showcasing its superior capability in handling
large-scale datasets compared to Nyström approximation-based algorithms.
Moreover, it demonstrates a significant improvement in regression accuracy over
existing kernel-based learning methods and even surpasses residual neural
networks.
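To make the core construction concrete, the sketch below assumes a LAB RBF kernel of the form K(x, x_j) = exp(-||Theta_j * (x - x_j)||^2), where a trainable bandwidth vector Theta_j is attached to the j-th support point (hence the asymmetry), and fits the coefficients with a ridge-regularized least-squares solve on the rectangular kernel matrix. The names (lab_rbf, asymmetric_krr_fit) and the plain ridge solve are illustrative assumptions, not the authors' exact algorithm; the iterative bandwidth training is only indicated in a comment.

```python
import numpy as np

def lab_rbf(X, support, Theta):
    """LAB RBF kernel sketch: K[i, j] = exp(-||Theta[j] * (X[i] - support[j])||^2).

    Each support point j carries its own trainable bandwidth vector Theta[j],
    so K(x, z) != K(z, x) in general (the kernel is asymmetric).
    """
    diff = X[:, None, :] - support[None, :, :]                  # (n, m, d)
    return np.exp(-np.sum((diff * Theta[None, :, :]) ** 2, axis=2))

def asymmetric_krr_fit(X, y, support, Theta, lam=1e-3):
    """Ridge-regularized least-squares solve with a rectangular kernel matrix."""
    K = lab_rbf(X, support, Theta)                              # (n, m), asymmetric
    return np.linalg.solve(K.T @ K + lam * np.eye(K.shape[1]), K.T @ y)

# Toy usage: fit y = sin(3x) with a few support points and fixed unit bandwidths.
# In the paper's setting the bandwidths Theta would be trained iteratively
# (alternating the ridge solve with updates on Theta), which is omitted here.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])
support, Theta = X[:20], np.ones((20, 1))
alpha = asymmetric_krr_fit(X, y, support, Theta)
pred = lab_rbf(X, support, Theta) @ alpha
print("train MSE:", np.mean((pred - y) ** 2))
```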
Related papers
- Kernel Sum of Squares for Data Adapted Kernel Learning of Dynamical Systems from Data: A global optimization approach [0.19999259391104385]
This paper examines the application of the Kernel Sum of Squares (KSOS) method for enhancing kernel learning from data.
Traditional kernel-based methods frequently struggle with selecting optimal base kernels and parameter tuning.
KSOS mitigates these issues by leveraging a global optimization framework with kernel-based surrogate functions.
arXiv Detail & Related papers (2024-08-12T19:32:28Z) - Center-Sensitive Kernel Optimization for Efficient On-Device Incremental Learning [88.78080749909665]
Current on-device training methods focus only on efficient training, without considering catastrophic forgetting.
This paper proposes a simple but effective edge-friendly incremental learning framework.
Our method achieves an average accuracy boost of 38.08% with even less memory and approximate computation.
arXiv Detail & Related papers (2024-06-13T05:49:29Z) - Efficient kernel surrogates for neural network-based regression [0.8030359871216615]
We study the performance of the Conjugate Kernel (CK), an efficient approximation to the Neural Tangent Kernel (NTK).
We show that CK performance is only marginally worse than that of the NTK and, in certain cases, even superior.
In addition to providing a theoretical grounding for using CKs instead of NTKs, our framework suggests a recipe for improving DNN accuracy inexpensively.
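As a rough illustration of the idea (not the cited paper's exact setup), the empirical conjugate kernel is simply the Gram matrix of a network's last-layer features, so it avoids the Jacobian products needed for the empirical NTK; the sketch below uses a random one-hidden-layer ReLU feature map and plugs the resulting kernel into ridge regression.

```python
import numpy as np

rng = np.random.default_rng(0)
d, width, n = 5, 512, 100
W = rng.normal(size=(d, width)) / np.sqrt(d)          # random first-layer weights

def phi(X):
    """Last-layer features of a one-hidden-layer ReLU network."""
    return np.maximum(X @ W, 0.0) / np.sqrt(width)

X = rng.normal(size=(n, d))
y = np.sin(X[:, 0])
K = phi(X) @ phi(X).T                                 # empirical conjugate kernel
alpha = np.linalg.solve(K + 1e-3 * np.eye(n), y)      # kernel ridge regression
pred = K @ alpha
print("train MSE:", np.mean((pred - y) ** 2))
```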
arXiv Detail & Related papers (2023-10-28T06:41:47Z) - Analysis and Optimization of Wireless Federated Learning with Data
Heterogeneity [72.85248553787538]
This paper focuses on performance analysis and optimization for wireless FL, considering data heterogeneity, combined with wireless resource allocation.
We formulate the loss function minimization problem, under constraints on long-term energy consumption and latency, and jointly optimize client scheduling, resource allocation, and the number of local training epochs (CRE).
Experiments on real-world datasets demonstrate that the proposed algorithm outperforms other benchmarks in terms of learning accuracy and energy consumption.
arXiv Detail & Related papers (2023-08-04T04:18:01Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - RFFNet: Large-Scale Interpretable Kernel Methods via Random Fourier Features [3.0079490585515347]
We introduce RFFNet, a scalable method that learns the kernel relevances on the fly via first-order optimization.
We show that our approach has a small memory footprint and run-time, low prediction error, and effectively identifies relevant features.
We supply users with an efficient, PyTorch-based library that adheres to the scikit-learn standard API, along with code for fully reproducing our results.
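The gist of the approach, as a hedged sketch rather than the library's API: random Fourier features for an RBF kernel are combined with a trainable per-input relevance vector, and both the relevances and a linear head are fitted by first-order optimization with a sparsity-inducing penalty. All names and the exact objective below are illustrative assumptions.

```python
import math
import torch

torch.manual_seed(0)
n, d, D = 256, 10, 200
X = torch.randn(n, d)
y = torch.sin(X[:, 0]) + 0.1 * torch.randn(n)         # only feature 0 matters

Omega = torch.randn(d, D)                             # fixed RFF frequencies
b = 2 * math.pi * torch.rand(D)                       # fixed RFF phases
relevance = torch.ones(d, requires_grad=True)         # learned input relevances
w = torch.zeros(D, requires_grad=True)                # linear head on the features

def features(X):
    # Scale each input dimension by its relevance before the random projection.
    return math.sqrt(2.0 / D) * torch.cos((X * relevance) @ Omega + b)

opt = torch.optim.Adam([relevance, w], lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = torch.mean((features(X) @ w - y) ** 2) + 1e-4 * relevance.abs().sum()
    loss.backward()
    opt.step()

print(relevance.detach())                             # large entries ~ relevant inputs
```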
arXiv Detail & Related papers (2022-11-11T18:50:34Z) - Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions, while achieving comparable error bounds both in theory and in practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - Finite Versus Infinite Neural Networks: an Empirical Study [69.07049353209463]
Kernel methods outperform fully-connected finite-width networks.
Centered and ensembled finite networks have reduced posterior variance.
Weight decay and the use of a large learning rate break the correspondence between finite and infinite networks.
arXiv Detail & Related papers (2020-07-31T01:57:47Z) - Fast Estimation of Information Theoretic Learning Descriptors using
Explicit Inner Product Spaces [4.5497405861975935]
Kernel methods form a theoretically-grounded, powerful and versatile framework to solve nonlinear problems in signal processing and machine learning.
Recently, we proposed no-trick (NT) kernel adaptive filtering (KAF).
We focus on a family of fast, scalable, and accurate estimators for ITL using explicit inner product space kernels.
arXiv Detail & Related papers (2020-01-01T20:21:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.