Elastic-Net Multiple Kernel Learning: Combining Multiple Data Sources for Prediction
- URL: http://arxiv.org/abs/2512.11547v1
- Date: Fri, 12 Dec 2025 13:33:17 GMT
- Title: Elastic-Net Multiple Kernel Learning: Combining Multiple Data Sources for Prediction
- Authors: Janaina Mourão-Miranda, Zakria Hussain, Konstantinos Tsirlis, Christophe Phillips, John Shawe-Taylor
- Abstract summary: Elastic-net regularized MKL (ENMKL) is especially valuable when model interpretability is critical. We introduce an alternative ENMKL formulation that yields a simple analytical update for the kernel weights, and evaluate the resulting algorithms against $l_1$-norm MKL and against SVM (or KRR) trained on the unweighted sum of kernels.
- Score: 4.293341792967467
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multiple Kernel Learning (MKL) models combine several kernels in supervised and unsupervised settings to integrate multiple data representations or sources, each represented by a different kernel. MKL seeks an optimal linear combination of base kernels that maximizes a generalized performance measure under a regularization constraint. Various norms have been used to regularize the kernel weights, including $l_1$, $l_2$ and $l_p$, as well as the "elastic-net" penalty, which combines the $l_1$- and $l_2$-norms to promote both sparsity and the selection of correlated kernels. This property makes elastic-net regularized MKL (ENMKL) especially valuable when model interpretability is critical and kernels capture correlated information, such as in neuroimaging. Previous ENMKL methods have followed a two-stage procedure: fix the kernel weights, train a support vector machine (SVM) with the weighted kernel, and then update the weights via gradient descent, cutting-plane methods, or surrogate functions. Here, we introduce an alternative ENMKL formulation that yields a simple analytical update for the kernel weights. We derive explicit algorithms for both SVM and kernel ridge regression (KRR) under this framework, and implement them in the open-source Pattern Recognition for Neuroimaging Toolbox (PRoNTo). We evaluate these ENMKL algorithms against $l_1$-norm MKL and against SVM (or KRR) trained on the unweighted sum of kernels across three neuroimaging applications. Our results show that ENMKL matches or outperforms $l_1$-norm MKL in all tasks and only underperforms standard SVM in one scenario. Crucially, ENMKL produces sparser, more interpretable models by selectively weighting correlated kernels.
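The two-stage scheme described above is easy to picture in code. Below is a minimal, hypothetical sketch of the ENMKL idea for KRR: alternate between fitting KRR on the current weighted kernel sum and updating the kernel weights in closed form. The soft-threshold-and-shrink weight update here is a generic elastic-net-style rule chosen for illustration; the paper derives its own analytical update, which this sketch does not reproduce.

```python
import numpy as np

def krr_dual(K, y, lam=1.0):
    # Kernel ridge regression in the dual: alpha = (K + lam*I)^{-1} y.
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

def enmkl_krr(kernels, y, lam=1.0, l1=0.1, l2=0.1, n_iter=50):
    # Alternate between (1) KRR on the weighted kernel sum and
    # (2) a closed-form, elastic-net-style update of the kernel weights.
    # NOTE: this update rule is illustrative, not the paper's derivation.
    m = len(kernels)
    d = np.full(m, 1.0 / m)                      # kernel weights
    for _ in range(n_iter):
        K = sum(w * Km for w, Km in zip(d, kernels))
        alpha = krr_dual(K, y, lam)
        # Per-kernel importance: quadratic form alpha^T K_m alpha (>= 0 for PSD K_m).
        s = np.array([alpha @ Km @ alpha for Km in kernels])
        # Soft-threshold (l1 part) then shrink (l2 part); exact zeros give sparsity.
        d = np.maximum(s - l1, 0.0) / (1.0 + l2)
        if d.sum() > 0:
            d = d / d.sum()                      # renormalize the weights
    return d, alpha

# Toy usage: three random "data sources" with linear base kernels.
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(40, 5)) for _ in range(3)]
weights, alpha = enmkl_krr([X @ X.T for X in Xs], rng.normal(size=40))
```

The $l_1$ part (soft-thresholding) zeroes out uninformative kernels, while the $l_2$ part spreads weight across correlated ones, which is the sparsity-plus-grouping behaviour the abstract attributes to the elastic-net penalty.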
Related papers
- Multiple Kernel Clustering via Local Regression Integration [4.856913393644719]
Multiple kernel methods rarely account for the intrinsic manifold structure of multiple kernel data.
This paper first presents a clustering method via kernelized local regression (CKLR).
We then extend it to clustering via multiple kernel local regression (CMKLR).
arXiv Detail & Related papers (2024-10-20T06:26:29Z)
- DKL-KAN: Scalable Deep Kernel Learning using Kolmogorov-Arnold Networks [0.0]
We introduce a scalable deep kernel learning method based on Kolmogorov-Arnold Networks (DKL-KAN) as an effective alternative to DKL with an MLP backbone (DKL-MLP).
We analyze two variants of DKL-KAN for a fair comparison with DKL-MLP.
The efficacy of DKL-KAN is evaluated in terms of computational training time and test prediction accuracy across a wide range of applications.
arXiv Detail & Related papers (2024-07-30T20:30:44Z)
- On the Identifiability and Interpretability of Gaussian Process Models [8.417178903130244]
We critically examine the prevalent practice of using additive mixtures of Matérn kernels in single-output Gaussian process (GP) models.
We show that the smoothness of a mixture of Matérn kernels is determined by its least smooth component, and that a GP with such a kernel is effectively equivalent to a GP with that least smooth kernel alone (a quick numerical check follows this entry).
For multi-output models, we show that the matrix $A$ parameterizing a multiplicative kernel mixture is identifiable up to a multiplicative constant, suggesting that multiplicative mixtures are well suited for multi-output tasks.
arXiv Detail & Related papers (2023-10-25T22:00:29Z)
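The claim that the least smooth component dominates can be eyeballed numerically. Here is a minimal sketch using scikit-learn's standard Matern kernel and prior sampling; the mean-squared-increment roughness proxy is an illustrative choice, not taken from the paper.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

X = np.linspace(0, 1, 200).reshape(-1, 1)
rough = Matern(length_scale=0.2, nu=0.5)    # samples are non-differentiable
smooth = Matern(length_scale=0.2, nu=2.5)   # samples are twice differentiable

# Draw prior samples from the additive mixture and from its least smooth part.
f_mix = GaussianProcessRegressor(kernel=rough + smooth).sample_y(X, n_samples=3, random_state=0)
f_lo = GaussianProcessRegressor(kernel=rough).sample_y(X, n_samples=3, random_state=0)

def roughness(f):
    # Mean squared increment over one grid step: large for rough sample paths.
    return np.mean(np.diff(f, axis=0) ** 2)

# The mixture's roughness is of the same order as the nu=0.5 component alone,
# i.e. adding the smoother component does not smooth the mixture.
print(roughness(f_mix), roughness(f_lo))
```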
- Local Sample-weighted Multiple Kernel Clustering with Consensus Discriminative Graph [73.68184322526338]
Multiple kernel clustering (MKC) aims to achieve optimal information fusion from a set of base kernels.
This paper proposes a novel local sample-weighted multiple kernel clustering model.
Experimental results demonstrate that our LSWMKC yields a better local manifold representation and outperforms existing kernel- and graph-based clustering algorithms.
arXiv Detail & Related papers (2022-07-05T05:00:38Z)
- Taming Nonconvexity in Kernel Feature Selection---Favorable Properties of the Laplace Kernel [77.73399781313893]
A key challenge is establishing a suitable objective function for kernel-based feature selection. Because the resulting objective is nonconvex, the available gradient-based algorithms can only guarantee convergence to local minima.
arXiv Detail & Related papers (2021-06-17T11:05:48Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAEs) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for Gaussian mixture models (GMMs); a minimal closed-form sketch follows this entry.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
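The analytic tractability claimed for GMMs rests on the Gaussian product integral $\int N(x;\mu_1,v_1)\,N(x;\mu_2,v_2)\,dx = N(\mu_1;\mu_2,v_1+v_2)$. A minimal one-dimensional sketch follows; the function names are illustrative, and the paper's use of this divergence inside a VAE objective is not reproduced here.

```python
import numpy as np

def gauss_overlap(mu1, v1, mu2, v2):
    # Closed form for the integral of N(x; mu1, v1) * N(x; mu2, v2) over x.
    v = v1 + v2
    return np.exp(-0.5 * (mu1 - mu2) ** 2 / v) / np.sqrt(2 * np.pi * v)

def gmm_cross(w1, mu1, v1, w2, mu2, v2):
    # Integral of p*q for two 1-D GMMs: a sum over all component pairs.
    return sum(a * b * gauss_overlap(m1, s1, m2, s2)
               for a, m1, s1 in zip(w1, mu1, v1)
               for b, m2, s2 in zip(w2, mu2, v2))

def cs_divergence(p, q):
    # D_CS(p, q) = -log( <p,q> / sqrt(<p,p><q,q>) ); zero iff p == q.
    return -np.log(gmm_cross(*p, *q) / np.sqrt(gmm_cross(*p, *p) * gmm_cross(*q, *q)))

p = ([0.5, 0.5], [-1.0, 1.0], [0.3, 0.3])   # (weights, means, variances)
q = ([1.0], [0.0], [1.0])
print(cs_divergence(p, p))   # ~0.0
print(cs_divergence(p, q))   # > 0
```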
- Quantum Multiple Kernel Learning [1.9116668545881028]
Kernel methods play an important role in machine learning applications due to their conceptual simplicity and superior performance.
One approach to enhancing the expressivity of kernel machines is to combine multiple individual kernels.
We propose quantum MKL, which combines multiple quantum kernels.
arXiv Detail & Related papers (2020-11-19T07:19:41Z)
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically motivated kernel matrices (a short verification is sketched after this entry).
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
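The positive semi-definiteness is easy to verify: each MCMC sample contributes a co-clustering matrix that is a sum of outer products of cluster-indicator vectors, and the PSM is the average of these PSD matrices. A minimal sketch with toy allocations (function name and data are illustrative):

```python
import numpy as np

def posterior_similarity_matrix(allocations):
    # PSM[i, j] = fraction of MCMC samples in which items i and j share a
    # cluster; `allocations` has shape (n_samples, n_items) of cluster labels.
    S, n = allocations.shape
    psm = np.zeros((n, n))
    for z in allocations:
        psm += (z[:, None] == z[None, :])   # co-clustering indicator matrix
    return psm / S

# Toy MCMC output: 100 posterior samples of cluster labels for 6 items.
rng = np.random.default_rng(0)
K = posterior_similarity_matrix(rng.integers(0, 2, size=(100, 6)))

# All eigenvalues are (numerically) non-negative, so K is a valid kernel matrix.
print(np.linalg.eigvalsh(K).min() >= -1e-10)   # True
```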
- Isolation Distributional Kernel: A New Tool for Point & Group Anomaly Detection [76.1522587605852]
Isolation Distributional Kernel (IDK) is a new way to measure the similarity between two distributions.
We demonstrate IDK's efficacy and efficiency as a new tool for kernel based anomaly detection for both point and group anomalies.
arXiv Detail & Related papers (2020-09-24T12:25:43Z)
- SimpleMKKM: Simple Multiple Kernel K-means [49.500663154085586]
We propose a simple yet effective multiple kernel clustering algorithm, termed simple multiple kernel k-means (SimpleMKKM).
Our criterion is given by an intractable minimization-maximization problem in the kernel coefficients and the clustering partition matrix (a fixed-weight baseline is sketched after this entry).
We theoretically analyze the performance of SimpleMKKM in terms of its clustering generalization error.
arXiv Detail & Related papers (2020-05-11T10:06:40Z)
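For context, the fixed-weight baseline that SimpleMKKM improves on can be sketched via the spectral relaxation of kernel k-means: for a given weighted kernel combination, the partition-matrix maximization of trace$(H^\top K H)$ is solved by the top-$k$ eigenvectors of $K$. This is an illustrative baseline under that standard relaxation; SimpleMKKM's outer minimization over the kernel weights (solved in the paper by a reduced gradient method) is omitted.

```python
import numpy as np
from sklearn.cluster import KMeans

def mkkm_fixed_weights(kernels, n_clusters, weights=None):
    # Kernel k-means via its spectral relaxation on a fixed-weight kernel
    # combination: take the top-k eigenvectors H of the combined kernel
    # (maximizing trace(H^T K H) over orthonormal H), then cluster rows of H.
    m = len(kernels)
    w = np.full(m, 1.0 / m) if weights is None else np.asarray(weights)
    K = sum(wi * Ki for wi, Ki in zip(w, kernels))
    vals, vecs = np.linalg.eigh(K)               # eigenvalues in ascending order
    H = vecs[:, -n_clusters:]                    # top-k eigenvectors
    return KMeans(n_clusters, n_init=10, random_state=0).fit_predict(H)

# Toy usage: three random linear base kernels, two clusters.
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(30, 4)) for _ in range(3)]
labels = mkkm_fixed_weights([X @ X.T for X in Xs], n_clusters=2)
```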