Sparsity-Aware Distributed Learning for Gaussian Processes with Linear Multiple Kernel
- URL: http://arxiv.org/abs/2309.08201v3
- Date: Thu, 16 Jan 2025 12:33:37 GMT
- Title: Sparsity-Aware Distributed Learning for Gaussian Processes with Linear Multiple Kernel
- Authors: Richard Cornelius Suwandi, Zhidi Lin, Feng Yin, Zhiguo Wang, Sergios Theodoridis
- Abstract summary: This paper presents a novel GP linear multiple kernel (LMK) and a generic sparsity-aware distributed learning framework to optimize the hyper-parameters.
The newly proposed grid spectral mixture product (GSMP) kernel is tailored for multi-dimensional data.
To exploit the inherent sparsity of the solutions, we introduce the Sparse LInear Multiple Kernel Learning (SLIM-KL) framework.
- Score: 20.98449975854329
- License:
- Abstract: Gaussian processes (GPs) stand as crucial tools in machine learning and signal processing, with their effectiveness hinging on kernel design and hyper-parameter optimization. This paper presents a novel GP linear multiple kernel (LMK) and a generic sparsity-aware distributed learning framework to optimize the hyper-parameters. The newly proposed grid spectral mixture product (GSMP) kernel is tailored for multi-dimensional data, effectively reducing the number of hyper-parameters while maintaining good approximation capability. We further demonstrate that the associated hyper-parameter optimization of this kernel yields sparse solutions. To exploit the inherent sparsity of the solutions, we introduce the Sparse LInear Multiple Kernel Learning (SLIM-KL) framework. The framework incorporates a quantized alternating direction method of multipliers (ADMM) scheme for collaborative learning among multiple agents, where the local optimization problem is solved using a distributed successive convex approximation (DSCA) algorithm. SLIM-KL effectively manages large-scale hyper-parameter optimization for the proposed kernel, simultaneously ensuring data privacy and minimizing communication costs. Theoretical analysis establishes convergence guarantees for the learning framework, while experiments on diverse datasets demonstrate the superior prediction performance and efficiency of our proposed methods.
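To make the kernel construction concrete, below is a minimal sketch (not the authors' code) of a linear multiple kernel assembled from grid spectral mixture basis kernels with a product over input dimensions, in the spirit of the GSMP kernel described in the abstract; the function names, the NumPy-only implementation, and the grid settings are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a linear multiple kernel built from
# grid spectral mixture basis kernels with a product over input dimensions, in
# the spirit of the GSMP kernel described above. Function names, the NumPy-only
# implementation, and the grid settings are illustrative assumptions.
import numpy as np

def sm_basis_1d(x1, x2, mu, var):
    """One 1-D spectral mixture basis kernel with fixed frequency mu and variance var."""
    tau = x1[:, None] - x2[None, :]  # pairwise lags
    return np.exp(-2.0 * np.pi**2 * var * tau**2) * np.cos(2.0 * np.pi * mu * tau)

def gsmp_kernel(X1, X2, grid_freqs, grid_vars, weights):
    """k(x, x') = sum_q w_q * prod_d k_q^(d)(x_d, x'_d), with frequencies and
    variances fixed on a grid; only the non-negative weights w_q are learned,
    and their optimum tends to be sparse."""
    K = np.zeros((X1.shape[0], X2.shape[0]))
    for q, w in enumerate(weights):
        if w == 0.0:  # sparse weights: inactive basis kernels cost nothing
            continue
        Kq = np.ones_like(K)
        for d in range(X1.shape[1]):  # product over input dimensions
            Kq *= sm_basis_1d(X1[:, d], X2[:, d], grid_freqs[q, d], grid_vars[q, d])
        K += w * Kq
    return K

# Toy usage: 2-D inputs and Q = 4 grid points.
rng = np.random.default_rng(0)
X = rng.uniform(size=(5, 2))
grid_freqs = rng.uniform(0.0, 1.0, size=(4, 2))  # fixed frequency grid
grid_vars = np.full((4, 2), 0.1)                 # fixed variances
weights = np.array([0.7, 0.0, 0.3, 0.0])         # sparse, non-negative weights
K = gsmp_kernel(X, X, grid_freqs, grid_vars, weights)
```

The collaborative learning step can likewise be pictured as a consensus ADMM loop in which agents exchange only quantized messages. The sketch below is a generic illustration, not the SLIM-KL scheme: a few projected-gradient steps stand in for the paper's DSCA local solver, and quantize(), local_grads, and all constants are hypothetical.

```python
# Minimal sketch (not the SLIM-KL implementation) of consensus ADMM with
# quantized message exchange. The local solver is a few projected-gradient
# steps standing in for the paper's DSCA algorithm; quantize(), local_grads,
# and all constants are hypothetical.
import numpy as np

def quantize(v, step=0.01):
    """Uniform quantizer applied to every message before it is shared."""
    return step * np.round(v / step)

def consensus_admm(local_grads, dim, rho=1.0, lr=0.01, iters=200):
    """Minimize sum_i f_i(theta) over agents subject to theta_i = z."""
    n_agents = len(local_grads)
    theta = np.zeros((n_agents, dim))  # local hyper-parameter copies
    u = np.zeros((n_agents, dim))      # scaled dual variables
    z = np.zeros(dim)                  # global consensus variable
    for _ in range(iters):
        for i in range(n_agents):
            # Local step: gradient descent on f_i + (rho/2)||theta_i - z + u_i||^2,
            # projected onto the non-negative orthant (like the kernel weights).
            for _ in range(5):
                g = local_grads[i](theta[i]) + rho * (theta[i] - z + u[i])
                theta[i] = np.maximum(theta[i] - lr * g, 0.0)
        msgs = quantize(theta + u)     # only quantized messages leave each agent
        z = msgs.mean(axis=0)          # consensus (z-) update
        u += theta - z                 # dual update
    return z

# Toy usage: each agent holds a local quadratic loss ||A_i theta - b_i||^2.
rng = np.random.default_rng(1)
A = [rng.normal(size=(10, 3)) for _ in range(4)]
b = [rng.normal(size=10) for _ in range(4)]
local_grads = [lambda th, Ai=Ai, bi=bi: 2.0 * Ai.T @ (Ai @ th - bi) for Ai, bi in zip(A, b)]
z_star = consensus_admm(local_grads, dim=3)
```

Keeping raw data local and exchanging only quantized copies of the hyper-parameters is what the abstract refers to when it mentions data privacy and low communication costs.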
Related papers
- Bayesian Optimization for Simultaneous Selection of Machine Learning Algorithms and Hyperparameters on Shared Latent Space [16.257223975129513]
Selecting a machine learning (ML) algorithm and its hyper-parameters is crucial for the development of high-performance ML systems.
Many existing studies use Bayesian optimization (BO) for accelerating the search.
Our proposed method embeds different hyper-parameter spaces into a shared latent space, in which a surrogate multi-task model for BO is estimated.
This approach can share information from observations of different ML algorithms, so that efficient optimization is expected with a smaller number of total observations.
arXiv Detail & Related papers (2025-02-13T13:43:52Z) - Constrained Hybrid Metaheuristic Algorithm for Probabilistic Neural Networks Learning [0.3686808512438362]
This study investigates the potential of hybrid metaheuristic algorithms to enhance the training of Probabilistic Neural Networks (PNNs).
Traditional learning methods, such as gradient-based approaches, often struggle in high-dimensional and uncertain environments.
We propose the constrained Hybrid Metaheuristic (cHM) algorithm, which combines multiple population-based optimisation techniques into a unified framework.
arXiv Detail & Related papers (2025-01-26T19:49:16Z) - Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation [2.3342885570554652]
We introduce a piecewise-linear approximation for Gaussian process kernels and a corresponding MIQP representation for acquisition functions.
We empirically demonstrate the framework on synthetic functions, constrained benchmarks, and hyper-parameter tuning tasks.
arXiv Detail & Related papers (2024-10-22T10:56:52Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and meta-learning (MAML).
This paper proposes algorithms for conditional stochastic optimization in the distributed federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Automatic tuning of hyper-parameters of reinforcement learning
algorithms using Bayesian optimization with behavioral cloning [0.0]
In reinforcement learning (RL), the information content of data gathered by the learning agent is dependent on the setting of many hyper-parameters.
In this work, a novel approach for autonomous hyper-parameter setting using Bayesian optimization is proposed.
Experiments reveal promising results compared to other manual tweaking and optimization-based approaches.
arXiv Detail & Related papers (2021-12-15T13:10:44Z) - Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
arXiv Detail & Related papers (2021-07-23T13:43:34Z) - Fast Distributionally Robust Learning with Variance Reduced Min-Max
Optimization [85.84019017587477]
Distributionally robust supervised learning is emerging as a key paradigm for building reliable machine learning systems for real-world applications.
Existing algorithms for solving Wasserstein DRSL involve solving complex subproblems or fail to make use of gradients.
We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable extra-gradient algorithms.
arXiv Detail & Related papers (2021-04-27T16:56:09Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization has arisen as a powerful tool for many machine learning problems.
We propose a novel sample-efficient hypergradient estimator named stocBiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Adaptive pruning-based optimization of parameterized quantum circuits [62.997667081978825]
Variational hybrid quantum-classical algorithms are powerful tools to maximize the use of Noisy Intermediate-Scale Quantum devices.
We propose a strategy for training such ansatze used in variational quantum algorithms, which we call "Parameter-Efficient Circuit Training" (PECT).
Instead of optimizing all of the ansatz parameters at once, PECT launches a sequence of variational algorithms.
arXiv Detail & Related papers (2020-10-01T18:14:11Z) - Green Machine Learning via Augmented Gaussian Processes and
Multi-Information Source Optimization [0.19116784879310028]
A strategy to drastically reduce the computational time and energy consumed is to exploit the availability of different information sources.
An Augmented Gaussian Process method exploiting multiple information sources (namely, AGP-MISO) is proposed.
A novel acquisition function is defined according to the Augmented Gaussian Process.
arXiv Detail & Related papers (2020-06-25T08:04:48Z) - Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation for Gaussian processes trained on few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces the time to convergence by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)