Hyperparameter Estimation for Sparse Bayesian Learning Models
- URL: http://arxiv.org/abs/2401.02544v1
- Date: Thu, 4 Jan 2024 21:24:01 GMT
- Title: Hyperparameter Estimation for Sparse Bayesian Learning Models
- Authors: Feng Yu and Lixin Shen and Guohui Song
- Abstract summary: Sparse Bayesian Learning (SBL) models are extensively used in signal processing and machine learning for promoting sparsity through hierarchical priors.
This paper presents a comprehensive framework for hyperparameter estimation in SBL models that accommodates various objective functions.
A novel algorithm is introduced, showing enhanced efficiency, especially under low signal-to-noise ratios.
- Score: 1.0172874946490507
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Sparse Bayesian Learning (SBL) models are extensively used in signal
processing and machine learning for promoting sparsity through hierarchical
priors. The hyperparameters in SBL models are crucial for the model's
performance, but they are often difficult to estimate due to the non-convexity
and the high-dimensionality of the associated objective function. This paper
presents a comprehensive framework for hyperparameter estimation in SBL models,
encompassing well-known algorithms such as the expectation-maximization (EM),
MacKay, and convex bounding (CB) algorithms. These algorithms are cohesively
interpreted within an alternating minimization and linearization (AML)
paradigm, distinguished by their unique linearized surrogate functions.
Additionally, a novel algorithm within the AML framework is introduced, showing
enhanced efficiency, especially under low signal-to-noise ratios. This is further
improved by a new alternating minimization and quadratic approximation (AMQ)
paradigm, which includes a proximal regularization term. The paper
substantiates these advancements with thorough convergence analysis and
numerical experiments, demonstrating the algorithm's effectiveness in various
noise conditions and signal-to-noise ratios.
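As a concrete illustration of the setting, the minimal sketch below shows the classical EM update for the variance hyperparameters in a standard SBL model, one of the algorithms the paper interprets within the AML paradigm. It is a generic type-II maximum-likelihood illustration, not the paper's new AML or AMQ algorithm; the names Phi, gamma, and sigma2 and all defaults are our own.
```python
# Minimal sketch (assumed notation): SBL hyperparameter estimation via the
# classical EM update for the model
#   y = Phi @ x + noise,  x_i ~ N(0, gamma_i),  noise ~ N(0, sigma2 * I).
# This is a generic illustration, not the paper's novel algorithm.
import numpy as np

def sbl_em(Phi, y, sigma2=1e-2, n_iter=200, tol=1e-6):
    N, M = Phi.shape
    gamma = np.ones(M)  # prior variances: the hyperparameters to estimate
    for _ in range(n_iter):
        # E-step: Gaussian posterior of x under the current hyperparameters
        Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(1.0 / gamma))
        mu = Sigma @ Phi.T @ y / sigma2
        # M-step (EM update): gamma_i <- E[x_i^2] = mu_i^2 + Sigma_ii
        gamma_new = mu**2 + np.diag(Sigma)
        # EM-style update of the noise variance
        resid = y - Phi @ mu
        sigma2 = (resid @ resid + sigma2 * np.sum(1.0 - np.diag(Sigma) / gamma)) / N
        converged = np.max(np.abs(gamma_new - gamma)) < tol
        gamma = np.maximum(gamma_new, 1e-12)  # floor avoids division by zero
        if converged:
            break
    return mu, gamma, sigma2

# Toy usage: recover a sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((50, 100))
x_true = np.zeros(100)
x_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
y = Phi @ x_true + 0.05 * rng.standard_normal(50)
mu, gamma, sigma2 = sbl_em(Phi, y)
print("largest estimated variances at indices:", np.sort(np.argsort(gamma)[-3:]))
```
The MacKay and convex bounding updates discussed in the paper replace the gamma update above with different surrogate-based rules, which is the sense in which the AML framework unifies them.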
Related papers
- Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures [21.18741772731095]
Zeroth-order (ZO) algorithms offer a promising alternative by approximating gradients using finite differences of function values (a minimal sketch of this estimator follows after this list).
Existing ZO methods struggle to capture the low-rank gradient structure common in LLM fine-tuning, leading to suboptimal performance.
This paper proposes a low-rank ZO algorithm (LOZO) that effectively captures this structure in LLMs.
arXiv Detail & Related papers (2024-10-10T08:10:53Z)
- Optimization of Iterative Blind Detection based on Expectation Maximization and Belief Propagation [29.114100423416204]
We propose a blind symbol detection scheme for block-fading linear inter-symbol interference channels.
We design a joint channel estimation and detection scheme that combines the expectation-maximization (EM) algorithm with the ubiquitous belief propagation (BP) algorithm.
We show that the proposed method can learn efficient schedules that generalize well and even outperform coherent BP detection in high signal-to-noise ratio scenarios.
arXiv Detail & Related papers (2024-08-05T08:45:50Z)
- Optimal thresholds and algorithms for a model of multi-modal learning in high dimensions [15.000720880773548]
The paper derives the approximate message passing (AMP) algorithm for this model and characterizes its performance in the high-dimensional limit.
The linearization of AMP is compared numerically to the widely used partial least squares (PLS) and canonical correlation analysis (CCA) methods.
arXiv Detail & Related papers (2024-07-03T21:48:23Z)
- Proximal Interacting Particle Langevin Algorithms [0.0]
We introduce Proximal Interacting Particle Langevin Algorithms (PIPLA) for inference and learning in latent variable models.
We propose several variants within the novel proximal IPLA family, tailored to the problem of estimating parameters in a non-differentiable statistical model.
Our theory and experiments together show that the PIPLA family can be the de facto choice for parameter estimation in non-differentiable latent variable models.
arXiv Detail & Related papers (2024-06-20T13:16:41Z)
- Improving Sample Efficiency of Model-Free Algorithms for Zero-Sum Markov Games [66.2085181793014]
We show that a model-free stage-based Q-learning algorithm can enjoy the same optimality in the $H$ dependence as model-based algorithms.
Our algorithm features a key novel design of updating the reference value functions as the pair of optimistic and pessimistic value functions.
arXiv Detail & Related papers (2023-08-17T08:34:58Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- Optimizing Hyperparameters with Conformal Quantile Regression [7.316604052864345]
We propose to leverage conformalized quantile regression, which makes minimal assumptions about the observation noise.
This translates to quicker HPO convergence on empirical benchmarks.
arXiv Detail & Related papers (2023-05-05T15:33:39Z)
- Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediate distributions and optimize the bridging distributions to use fewer sampling steps.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z)
- Multi-objective hyperparameter optimization with performance uncertainty [62.997667081978825]
This paper presents results on multi-objective hyperparameter optimization under uncertainty in the evaluation of Machine Learning algorithms.
We combine the sampling strategy of Tree-structured Parzen Estimators (TPE) with the metamodel obtained after training a Gaussian Process Regression (GPR) with heterogeneous noise.
Experimental results on three analytical test functions and three ML problems show the improvement over multi-objective TPE and GPR.
arXiv Detail & Related papers (2022-09-09T14:58:43Z)
- A Dynamical Systems Approach for Convergence of the Bayesian EM Algorithm [59.99439951055238]
We show how (discrete-time) Lyapunov stability theory can serve as a powerful tool to aid, or even lead, in the analysis (and potential design) of optimization algorithms that are not necessarily gradient-based.
The particular ML problem that this paper focuses on is parameter estimation in an incomplete-data Bayesian framework via the popular optimization algorithm known as maximum a posteriori expectation-maximization (MAP-EM).
We show that fast convergence (linear or quadratic) is achieved, which could have been difficult to unveil without our adopted systems and control (S&C) approach.
arXiv Detail & Related papers (2020-06-23T01:34:18Z)
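Returning to the zeroth-order fine-tuning entry near the top of this list: the finite-difference gradient idea it mentions can be illustrated with a simple two-point random-direction estimator. This is a generic sketch, not the LOZO algorithm itself; the function names, step sizes, and toy objective are our own.
```python
# Generic two-point zeroth-order gradient estimator (assumed, illustrative):
# averages central differences of f along random Gaussian directions, so no
# backpropagation or analytic gradient is needed.
import numpy as np

def zo_gradient(f, x, eps=1e-3, n_dirs=10, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + eps * u) - f(x - eps * u)) / (2.0 * eps) * u
    return g / n_dirs

# Toy usage: zeroth-order gradient descent on a simple quadratic.
f = lambda x: float(np.sum((x - 1.0) ** 2))
x = np.zeros(5)
for _ in range(300):
    x = x - 0.05 * zo_gradient(f, x)
print(np.round(x, 2))  # should end up near the minimizer [1. 1. 1. 1. 1.]
```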
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.