Gaussian-Mixture-Model Q-Functions for Reinforcement Learning by Riemannian Optimization
- URL: http://arxiv.org/abs/2409.04374v2
- Date: Tue, 10 Sep 2024 05:51:18 GMT
- Title: Gaussian-Mixture-Model Q-Functions for Reinforcement Learning by Riemannian Optimization
- Authors: Minh Vu, Konstantinos Slavakis
- Abstract summary: This paper establishes a novel role for Gaussian-mixture models (GMMs) as functional approximators of Q-function losses in reinforcement learning (RL).
- Score: 4.192712667327955
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper establishes a novel role for Gaussian-mixture models (GMMs) as functional approximators of Q-function losses in reinforcement learning (RL). Unlike the existing RL literature, where GMMs play their typical role as estimates of probability density functions, GMMs approximate here Q-function losses. The new Q-function approximators, coined GMM-QFs, are incorporated in Bellman residuals to promote a Riemannian-optimization task as a novel policy-evaluation step in standard policy-iteration schemes. The paper demonstrates how the hyperparameters (means and covariance matrices) of the Gaussian kernels are learned from the data, opening thus the door of RL to the powerful toolbox of Riemannian optimization. Numerical tests show that with no use of experienced data, the proposed design outperforms state-of-the-art methods, even deep Q-networks which use experienced data, on benchmark RL tasks.
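As a rough illustration of the idea in the abstract, the sketch below evaluates a weighted sum of Gaussian kernels as a Q-function and computes a mean squared one-step Bellman residual over sampled transitions. Everything here is an assumption made for illustration, not the authors' formulation: the function names, the diagonal covariances (the paper learns full covariance matrices via Riemannian optimization), the unconstrained real-valued weights, and the fixed-policy transition format `(z, reward, z_next)`, where `z` stacks state and action features.

```python
import math


def gmm_qf(z, weights, means, variances):
    """Evaluate Q(z) = sum_k w_k * exp(-0.5 * sum_d (z_d - m_kd)^2 / v_kd).

    Hypothetical sketch: weights are plain real numbers (the mixture is a
    function approximator, not a probability density), and each kernel has a
    diagonal covariance given as a list of per-dimension variances.
    """
    total = 0.0
    for w, m, v in zip(weights, means, variances):
        quad = sum((zd - md) ** 2 / vd for zd, md, vd in zip(z, m, v))
        total += w * math.exp(-0.5 * quad)
    return total


def bellman_residual(transitions, gamma, weights, means, variances):
    """Mean squared one-step Bellman residual over (z, reward, z_next) samples.

    z_next is assumed to already contain the next state paired with the
    action chosen by the policy under evaluation.
    """
    sq = 0.0
    for z, r, z_next in transitions:
        target = r + gamma * gmm_qf(z_next, weights, means, variances)
        sq += (gmm_qf(z, weights, means, variances) - target) ** 2
    return sq / len(transitions)


# Toy usage with two kernels in a 2-D state-action feature space.
weights = [1.0, -0.5]
means = [[0.0, 0.0], [1.0, 1.0]]
variances = [[1.0, 1.0], [0.5, 0.5]]
samples = [([0.0, 0.0], 1.0, [1.0, 1.0]), ([1.0, 1.0], 0.0, [0.0, 0.0])]
loss = bellman_residual(samples, 0.9, weights, means, variances)
```

In a policy-evaluation step, a loss of this kind would be minimized over the weights, means, and covariances; handling full positive-definite covariance matrices is where a Riemannian optimizer would come in, which this diagonal sketch sidesteps.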
Related papers
- Gaussian-Mixture-Model Q-Functions for Policy Iteration in Reinforcement Learning [7.056697401102689]
This paper introduces a novel function-approximation role for Gaussian mixture models (GMMs) as direct surrogates for Q-function losses.
These parametric models, termed GMM-QFs, possess substantial representational capacity.
They are shown to be universal approximators over a broad class of functions.
arXiv Detail & Related papers (2025-12-21T15:00:32Z) - Online reinforcement learning via sparse Gaussian mixture model Q-functions [7.056697401102689]
This paper introduces a structured and interpretable online policy-iteration framework for reinforcement learning (RL).
It is built around the novel class of sparse Gaussian mixture model Q-functions (S-GMM-QFs).
Numerical tests show that S-GMM-QFs match the performance of dense deep RL (DeepRL) methods on standard benchmarks.
arXiv Detail & Related papers (2025-09-18T03:37:11Z) - Deep Equilibrium models for Poisson Imaging Inverse problems via Mirror Descent [7.248102801711294]
Deep Equilibrium Models (DEQs) are implicit neural networks with fixed points.
We introduce a novel DEQ formulation based on Mirror Descent defined in terms of a tailored non-Euclidean geometry.
We propose computational strategies that enable both efficient training and fully parameter-free inference.
arXiv Detail & Related papers (2025-07-15T16:33:01Z) - Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving the model performance.
This paper addresses the question of how to optimally combine the model's predictions and the provided labels.
Our main contribution is the derivation of the Bayes optimal aggregator function to combine the current model's predictions and the given labels.
arXiv Detail & Related papers (2025-05-21T07:16:44Z) - Generative Diffusion Models for Resource Allocation in Wireless Networks [77.36145730415045]
We train a policy to imitate an expert and generate new samples from the optimal distribution.
We achieve near-optimal performance through the sequential execution of the generated samples.
We present numerical results in a case study of power control.
arXiv Detail & Related papers (2025-04-28T21:44:31Z) - SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning [89.04776523010409]
This paper studies the transfer reinforcement learning (RL) problem where multiple RL problems have different reward functions but share the same underlying transition dynamics.
In this setting, the Q-function of each RL problem (task) can be decomposed into a successor feature (SF) and a reward mapping.
We establish the first convergence analysis with provable generalization guarantees for SF-DQN with GPI.
arXiv Detail & Related papers (2024-05-24T20:30:14Z) - Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z) - Sparse Gaussian Process Hyperparameters: Optimize or Integrate? [5.949779668853556]
We propose an algorithm for sparse Gaussian process regression which leverages MCMC to sample from the hyperparameter posterior.
We compare this scheme against natural baselines in the literature and against variational GPs (SVGPs), and provide an extensive computational analysis.
arXiv Detail & Related papers (2022-11-04T14:06:59Z) - MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by metalearning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian
process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Missing Data Imputation and Acquisition with Deep Hierarchical Models
and Hamiltonian Monte Carlo [2.666288135543677]
We present HH-VAEM, a Hierarchical VAE model for mixed-type incomplete data.
Our experiments show that HH-VAEM outperforms existing baselines in the tasks of missing data imputation, supervised learning and outlier identification.
We also present a sampling-based approach for efficiently computing the information gain when missing features are to be acquired with HH-VAEM.
arXiv Detail & Related papers (2022-02-09T17:50:52Z) - Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z) - Learning Nonparametric Volterra Kernels with Gaussian Processes [0.0]
This paper introduces a method for the nonparametric Bayesian learning of nonlinear operators, through the use of the Volterra series with kernels represented using Gaussian processes (GPs).
When the input function to the operator is unobserved and has a GP prior, the NVKM constitutes a powerful method for both single and multiple output regression, and can be viewed as a nonlinear and nonparametric latent force model.
arXiv Detail & Related papers (2021-06-10T08:21:00Z) - On MCMC for variationally sparse Gaussian processes: A pseudo-marginal approach [0.76146285961466]
Gaussian processes (GPs) are frequently used in machine learning and statistics to construct powerful models.
We propose a pseudo-marginal (PM) scheme that offers exact inference as well as computational gains through doubly stochastic estimators for the likelihood and large datasets.
arXiv Detail & Related papers (2021-03-04T20:48:29Z) - Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z) - Marginalised Gaussian Processes with Nested Sampling [10.495114898741203]
Gaussian process (GP) models are a rich distribution over functions with inductive biases controlled by a kernel function.
This work presents an alternative learning procedure where the hyperparameters of the kernel function are marginalised using Nested Sampling (NS).
arXiv Detail & Related papers (2020-10-30T16:04:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.