Online and lightweight kernel-based approximated policy iteration for dynamic p-norm linear adaptive filtering
- URL: http://arxiv.org/abs/2210.11755v1
- Date: Fri, 21 Oct 2022 06:29:01 GMT
- Title: Online and lightweight kernel-based approximated policy iteration for dynamic p-norm linear adaptive filtering
- Authors: Yuki Akiyama, Minh Vu, Konstantinos Slavakis
- Abstract summary: This paper introduces a solution to the problem of selecting dynamically (online) the ``optimal'' p-norm to combat outliers in linear adaptive filtering.
The proposed framework is built on kernel-based reinforcement learning (KBRL).
- Score: 8.319127681936815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a solution to the problem of selecting dynamically
(online) the ``optimal'' p-norm to combat outliers in linear adaptive filtering
without any knowledge of the probability density function of the outliers. The
proposed online and data-driven framework is built on kernel-based
reinforcement learning (KBRL). To this end, novel Bellman mappings on
reproducing kernel Hilbert spaces (RKHSs) are introduced. These mappings do not
require any knowledge of the transition probabilities of Markov decision processes,
and are nonexpansive with respect to the underlying Hilbertian norm. The
fixed-point sets of the proposed Bellman mappings are utilized to build an
approximate policy-iteration (API) framework for the problem at hand. To
address the ``curse of dimensionality'' in RKHSs, random Fourier features are
utilized to bound the computational complexity of the API. Numerical tests on
synthetic data for several outlier scenarios demonstrate the superior
performance of the proposed API framework over several non-RL and KBRL schemes.
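To make the setting concrete, here is a minimal Python sketch (not the authors' code) of a least-mean-p-power (LMP) adaptive filter in which the exponent p is re-selected at every time step. The `select_p` heuristic below is only a placeholder for the paper's KBRL/API policy, and the random-Fourier-feature machinery that bounds the RKHS complexity is omitted; the step size, candidate p grid, and toy data are all illustrative assumptions.

```python
# Minimal sketch: p-norm (LMP) adaptive filtering with a dynamically chosen p.
# The KBRL policy of the paper is replaced by a simple heuristic placeholder.
import numpy as np

rng = np.random.default_rng(0)

def lmp_step(w, x, d, p, mu=0.01):
    """One update minimizing |d - w^T x|^p; gradient is -p|e|^(p-1) sign(e) x."""
    e = d - w @ x
    return w + mu * p * np.abs(e) ** (p - 1) * np.sign(e) * x, e

def select_p(e_history, candidates=(1.0, 1.25, 1.5, 1.75, 2.0)):
    """Placeholder for the KBRL policy: pick a small p when recent errors
    look heavy-tailed (suggesting outliers), a large p otherwise."""
    if len(e_history) < 10:
        return 2.0
    recent = np.abs(e_history[-10:])
    heavy_tail_proxy = recent.max() / (recent.mean() + 1e-12)
    return candidates[0] if heavy_tail_proxy > 5.0 else candidates[-1]

# Toy system identification with sparse, large outliers in the observations.
n, T = 5, 2000
w_true = rng.standard_normal(n)
w = np.zeros(n)
errors = []
for t in range(T):
    x = rng.standard_normal(n)
    noise = rng.standard_normal() * 0.05
    if rng.random() < 0.05:                 # impulsive outlier
        noise += rng.standard_normal() * 10.0
    d = w_true @ x + noise
    p = select_p(np.array(errors))          # dynamic p-norm choice
    w, e = lmp_step(w, x, d, p)
    errors.append(e)

print("parameter error:", np.linalg.norm(w - w_true))
```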
Related papers
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU problems are naturally modeled as Multistage Problems (MSPs), but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach, Two-Stage General Decision Rules (TS-GDR), to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using Deep Recurrent Neural Networks, named Two-Stage Deep Decision Rules (TS-DDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z) - Nonparametric Bellman Mappings for Reinforcement Learning: Application to Robust Adaptive Filtering [3.730504020733928]
This paper designs novel nonparametric Bellman mappings in reproducing kernel Hilbert spaces (RKHSs) for reinforcement learning (RL).
The proposed mappings benefit from the rich approximating properties of RKHSs, adopt no assumptions on the statistics of the data owing to their nonparametric nature, and may operate without any training data.
As an application, the proposed mappings are employed to offer a novel solution to the problem of countering outliers in adaptive filtering.
arXiv Detail & Related papers (2024-03-29T07:15:30Z) - An Alternate View on Optimal Filtering in an RKHS [0.0]
Kernel Adaptive Filtering (KAF) methods are mathematically principled and search for a function in a Reproducing Kernel Hilbert Space (RKHS).
They are plagued by a linear relationship between the number of training samples and the model size, hampering their use on the very large data sets common in today's data-saturated world.
We describe a novel view of optimal filtering which may provide a route towards solutions in an RKHS that do not necessarily have this linear growth in model size.
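For context on the "linear growth in model size" remark, here is a textbook kernel LMS (KLMS) sketch, not taken from the cited paper: every training sample becomes a new dictionary atom, so the model grows by one kernel unit per sample. The Gaussian kernel, step size, and toy target are illustrative assumptions.

```python
# Kernel LMS sketch: the dictionary grows linearly with the number of samples.
import numpy as np

def gauss_kernel(X, x, gamma=1.0):
    """Gaussian kernel between each row of X and a single point x."""
    return np.exp(-gamma * np.sum((X - x) ** 2, axis=1))

rng = np.random.default_rng(1)
mu = 0.5
centers, alphas = [], []                          # the growing dictionary

for t in range(500):
    x = rng.uniform(-3, 3, size=2)
    d = np.sin(x[0]) + 0.1 * rng.standard_normal()   # toy nonlinear target
    y = 0.0
    if centers:
        y = np.dot(alphas, gauss_kernel(np.asarray(centers), x))
    e = d - y
    centers.append(x)                             # one new atom per sample ...
    alphas.append(mu * e)                         # ... so model size == #samples

print("training samples:", 500, "| dictionary size:", len(centers))
```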
arXiv Detail & Related papers (2023-12-19T16:43:17Z) - Proximal Bellman mappings for reinforcement learning and their
application to robust adaptive filtering [4.140907550856865]
This paper introduces a novel class of proximal Bellman mappings.
The mappings are defined in reproducing kernel Hilbert spaces.
An approximate policy-iteration scheme is built on the proposed class of mappings.
arXiv Detail & Related papers (2023-09-14T09:20:21Z) - Low-rank extended Kalman filtering for online learning of neural
networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior precision matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
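The following is a minimal, full-covariance EKF sketch for online estimation of the parameters of a toy nonlinear model; it is not the cited method, which replaces the dense posterior with a low-rank plus diagonal factorization. The model, noise levels, and random-walk prior are illustrative assumptions.

```python
# Minimal online EKF for parameter estimation of y = tanh(theta0*x + theta1).
import numpy as np

rng = np.random.default_rng(2)

def f(theta, x):
    return np.tanh(theta[0] * x + theta[1])

def jac(theta, x):
    """Jacobian of f with respect to theta, shape (2,)."""
    s = 1.0 - np.tanh(theta[0] * x + theta[1]) ** 2
    return np.array([s * x, s])

theta_true = np.array([1.5, -0.3])
theta = np.zeros(2)                  # posterior mean
P = np.eye(2)                        # posterior covariance (dense here)
R, Q = 0.01, 1e-5 * np.eye(2)        # observation noise var, process noise

for t in range(1000):
    x = rng.uniform(-2, 2)
    y = f(theta_true, x) + np.sqrt(R) * rng.standard_normal()
    P = P + Q                        # predict: random-walk prior on parameters
    H = jac(theta, x)                # linearize around the current mean
    S = H @ P @ H + R                # scalar innovation variance
    K = P @ H / S                    # Kalman gain, shape (2,)
    theta = theta + K * (y - f(theta, x))
    P = P - np.outer(K, H @ P)

print("estimated:", theta, "true:", theta_true)
```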
arXiv Detail & Related papers (2023-05-31T03:48:49Z) - Dynamic selection of p-norm in linear adaptive filtering via online
kernel-based reinforcement learning [8.319127681936815]
This study addresses the problem of selecting dynamically, at each time instance, the ``optimal'' p-norm to combat outliers in linear adaptive filtering.
An online and data-driven framework is designed via kernel-based reinforcement learning (KBRL).
arXiv Detail & Related papers (2022-10-20T14:49:39Z) - Youla-REN: Learning Nonlinear Feedback Policies with Robust Stability
Guarantees [5.71097144710995]
This paper presents a parameterization of nonlinear controllers for uncertain systems building on a recently developed neural network architecture.
The proposed framework has "built-in" guarantees of stability, i.e., all policies in the search space result in a contracting (globally exponentially stable) closed-loop system.
arXiv Detail & Related papers (2021-12-02T13:52:37Z) - Solving Multistage Stochastic Linear Programming via Regularized Linear
Decision Rules: An Application to Hydrothermal Dispatch Planning [77.34726150561087]
We propose a novel regularization scheme for linear decision rules (LDR) based on the AdaLASSO (adaptive least absolute shrinkage and selection operator).
Experiments show that the overfit threat is non-negligible when using the classical non-regularized LDR to solve MSLP.
For the LHDP problem, our analysis highlights several benefits of the proposed framework in comparison to the non-regularized benchmark.
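As a loose stand-in for the idea (not the paper's formulation), the sketch below fits a sparse linear decision rule u(xi) ≈ W xi with a plain lasso penalty via ISTA soft-thresholding; the paper instead regularizes LDRs inside a multistage stochastic program and uses adaptive, per-coefficient lasso weights, both of which are omitted here. All data and parameters are illustrative.

```python
# Lasso-regularized linear decision rule fit via ISTA (proximal gradient).
import numpy as np

rng = np.random.default_rng(3)
n_samples, n_feat, n_dec = 200, 10, 3
Xi = rng.standard_normal((n_samples, n_feat))          # sampled uncertainties
W_true = np.zeros((n_dec, n_feat))
W_true[:, :3] = rng.standard_normal((n_dec, 3))        # truly sparse rule
U = Xi @ W_true.T + 0.05 * rng.standard_normal((n_samples, n_dec))

lam = 0.1
step = n_samples / np.linalg.norm(Xi, 2) ** 2          # 1/Lipschitz of the loss
W = np.zeros((n_dec, n_feat))
for _ in range(500):
    grad = (W @ Xi.T - U.T) @ Xi / n_samples           # least-squares gradient
    W = W - step * grad
    W = np.sign(W) * np.maximum(np.abs(W) - step * lam, 0.0)   # soft-threshold

print("nonzero coefficients:", int((np.abs(W) > 1e-6).sum()), "of", W.size)
```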
arXiv Detail & Related papers (2021-10-07T02:36:14Z) - Sample-Efficient Reinforcement Learning Is Feasible for Linearly
Realizable MDPs with Limited Revisiting [60.98700344526674]
Low-complexity models such as linear function representation play a pivotal role in enabling sample-efficient reinforcement learning.
In this paper, we investigate a new sampling protocol, which draws samples in an online/exploratory fashion but allows one to backtrack and revisit previous states in a controlled and infrequent manner.
We develop an algorithm tailored to this setting, achieving a sample complexity that scales polynomially with the feature dimension, the horizon, and the inverse sub-optimality gap, but not the size of the state/action space.
arXiv Detail & Related papers (2021-05-17T17:22:07Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
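One plausible way to encode control-affine structure in a kernel (hedged: this follows the general idea described in the summary above, not necessarily the paper's exact construction) is k((x,u),(x',u')) = k_b(x,x') + (u^T u') k_a(x,x'), which matches functions of the form b(x) + a(x)^T u. The sketch below runs GP regression with such a compound kernel on toy data; all names and hyperparameters are assumptions.

```python
# Compound kernel for a control-affine quantity, used in plain GP regression.
import numpy as np

def rbf(X1, X2, gamma=1.0):
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-gamma * d2)

def compound_kernel(X1, U1, X2, U2, gamma_b=1.0, gamma_a=1.0):
    """k = k_b(x,x') + (u^T u') * k_a(x,x'): affine structure in the input u."""
    return rbf(X1, X2, gamma_b) + (U1 @ U2.T) * rbf(X1, X2, gamma_a)

rng = np.random.default_rng(4)
n, nx, nu = 100, 2, 1
X, U = rng.standard_normal((n, nx)), rng.standard_normal((n, nu))
y = np.sin(X[:, 0]) + X[:, 1] * U[:, 0] + 0.01 * rng.standard_normal(n)

K = compound_kernel(X, U, X, U) + 1e-4 * np.eye(n)     # jitter for stability
alpha = np.linalg.solve(K, y)

# Predict at a new state-input pair.
x_star, u_star = np.zeros((1, nx)), np.ones((1, nu))
k_star = compound_kernel(x_star, u_star, X, U)         # shape (1, n)
print("prediction:", (k_star @ alpha).item())
```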
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise AutoRegressive models with eXogenous inputs and arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Experts concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z)