Related papers: Adaptive, Robust and Scalable Bayesian Filtering for Online Learning

URL: http://arxiv.org/abs/2505.07267v1
Date: Mon, 12 May 2025 06:40:29 GMT
Title: Adaptive, Robust and Scalable Bayesian Filtering for Online Learning
Authors: Gerardo Duran-Martin
Abstract summary: We introduce Bayesian filtering as a principled framework for tackling diverse sequential machine learning problems. This thesis addresses key challenges in applying Bayesian filtering to these problems: adaptivity to non-stationary environments, robustness to model misspecification and outliers, and scalability to the high-dimensional parameter space of deep neural networks. We develop novel tools within the Bayesian filtering framework to address each of these challenges, including (i) a modular framework that enables the development of adaptive approaches for online learning and (ii) a novel, provably robust filter, with computational cost similar to standard filters, that employs Generalised Bayes.
Score: 0.5439020425819
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this thesis, we introduce Bayesian filtering as a principled framework for tackling diverse sequential machine learning problems, including online (continual) learning, prequential (one-step-ahead) forecasting, and contextual bandits. To this end, this thesis addresses key challenges in applying Bayesian filtering to these problems: adaptivity to non-stationary environments, robustness to model misspecification and outliers, and scalability to the high-dimensional parameter space of deep neural networks. We develop novel tools within the Bayesian filtering framework to address each of these challenges, including: (i) a modular framework that enables the development of adaptive approaches for online learning; (ii) a novel, provably robust filter with similar computational cost to standard filters, that employs Generalised Bayes; and (iii) a set of tools for sequentially updating model parameters using approximate second-order optimisation methods that exploit the overparametrisation of high-dimensional parametric models such as neural networks. Theoretical analysis and empirical results demonstrate the improved performance of our methods in dynamic, high-dimensional, and misspecified models.

Related papers

Deep Equilibrium models for Poisson Imaging Inverse problems via Mirror Descent [7.248102801711294]
Deep Equilibrium Models (DEQs) are implicit neural networks with fixed points. We introduce a novel DEQ formulation based on Mirror Descent defined in terms of a tailored non-Euclidean geometry. We propose computational strategies that enable both efficient training and fully parameter-free inference.
arXiv Detail & Related papers (2025-07-15T16:33:01Z)
Robust Filtering -- Novel Statistical Learning and Inference Algorithms with Applications [0.0]
State estimation serves as a fundamental task to enable intelligent decision-making in applications such as autonomous vehicles, robotics, healthcare monitoring, smart grids, intelligent transportation, and predictive maintenance. Standard filtering assumes prior knowledge of noise statistics to extract latent system states from noisy sensor data. Real-world scenarios involve abnormalities like outliers, biases, drifts, and missing observations with unknown or partially known statistics, limiting conventional approaches. This thesis presents novel robust nonlinear filtering methods to mitigate these challenges.
arXiv Detail & Related papers (2025-06-13T07:30:35Z)
Neural Network Reprogrammability: A Unified Theme on Model Reprogramming, Prompt Tuning, and Prompt Instruction [55.914891182214475]
We introduce neural network reprogrammability as a unifying framework for model adaptation. We present a taxonomy that categorizes such information manipulation approaches across four key dimensions. We also analyze remaining technical challenges and ethical considerations.
arXiv Detail & Related papers (2025-06-05T05:42:27Z)
Variational Bayesian Bow tie Neural Networks with Shrinkage [0.276240219662896]
We build a relaxed version of the standard feed-forward rectified neural network. We employ Polya-Gamma data augmentation tricks to render a conditionally linear and Gaussian model. We derive a variational inference algorithm that avoids distributional assumptions and independence across layers.
arXiv Detail & Related papers (2024-11-17T17:36:30Z)
Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion. Our method achieves competitive accuracy performance, even with absolute superiority of zero exemplar buffer and 1.02x the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z)
Ensemble Kalman Filtering Meets Gaussian Process SSM for Non-Mean-Field and Online Inference [47.460898983429374]
We introduce an ensemble Kalman filter (EnKF) into the non-mean-field (NMF) variational inference framework to approximate the posterior distribution of the latent states. This novel marriage between EnKF and GPSSM not only eliminates the need for extensive parameterization in learning variational distributions, but also enables an interpretable, closed-form approximation of the evidence lower bound (ELBO). We demonstrate that the resulting EnKF-aided online algorithm embodies a principled objective function by ensuring data-fitting accuracy while incorporating model regularizations to mitigate overfitting.
arXiv Detail & Related papers (2023-12-10T15:22:30Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results. Our proposed models apply the variational concept to inject stochasticity in the latent space of the deep architecture. Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Multitarget Tracking with Transformers [21.81266872964314]
Multitarget Tracking (MTT) is a problem of tracking the states of an unknown number of objects using noisy measurements. In this paper, we propose a high-performing deep-learning method for MTT based on the Transformer architecture.
arXiv Detail & Related papers (2021-04-01T19:14:55Z)
Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model. This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs). The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.