Kalman Filter for Online Classification of Non-Stationary Data
- URL: http://arxiv.org/abs/2306.08448v1
- Date: Wed, 14 Jun 2023 11:41:42 GMT
- Title: Kalman Filter for Online Classification of Non-Stationary Data
- Authors: Michalis K. Titsias, Alexandre Galashov, Amal Rannen-Triki, Razvan
Pascanu, Yee Whye Teh, Jorg Bornschein
- Abstract summary: In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps.
We introduce a probabilistic Bayesian online learning model by using a neural representation and a state space model over the linear predictor weights.
In experiments in multi-class classification we demonstrate the predictive ability of the model and its flexibility to capture non-stationarity.
- Score: 101.26838049872651
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Online Continual Learning (OCL) a learning system receives a stream of
data and sequentially performs prediction and training steps. Important
challenges in OCL are concerned with automatic adaptation to the particular
non-stationary structure of the data, and with quantification of predictive
uncertainty. Motivated by these challenges we introduce a probabilistic
Bayesian online learning model by using a (possibly pretrained) neural
representation and a state space model over the linear predictor weights.
Non-stationarity over the linear predictor weights is modelled using a
parameter drift transition density, parametrized by a coefficient that
quantifies forgetting. Inference in the model is implemented with efficient
Kalman filter recursions which track the posterior distribution over the linear
weights, while online SGD updates of the transition dynamics coefficient
allow the model to adapt to the non-stationarity seen in the data. While the
framework is developed assuming a linear Gaussian model, we also extend it to
handle classification problems and to fine-tune the deep learning representation.
In a set of experiments in multi-class classification using data sets such as
CIFAR-100 and CLOC we demonstrate the predictive ability of the model and its
flexibility to capture non-stationarity.
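The Kalman filter recursions described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a scalar-output linear Gaussian model in which the weights drift as w_t = gamma * w_{t-1} + noise, and the values gamma, q, and r are illustrative placeholders (in the paper, the drift coefficient itself is adapted online via SGD).

```python
import numpy as np

def kalman_step(mean, cov, x, y, gamma=0.99, q=1e-3, r=0.1):
    """One predict/update recursion for drifting linear weights.

    Transition:  w_t = gamma * w_{t-1} + eps,  eps ~ N(0, q I)
    Observation: y_t = x_t @ w_t + nu,         nu ~ N(0, r)
    """
    d = len(mean)
    # Predict: the drift shrinks the mean and inflates the covariance,
    # which implements gradual forgetting of old data.
    mean_pred = gamma * mean
    cov_pred = gamma**2 * cov + q * np.eye(d)
    # Update with the new observation (x, y).
    s = x @ cov_pred @ x + r      # predictive variance of y
    k = cov_pred @ x / s          # Kalman gain
    mean_new = mean_pred + k * (y - x @ mean_pred)
    cov_new = cov_pred - np.outer(k, x) @ cov_pred
    return mean_new, cov_new

# Track a slowly drifting weight vector online.
rng = np.random.default_rng(0)
d = 5
w_true = rng.normal(size=d)
mean, cov = np.zeros(d), np.eye(d)
for t in range(500):
    w_true = 0.999 * w_true + 0.01 * rng.normal(size=d)  # drifting target
    x = rng.normal(size=d)
    y = x @ w_true + 0.1 * rng.normal()
    mean, cov = kalman_step(mean, cov, x, y)
```

Because the transition density only rescales and adds isotropic noise, each step is a standard rank-one Kalman update, so the posterior over the weights is tracked in closed form at every prediction step.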
Related papers
- Semi-supervised Regression Analysis with Model Misspecification and High-dimensional Data [8.619243141968886]
We present an inference framework for estimating regression coefficients in conditional mean models.
We develop an augmented inverse probability weighted (AIPW) method, employing regularized estimators for both propensity score (PS) and outcome regression (OR) models.
Our theoretical findings are verified through extensive simulation studies and a real-world data application.
arXiv Detail & Related papers (2024-06-20T00:34:54Z)
- Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
arXiv Detail & Related papers (2024-04-26T06:00:27Z)
- Bayes Risk Consistency of Nonparametric Classification Rules for Spike Trains Data [4.047840018793636]
Spike train data have a growing range of applications in computational neuroscience, imaging, streaming data, and finance.
Machine learning strategies for spike trains are based on various neural network and probabilistic models.
In this paper we consider the two-class statistical classification problem for a class of spike train data characterized by nonparametrically specified intensity functions.
arXiv Detail & Related papers (2023-08-09T08:34:46Z)
- Transfer Learning with Uncertainty Quantification: Random Effect Calibration of Source to Target (RECaST) [1.8047694351309207]
We develop a statistical framework for model predictions based on transfer learning, called RECaST.
We mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models.
We examine our method's performance in a simulation study and in an application to real hospital data.
arXiv Detail & Related papers (2022-11-29T19:39:47Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Recurrent Neural Network Training with Convex Loss and Regularization Functions by Extended Kalman Filtering [0.20305676256390928]
We show that the learning method outperforms gradient descent in a nonlinear system identification benchmark.
We also explore the use of the algorithm in data-driven nonlinear model predictive control and its relation with disturbance models for offset-free tracking.
arXiv Detail & Related papers (2021-11-04T07:49:15Z)
- KalmanNet: Neural Network Aided Kalman Filtering for Partially Known Dynamics [84.18625250574853]
We present KalmanNet, a real-time state estimator that learns from data to carry out Kalman filtering under non-linear dynamics.
We numerically demonstrate that KalmanNet overcomes nonlinearities and model mismatch, outperforming classic filtering methods.
arXiv Detail & Related papers (2021-07-21T12:26:46Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Online Tensor-Based Learning for Multi-Way Data [1.0953917735844645]
A new efficient tensor-based feature extraction method, named NeSGD, is proposed for online CANDECOMP/PARAFAC decomposition.
Results show that the proposed methods significantly improved the classification error rates, were able to assimilate the changes in the positive data distribution over time, and maintained a high predictive accuracy in all case studies.
arXiv Detail & Related papers (2020-03-10T02:04:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.