Kalman Filter for Online Classification of Non-Stationary Data
- URL: http://arxiv.org/abs/2306.08448v1
- Date: Wed, 14 Jun 2023 11:41:42 GMT
- Title: Kalman Filter for Online Classification of Non-Stationary Data
- Authors: Michalis K. Titsias, Alexandre Galashov, Amal Rannen-Triki, Razvan
Pascanu, Yee Whye Teh, Jorg Bornschein
- Abstract summary: In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps.
We introduce a probabilistic Bayesian online learning model by using a neural representation and a state space model over the linear predictor weights.
In experiments in multi-class classification we demonstrate the predictive ability of the model and its flexibility to capture non-stationarity.
- Score: 101.26838049872651
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Online Continual Learning (OCL) a learning system receives a stream of
data and sequentially performs prediction and training steps. Important
challenges in OCL are concerned with automatic adaptation to the particular
non-stationary structure of the data, and with quantification of predictive
uncertainty. Motivated by these challenges we introduce a probabilistic
Bayesian online learning model by using a (possibly pretrained) neural
representation and a state space model over the linear predictor weights.
Non-stationarity over the linear predictor weights is modelled using a
parameter drift transition density, parametrized by a coefficient that
quantifies forgetting. Inference in the model is implemented with efficient
Kalman filter recursions which track the posterior distribution over the linear
weights, while online SGD updates of the transition dynamics coefficient
allow the model to adapt to the non-stationarity seen in the data. While the
framework is developed assuming a linear Gaussian model, we also extend it to
handle classification problems and to fine-tune the deep learning representation.
In a set of experiments in multi-class classification using data sets such as
CIFAR-100 and CLOC we demonstrate the predictive ability of the model and its
flexibility to capture non-stationarity.
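The Kalman filter recursions described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a scalar-output linear Gaussian model in which the weights drift as w_t = gamma * w_{t-1} + noise, and the values gamma, q, and r are illustrative placeholders (in the paper, the drift coefficient itself is adapted online via SGD).

```python
import numpy as np

def kalman_step(mean, cov, x, y, gamma=0.99, q=1e-3, r=0.1):
    """One predict/update recursion for drifting linear weights.

    Transition:  w_t = gamma * w_{t-1} + eps,  eps ~ N(0, q I)
    Observation: y_t = x_t @ w_t + nu,         nu ~ N(0, r)
    """
    d = len(mean)
    # Predict: the drift shrinks the mean and inflates the covariance,
    # which implements gradual forgetting of old data.
    mean_pred = gamma * mean
    cov_pred = gamma**2 * cov + q * np.eye(d)
    # Update with the new observation (x, y).
    s = x @ cov_pred @ x + r      # predictive variance of y
    k = cov_pred @ x / s          # Kalman gain
    mean_new = mean_pred + k * (y - x @ mean_pred)
    cov_new = cov_pred - np.outer(k, x) @ cov_pred
    return mean_new, cov_new

# Track a slowly drifting weight vector online.
rng = np.random.default_rng(0)
d = 5
w_true = rng.normal(size=d)
mean, cov = np.zeros(d), np.eye(d)
for t in range(500):
    w_true = 0.999 * w_true + 0.01 * rng.normal(size=d)  # drifting target
    x = rng.normal(size=d)
    y = x @ w_true + 0.1 * rng.normal()
    mean, cov = kalman_step(mean, cov, x, y)
```

Because the transition density only rescales and adds isotropic noise, each step is a standard rank-one Kalman update, so the posterior over the weights is tracked in closed form at every prediction step.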
Related papers
- Semi-supervised Regression Analysis with Model Misspecification and High-dimensional Data [8.619243141968886]
We present an inference framework for estimating regression coefficients in conditional mean models.
We develop an augmented inverse probability weighted (AIPW) method, employing regularized estimators for both propensity score (PS) and outcome regression (OR) models.
Our theoretical findings are verified through extensive simulation studies and a real-world data application.
arXiv Detail & Related papers (2024-06-20T00:34:54Z)
- Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
arXiv Detail & Related papers (2024-04-26T06:00:27Z)
- Bayes Risk Consistency of Nonparametric Classification Rules for Spike Trains Data [4.047840018793636]
Spike train data have a growing range of applications in computational neuroscience, imaging, streaming data, and finance.
Machine learning strategies for spike trains are based on various neural network and probabilistic models.
In this paper we consider the two-class statistical classification problem for a class of spike train data characterized by nonparametrically specified intensity functions.
arXiv Detail & Related papers (2023-08-09T08:34:46Z)
- Transfer Learning with Uncertainty Quantification: Random Effect Calibration of Source to Target (RECaST) [1.8047694351309207]
We develop a statistical framework for model predictions based on transfer learning, called RECaST.
We mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models.
We examine our method's performance in a simulation study and in an application to real hospital data.
arXiv Detail & Related papers (2022-11-29T19:39:47Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Recurrent Neural Network Training with Convex Loss and Regularization Functions by Extended Kalman Filtering [0.20305676256390928]
We show that the learning method outperforms gradient descent in a nonlinear system identification benchmark.
We also explore the use of the algorithm in data-driven nonlinear model predictive control and its relation with disturbance models for offset-free tracking.
arXiv Detail & Related papers (2021-11-04T07:49:15Z)
- KalmanNet: Neural Network Aided Kalman Filtering for Partially Known Dynamics [84.18625250574853]
We present KalmanNet, a real-time state estimator that learns from data to carry out Kalman filtering under non-linear dynamics.
We numerically demonstrate that KalmanNet overcomes nonlinearities and model mismatch, outperforming classic filtering methods.
arXiv Detail & Related papers (2021-07-21T12:26:46Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Online Tensor-Based Learning for Multi-Way Data [1.0953917735844645]
A new efficient tensor-based feature extraction method, named NeSGD, is proposed for online CANDECOMP/PARAFAC decomposition.
Results show that the proposed methods significantly improved the classification error rates, were able to assimilate the changes in the positive data distribution over time, and maintained a high predictive accuracy in all case studies.
arXiv Detail & Related papers (2020-03-10T02:04:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.