Supervised Learning in the Presence of Concept Drift: A modelling
framework
- URL: http://arxiv.org/abs/2005.10531v2
- Date: Sat, 27 Feb 2021 20:45:45 GMT
- Title: Supervised Learning in the Presence of Concept Drift: A modelling
framework
- Authors: Michiel Straat, Fthi Abadi, Zhuoyun Kan, Christina Göpfert, Barbara
Hammer, Michael Biehl
- Abstract summary: We present a modelling framework for the investigation of supervised learning in non-stationary environments.
We model two example types of learning systems: prototype-based Learning Vector Quantization (LVQ) for classification and shallow, layered neural networks for regression tasks.
- Score: 5.22609266390809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a modelling framework for the investigation of supervised learning
in non-stationary environments. Specifically, we model two example types of
learning systems: prototype-based Learning Vector Quantization (LVQ) for
classification and shallow, layered neural networks for regression tasks. We
investigate so-called student-teacher scenarios in which the systems are
trained from a stream of high-dimensional, labeled data. Properties of the
target task are considered to be non-stationary due to drift processes while
the training is performed. Different types of concept drift are studied, which
affect the density of example inputs only, the target rule itself, or both. By
applying methods from statistical physics, we develop a modelling framework for
the mathematical analysis of the training dynamics in non-stationary
environments.
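As a rough illustration of this setting (a simulation sketch, not the statistical-physics analysis performed in the paper), the following Python snippet generates a stream of high-dimensional labeled examples from two Gaussian clusters whose class-conditional centres perform a slow random walk, so that the target rule drifts while training proceeds. All function names, parameters, and the particular drift process are illustrative assumptions rather than the paper's model.

```python
import numpy as np

def drifting_stream(n_steps, dim=100, prior_plus=0.5, cluster_sep=1.0,
                    noise=1.0, drift_rate=0.01, seed=0):
    """Yield (x, y) pairs from two Gaussian clusters with drifting centres.

    'Real' drift: the class-conditional centres b_plus / b_minus perform a
    slow random walk, so the target rule changes while training proceeds.
    Changing prior_plus over time would instead model drift that affects
    only the density of example inputs.
    Illustrative sketch; names and parameters are not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    b_plus = rng.normal(size=dim)
    b_plus /= np.linalg.norm(b_plus)
    b_minus = rng.normal(size=dim)
    b_minus /= np.linalg.norm(b_minus)
    for _ in range(n_steps):
        y = 1 if rng.random() < prior_plus else -1
        centre = cluster_sep * (b_plus if y == 1 else b_minus)
        yield centre + noise * rng.normal(size=dim), y
        # random-walk drift of both centres, rescaled to unit length
        b_plus += drift_rate * rng.normal(size=dim) / np.sqrt(dim)
        b_plus /= np.linalg.norm(b_plus)
        b_minus += drift_rate * rng.normal(size=dim) / np.sqrt(dim)
        b_minus /= np.linalg.norm(b_minus)
```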
Our results show that standard LVQ algorithms are already suitable for
training in non-stationary environments to a certain extent. However, the
application of weight decay as an explicit mechanism of forgetting does not
improve the performance under the considered drift processes. Furthermore, we
investigate gradient-based training of layered neural networks with sigmoidal
activation functions and compare with the use of rectified linear units (ReLU).
Our findings show that the sensitivity to concept drift and the effectiveness
of weight decay differ significantly between the two types of activation
function.
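To make the LVQ result concrete, here is a minimal sketch of online LVQ1 on such a labelled stream, with weight decay included as an explicit forgetting mechanism that shrinks the prototypes after every update. The learning rate, the initialisation, and the drifting_stream helper from the sketch above are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def online_lvq1(stream, dim=100, eta=0.05, weight_decay=0.0, seed=1):
    """Online LVQ1 on a stream of (x, y) pairs with optional weight decay.

    weight_decay > 0 multiplies all prototypes by (1 - weight_decay) after
    each update, i.e. an explicit mechanism of forgetting older examples.
    Returns the running misclassification rate over the stream.
    """
    rng = np.random.default_rng(seed)
    w = {+1: 0.01 * rng.normal(size=dim), -1: 0.01 * rng.normal(size=dim)}
    errors = 0
    t = 0
    for t, (x, y) in enumerate(stream, start=1):
        # winner-takes-all: the prototype closest to the input
        winner = min(w, key=lambda c: np.sum((x - w[c]) ** 2))
        errors += int(winner != y)
        # LVQ1 update: attract the winner if its label matches, repel otherwise
        sign = 1.0 if winner == y else -1.0
        w[winner] += eta * sign * (x - w[winner])
        if weight_decay > 0.0:
            for c in w:
                w[c] *= 1.0 - weight_decay
    return errors / max(t, 1)

# Example usage (with the illustrative stream generator above):
# err = online_lvq1(drifting_stream(5000), weight_decay=1e-3)
```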
Related papers
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
However, transferring the pretrained models to downstream tasks may encounter task discrepancy due to the formulation of pretraining as image classification or object discrimination tasks.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z) - Diffusion-Model-Assisted Supervised Learning of Generative Models for
Density Estimation [10.793646707711442]
We present a framework for training generative models for density estimation.
We use the score-based diffusion model to generate labeled data.
Once the labeled data are generated, we can train a simple fully connected neural network to learn the generative model in a supervised manner.
arXiv Detail & Related papers (2023-10-22T23:56:19Z) - Loss Dynamics of Temporal Difference Reinforcement Learning [36.772501199987076]
We study the learning curves for temporal difference learning of a value function with linear function approximators.
We study how learning dynamics and plateaus depend on feature structure, learning rate, discount factor, and reward function.
arXiv Detail & Related papers (2023-07-10T18:17:50Z) - Kalman Filter for Online Classification of Non-Stationary Data [101.26838049872651]
In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps.
We introduce a probabilistic Bayesian online learning model by using a neural representation and a state space model over the linear predictor weights.
In experiments in multi-class classification we demonstrate the predictive ability of the model and its flexibility to capture non-stationarity.
arXiv Detail & Related papers (2023-06-14T11:41:42Z) - Towards Foundation Models for Scientific Machine Learning:
Characterizing Scaling and Transfer Behavior [32.74388989649232]
We study how pre-training could be used for scientific machine learning (SciML) applications.
We find that fine-tuning these models yields more performance gains as model size increases.
arXiv Detail & Related papers (2023-06-01T00:32:59Z) - ConCerNet: A Contrastive Learning Based Framework for Automated
Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z) - Imitating Deep Learning Dynamics via Locally Elastic Stochastic
Differential Equations [20.066631203802302]
We study the evolution of features during deep learning training using a set of stochastic differential equations (SDEs), each of which corresponds to a training sample.
Our results shed light on the decisive role of local elasticity in the training dynamics of neural networks.
arXiv Detail & Related papers (2021-10-11T17:17:20Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Automatic Recall Machines: Internal Replay, Continual Learning and the
Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z) - Gradients as Features for Deep Representation Learning [26.996104074384263]
We address the problem of deep representation learning--the efficient adaptation of a pre-trained deep network to different tasks.
Our key innovation is the design of a linear model that incorporates both gradient and activation of the pre-trained network.
We present an efficient algorithm for the training and inference of our model without computing the actual gradient.
arXiv Detail & Related papers (2020-04-12T02:57:28Z)