Supervised Learning in the Presence of Concept Drift: A modelling
framework
- URL: http://arxiv.org/abs/2005.10531v2
- Date: Sat, 27 Feb 2021 20:45:45 GMT
- Title: Supervised Learning in the Presence of Concept Drift: A modelling
framework
- Authors: Michiel Straat, Fthi Abadi, Zhuoyun Kan, Christina Göpfert, Barbara
Hammer, Michael Biehl
- Abstract summary: We present a modelling framework for the investigation of supervised learning in non-stationary environments.
We model two example types of learning systems: prototype-based Learning Vector Quantization (LVQ) for classification and shallow, layered neural networks for regression tasks.
- Score: 5.22609266390809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a modelling framework for the investigation of supervised learning
in non-stationary environments. Specifically, we model two example types of
learning systems: prototype-based Learning Vector Quantization (LVQ) for
classification and shallow, layered neural networks for regression tasks. We
investigate so-called student-teacher scenarios in which the systems are
trained from a stream of high-dimensional, labeled data. Properties of the
target task are considered to be non-stationary due to drift processes while
the training is performed. Different types of concept drift are studied, which
affect the density of example inputs only, the target rule itself, or both. By
applying methods from statistical physics, we develop a modelling framework for
the mathematical analysis of the training dynamics in non-stationary
environments.
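As a rough illustration of this setting (a simulation sketch, not the statistical-physics analysis performed in the paper), the following Python snippet generates a stream of high-dimensional labeled examples from two Gaussian clusters whose class-conditional centres perform a slow random walk, so that the target rule drifts while training proceeds. All function names, parameters, and the particular drift process are illustrative assumptions rather than the paper's model.

```python
import numpy as np

def drifting_stream(n_steps, dim=100, prior_plus=0.5, cluster_sep=1.0,
                    noise=1.0, drift_rate=0.01, seed=0):
    """Yield (x, y) pairs from two Gaussian clusters with drifting centres.

    'Real' drift: the class-conditional centres b_plus / b_minus perform a
    slow random walk, so the target rule changes while training proceeds.
    Changing prior_plus over time would instead model drift that affects
    only the density of example inputs.
    Illustrative sketch; names and parameters are not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    b_plus = rng.normal(size=dim)
    b_plus /= np.linalg.norm(b_plus)
    b_minus = rng.normal(size=dim)
    b_minus /= np.linalg.norm(b_minus)
    for _ in range(n_steps):
        y = 1 if rng.random() < prior_plus else -1
        centre = cluster_sep * (b_plus if y == 1 else b_minus)
        yield centre + noise * rng.normal(size=dim), y
        # random-walk drift of both centres, rescaled to unit length
        b_plus += drift_rate * rng.normal(size=dim) / np.sqrt(dim)
        b_plus /= np.linalg.norm(b_plus)
        b_minus += drift_rate * rng.normal(size=dim) / np.sqrt(dim)
        b_minus /= np.linalg.norm(b_minus)
```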
Our results show that standard LVQ algorithms are already suitable for
training in non-stationary environments to a certain extent. However, the
application of weight decay as an explicit mechanism of forgetting does not
improve the performance under the considered drift processes. Furthermore, we
investigate gradient-based training of layered neural networks with sigmoidal
activation functions and compare with the use of rectified linear units (ReLU).
Our findings show that the sensitivity to concept drift and the effectiveness
of weight decay differ significantly between the two types of activation
function.
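To make the LVQ result concrete, here is a minimal sketch of online LVQ1 on such a labelled stream, with weight decay included as an explicit forgetting mechanism that shrinks the prototypes after every update. The learning rate, the initialisation, and the drifting_stream helper from the sketch above are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def online_lvq1(stream, dim=100, eta=0.05, weight_decay=0.0, seed=1):
    """Online LVQ1 on a stream of (x, y) pairs with optional weight decay.

    weight_decay > 0 multiplies all prototypes by (1 - weight_decay) after
    each update, i.e. an explicit mechanism of forgetting older examples.
    Returns the running misclassification rate over the stream.
    """
    rng = np.random.default_rng(seed)
    w = {+1: 0.01 * rng.normal(size=dim), -1: 0.01 * rng.normal(size=dim)}
    errors = 0
    t = 0
    for t, (x, y) in enumerate(stream, start=1):
        # winner-takes-all: the prototype closest to the input
        winner = min(w, key=lambda c: np.sum((x - w[c]) ** 2))
        errors += int(winner != y)
        # LVQ1 update: attract the winner if its label matches, repel otherwise
        sign = 1.0 if winner == y else -1.0
        w[winner] += eta * sign * (x - w[winner])
        if weight_decay > 0.0:
            for c in w:
                w[c] *= 1.0 - weight_decay
    return errors / max(t, 1)

# Example usage (with the illustrative stream generator above):
# err = online_lvq1(drifting_stream(5000), weight_decay=1e-3)
```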
Related papers
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
However, transferring the pretrained models to downstream tasks may encounter task discrepancy due to the formulation of pretraining as image classification or object discrimination tasks.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z) - Diffusion-Model-Assisted Supervised Learning of Generative Models for
Density Estimation [10.793646707711442]
We present a framework for training generative models for density estimation.
We use the score-based diffusion model to generate labeled data.
Once the labeled data are generated, we can train a simple fully connected neural network to learn the generative model in a supervised manner.
arXiv Detail & Related papers (2023-10-22T23:56:19Z) - Loss Dynamics of Temporal Difference Reinforcement Learning [36.772501199987076]
We study the learning curves for temporal difference learning of a value function with linear function approximators.
We study how learning dynamics and plateaus depend on feature structure, learning rate, discount factor, and reward function.
arXiv Detail & Related papers (2023-07-10T18:17:50Z) - Kalman Filter for Online Classification of Non-Stationary Data [101.26838049872651]
In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps.
We introduce a probabilistic Bayesian online learning model by using a neural representation and a state space model over the linear predictor weights.
In experiments in multi-class classification we demonstrate the predictive ability of the model and its flexibility to capture non-stationarity.
arXiv Detail & Related papers (2023-06-14T11:41:42Z) - Towards Foundation Models for Scientific Machine Learning:
Characterizing Scaling and Transfer Behavior [32.74388989649232]
We study how pre-training could be used for scientific machine learning (SciML) applications.
We find that fine-tuning these models yields more performance gains as model size increases.
arXiv Detail & Related papers (2023-06-01T00:32:59Z) - ConCerNet: A Contrastive Learning Based Framework for Automated
Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z) - Imitating Deep Learning Dynamics via Locally Elastic Stochastic
Differential Equations [20.066631203802302]
We study the evolution of features during deep learning training using a set of stochastic differential equations (SDEs), each of which corresponds to a training sample.
Our results shed light on the decisive role of local elasticity in the training dynamics of neural networks.
arXiv Detail & Related papers (2021-10-11T17:17:20Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Automatic Recall Machines: Internal Replay, Continual Learning and the
Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z) - Gradients as Features for Deep Representation Learning [26.996104074384263]
We address the problem of deep representation learning--the efficient adaptation of a pre-trained deep network to different tasks.
Our key innovation is the design of a linear model that incorporates both gradient and activation of the pre-trained network.
We present an efficient algorithm for the training and inference of our model without computing the actual gradient.
arXiv Detail & Related papers (2020-04-12T02:57:28Z)