Bilevel Online Deep Learning in Non-stationary Environment
- URL: http://arxiv.org/abs/2201.10236v1
- Date: Tue, 25 Jan 2022 11:05:51 GMT
- Title: Bilevel Online Deep Learning in Non-stationary Environment
- Authors: Ya-nan Han, Jian-wei Liu, Bing-biao Xiao, Xin-Tan Wang, Xiong-lin Luo
- Abstract summary: The Bilevel Online Deep Learning (BODL) framework combines a bilevel optimization strategy with an online ensemble classifier.
When concept drift is detected, the BODL algorithm adaptively updates the model parameters via bilevel optimization, circumventing large drift and encouraging positive transfer.
- Score: 4.565872584112864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have witnessed enormous progress in online learning. However, a major challenge on the road to artificial agents is concept drift: when data instances arrive sequentially in a stream, the underlying probability distribution of the data can change, leading to catastrophic forgetting and degraded model performance. In this paper, we propose a new Bilevel Online Deep Learning (BODL) framework that combines a bilevel optimization strategy with an online ensemble classifier. The BODL algorithm uses an ensemble classifier that builds multiple base classifiers from the outputs of different hidden layers of a deep neural network; the importance weights of the base classifiers are updated online via the exponential gradient descent method. In addition, we apply a similarity constraint to overcome the convergence problem of the online ensemble framework. An effective concept drift detection mechanism based on the classifier's error rate is then designed to monitor changes in the data distribution. When concept drift is detected, the BODL algorithm adaptively updates the model parameters via bilevel optimization, circumventing large drift and encouraging positive transfer. Finally, extensive experiments and ablation studies conducted on various datasets yield competitive numerical results, illustrating that BODL is a promising approach.
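To make the ensemble construction concrete, here is a minimal sketch of a network whose hidden layers each feed a base classifier, with the mixture weights updated by an exponential-gradient (Hedge-style) rule as the abstract describes. The class and hyperparameter names (HedgedMLP, eta) are illustrative assumptions, not the paper's own code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HedgedMLP(nn.Module):
    """Sketch: each hidden layer feeds its own base classifier ("head");
    the heads are mixed with weights alpha kept on the probability simplex."""

    def __init__(self, d_in, d_hidden, n_classes, n_layers=3, eta=0.1):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Linear(d_in if i == 0 else d_hidden, d_hidden) for i in range(n_layers)]
        )
        self.heads = nn.ModuleList([nn.Linear(d_hidden, n_classes) for _ in range(n_layers)])
        self.register_buffer("alpha", torch.full((n_layers,), 1.0 / n_layers))
        self.eta = eta  # exponential-gradient step size (assumed value)

    def forward(self, x):
        logits, h = [], x
        for block, head in zip(self.blocks, self.heads):
            h = torch.relu(block(h))
            logits.append(head(h))
        return logits  # one set of logits per base classifier

    def predict(self, x):
        # alpha-weighted mixture of the per-head predictive distributions
        probs = torch.stack([F.softmax(l, dim=-1) for l in self(x)])
        return (self.alpha.view(-1, 1, 1) * probs).sum(dim=0)

    def update_alpha(self, losses):
        # Hedge / exponential-gradient update from per-head losses, renormalized.
        w = self.alpha * torch.exp(-self.eta * losses.detach())
        self.alpha = w / w.sum()
```

In this reading, per-head cross-entropy losses on each incoming example would drive both update_alpha and an ordinary backpropagation step on the network weights.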
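The abstract only says that the detector monitors the classifier's error rate; a DDM-style monitor is one plausible instance. The warmup length and the 3-sigma threshold below are the classic DDM defaults, assumed here rather than taken from the paper.

```python
import math

class ErrorRateDriftDetector:
    """DDM-style monitor: flags drift when the running error rate rises
    significantly above its historical minimum."""

    def __init__(self, warmup=30):
        self.n = 0                  # examples seen since the last reset
        self.errors = 0             # mistakes among them
        self.p_min = float("inf")   # best (lowest) error rate so far
        self.s_min = float("inf")   # its standard deviation
        self.warmup = warmup

    def update(self, mistake: bool) -> bool:
        self.n += 1
        self.errors += int(mistake)
        p = self.errors / self.n
        s = math.sqrt(p * (1.0 - p) / self.n)
        if self.n < self.warmup:
            return False
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + 3.0 * self.s_min:  # classic 3-sigma drift rule
            # Reset; this is where BODL would trigger its bilevel update.
            self.n = self.errors = 0
            self.p_min = self.s_min = float("inf")
            return True
        return False
```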
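Likewise, the abstract does not spell out the bilevel update that fires on drift. The following MAML-flavoured sketch, in which an inner gradient step adapts to recent data and an outer step differentiates through it against a validation buffer, is one reasonable reading; the buffer choices and step sizes (inner_lr, outer_lr) are assumptions.

```python
import torch
from torch.func import functional_call

def bilevel_adapt(model, loss_fn, recent_batch, val_batch,
                  inner_lr=1e-2, outer_lr=1e-3):
    """One bilevel step: inner adaptation on recent data, outer correction
    from a validation buffer, differentiated through the inner step."""
    names = [n for n, _ in model.named_parameters()]
    params = [p for _, p in model.named_parameters()]
    rx, ry = recent_batch
    vx, vy = val_batch
    # Inner problem: a gradient step on the most recent (post-drift) data.
    inner_loss = loss_fn(model(rx), ry)
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    adapted = {n: p - inner_lr * g for n, p, g in zip(names, params, grads)}
    # Outer problem: validation loss of the adapted parameters,
    # differentiated back through the inner update.
    outer_loss = loss_fn(functional_call(model, adapted, (vx,)), vy)
    outer_grads = torch.autograd.grad(outer_loss, params)
    with torch.no_grad():
        for p, g in zip(params, outer_grads):
            p.sub_(outer_lr * g)
```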
Related papers
- A Bayesian Approach to Data Point Selection [24.98069363998565]
Data point selection (DPS) is becoming a critical topic in deep learning.
Existing approaches to DPS are predominantly based on a bi-level optimisation (BLO) formulation.
We propose a novel Bayesian approach to DPS.
arXiv Detail & Related papers (2024-11-06T09:04:13Z)
- Stabilizing Linear Passive-Aggressive Online Learning with Weighted Reservoir Sampling [46.01254613933967]
Online learning methods are still highly effective for high-dimensional streaming data, out-of-core processing, and other throughput-sensitive applications.
Many such algorithms rely on fast adaptation to individual errors as a key to their convergence.
While such algorithms enjoy low theoretical regret, in real-world deployment they can be sensitive to individual outliers that cause the algorithm to over-correct.
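For readers unfamiliar with the passive-aggressive family this summary refers to, a PA-I step looks roughly as follows. The closed-form step size is Crammer et al.'s PA-I rule, and it shows why a single outlier can force a large correction.

```python
import numpy as np

def pa1_update(w, x, y, C=1.0):
    """One PA-I step: w is the weight vector, x the features, y in {-1, +1}."""
    loss = max(0.0, 1.0 - y * float(w @ x))      # hinge loss on this one example
    tau = min(C, loss / (float(x @ x) + 1e-12))  # step size, capped by aggressiveness C
    return w + tau * y * x                       # fully corrects toward this one point
```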
arXiv Detail & Related papers (2024-10-31T03:35:48Z)
- Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems.
This work considers AD in network flows using incomplete measurements.
We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective.
Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
- Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST).
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
- Beyond Deep Ensembles: A Large-Scale Evaluation of Bayesian Deep Learning under Distribution Shift [19.945634052291542]
We evaluate modern BDL algorithms on real-world datasets from the WILDS collection containing challenging classification and regression tasks.
We compare the algorithms on a wide range of large, convolutional and transformer-based neural network architectures.
We provide the first systematic evaluation of BDL for fine-tuning large pre-trained models.
arXiv Detail & Related papers (2023-06-21T14:36:03Z)
- Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion: the need to feed whole datasets to the unrolled optimizers, and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z)
- The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP).
Unlike FF, our framework directly outputs label distributions at each cascaded block and thus does not require generating additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
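A toy rendering of the block-local training this summary describes might look like the following; the layer sizes, losses, and optimizers are placeholders, not the paper's actual design.

```python
import torch
import torch.nn as nn

blocks = nn.ModuleList([
    nn.Sequential(nn.Linear(784, 256), nn.ReLU()),
    nn.Sequential(nn.Linear(256, 256), nn.ReLU()),
])
heads = nn.ModuleList([nn.Linear(256, 10) for _ in blocks])  # per-block label predictors
opts = [torch.optim.SGD(list(b.parameters()) + list(h.parameters()), lr=1e-2)
        for b, h in zip(blocks, heads)]
ce = nn.CrossEntropyLoss()

def train_step(x, y):
    h = x
    for block, head, opt in zip(blocks, heads, opts):
        h = block(h.detach())   # detach: no gradient crosses block boundaries
        loss = ce(head(h), y)   # every block predicts the labels directly
        opt.zero_grad()
        loss.backward()
        opt.step()
```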
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- A Differential Game Theoretic Neural Optimizer for Training Residual Networks [29.82841891919951]
We propose a generalized Differential Dynamic Programming (DDP) neural architecture that accepts both residual connections and convolution layers.
The resulting optimal control representation admits a game-theoretic perspective, in which training residual networks can be interpreted as cooperative trajectory optimization on state-augmented systems.
arXiv Detail & Related papers (2020-07-17T10:19:17Z)
- An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, in which the time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d).
This nested system of two flows stabilizes training and provably addresses the vanishing/exploding gradient problem.
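As a bare-bones illustration of a matrix flow that stays on O(d), one can multiply by the matrix exponential of a skew-symmetric generator at each step; in ODEtoODE that generator comes from a learned auxiliary flow, which is replaced by random noise in this sketch.

```python
import numpy as np
from scipy.linalg import expm

d, dt = 4, 0.1
W = np.eye(d)                         # start on the orthogonal group O(d)
rng = np.random.default_rng(0)
for _ in range(10):
    A = rng.standard_normal((d, d))   # stand-in for the auxiliary flow's output
    S = A - A.T                       # skew-symmetric, so expm(dt * S) is orthogonal
    W = W @ expm(dt * S)              # one step of the matrix flow on O(d)
print(np.allclose(W.T @ W, np.eye(d)))  # True: orthogonality is preserved
```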
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)