Online Distillation with Continual Learning for Cyclic Domain Shifts
- URL: http://arxiv.org/abs/2304.01239v1
- Date: Mon, 3 Apr 2023 11:15:05 GMT
- Title: Online Distillation with Continual Learning for Cyclic Domain Shifts
- Authors: Joachim Houyon, Anthony Cioppa, Yasir Ghunaim, Motasem Alfarra,
Ana\"is Halin, Maxim Henry, Bernard Ghanem, Marc Van Droogenbroeck
- Abstract summary: We propose a solution by leveraging the power of continual learning methods to reduce the impact of domain shifts.
Our work represents an important step forward in the field of online distillation and continual learning, with the potential to significantly impact real-world applications.
- Score: 52.707212371912476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, online distillation has emerged as a powerful technique for
adapting real-time deep neural networks on the fly using a slow, but accurate
teacher model. However, a major challenge in online distillation is
catastrophic forgetting when the domain shifts, which occurs when the student
model is updated with data from the new domain and forgets previously learned
knowledge. In this paper, we propose a solution to this issue by leveraging the
power of continual learning methods to reduce the impact of domain shifts.
Specifically, we integrate several state-of-the-art continual learning methods
in the context of online distillation and demonstrate their effectiveness in
reducing catastrophic forgetting. Furthermore, we provide a detailed analysis
of our proposed solution in the case of cyclic domain shifts. Our experimental
results demonstrate the efficacy of our approach in improving the robustness
and accuracy of online distillation, with potential applications in domains
such as video surveillance or autonomous driving. Overall, our work represents
an important step forward in the field of online distillation and continual
learning, with the potential to significantly impact real-world applications.
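The abstract does not include code, but the setup it describes can be sketched: a slow, accurate teacher pseudo-labels the incoming stream, a fast student is updated online on those pseudo-labels, and a continual learning method limits forgetting across domain shifts. The block below is a minimal illustration only, assuming PyTorch and an experience-replay buffer as the continual learning component; the paper integrates several state-of-the-art methods not shown here, and all model sizes, the buffer policy, and hyperparameters are placeholder assumptions.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins for the real networks: a fast student updated online and a slow,
# frozen teacher that provides pseudo-labels (sizes are illustrative only).
student = nn.Linear(128, 10)
teacher = nn.Linear(128, 10).eval()
optimizer = torch.optim.SGD(student.parameters(), lr=1e-2)

replay_buffer, BUFFER_SIZE, REPLAY_BATCH = [], 512, 16

def online_step(frames):
    """One online-distillation update on a batch of incoming frames."""
    with torch.no_grad():
        pseudo = teacher(frames).argmax(dim=1)  # teacher pseudo-labels

    inputs, labels = frames, pseudo
    if replay_buffer:  # mix in samples from previously seen domains
        past = random.sample(replay_buffer, min(REPLAY_BATCH, len(replay_buffer)))
        inputs = torch.cat([inputs, torch.stack([x for x, _ in past])])
        labels = torch.cat([labels, torch.stack([y for _, y in past])])

    loss = F.cross_entropy(student(inputs), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Store current samples for future replay, evicting at random when full.
    for x, y in zip(frames, pseudo):
        replay_buffer.append((x, y))
        if len(replay_buffer) > BUFFER_SIZE:
            replay_buffer.pop(random.randrange(len(replay_buffer)))
    return loss.item()

# Toy stream standing in for frames from a cyclically shifting domain.
for _ in range(5):
    online_step(torch.randn(8, 128))
```

Replay is only one option; a regularization-based method (e.g., an EWC-style penalty) could be slotted into the same loop by adding a penalty term to `loss` before the backward pass.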
Related papers
- Evaluating the Effectiveness of Video Anomaly Detection in the Wild: Online Learning and Inference for Real-world Deployment [2.1374208474242815]
Video Anomaly Detection (VAD) identifies unusual activities in video streams, a key technology with broad applications ranging from surveillance to healthcare.
Tackling VAD in real-life settings poses significant challenges due to the dynamic nature of human actions, environmental variations, and domain shifts.
Online learning is a potential strategy to mitigate this issue by allowing models to adapt to new information continuously.
arXiv Detail & Related papers (2024-04-29T14:47:32Z)
- Improving Online Continual Learning Performance and Stability with Temporal Ensembles [30.869268130955145]
We study the effect of model ensembling as a way to improve performance and stability in online continual learning.
We use a lightweight temporal ensemble that computes the exponential moving average (EMA) of the weights at test time; a minimal sketch follows this entry.
arXiv Detail & Related papers (2023-06-29T09:53:24Z)
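A minimal sketch of the temporal-ensemble idea summarized in the entry above, assuming PyTorch: an exponential moving average of the online learner's weights is maintained after every update and used for inference. The decay value and the toy module are illustrative assumptions, not the paper's settings.

```python
import copy
import torch
import torch.nn as nn

def ema_update(ema_model: nn.Module, model: nn.Module, decay: float = 0.99) -> None:
    """Exponential moving average of weights: ema <- decay*ema + (1-decay)*online."""
    with torch.no_grad():
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

online_net = nn.Linear(16, 4)                 # the continually updated model
ema_net = copy.deepcopy(online_net).eval()    # temporal ensemble used at test time

# After every online update step:
ema_update(ema_net, online_net, decay=0.99)
prediction = ema_net(torch.randn(1, 16))      # evaluate with the EMA weights
```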
- On effects of Knowledge Distillation on Transfer Learning [0.0]
We propose a machine learning architecture we call TL+KD that combines knowledge distillation with transfer learning.
We show that, by using guidance and knowledge from a larger teacher network during fine-tuning, the student network achieves better validation performance, such as higher accuracy; a generic distillation-loss sketch follows this entry.
arXiv Detail & Related papers (2022-10-18T08:11:52Z)
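The TL+KD entry above fine-tunes a student under a larger teacher's guidance; its summary does not spell out the objective, so the following is only a generic knowledge-distillation loss (hard-label cross-entropy plus a temperature-softened KL term), with the temperature and mixing weight chosen arbitrarily for illustration.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Standard KD objective: CE on ground-truth labels + KL to the softened teacher."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)  # rescale so gradients keep a comparable magnitude
    return alpha * hard + (1.0 - alpha) * soft

# Toy usage with random logits and labels.
loss = distillation_loss(torch.randn(8, 10), torch.randn(8, 10), torch.randint(0, 10, (8,)))
```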
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking [58.14267480293575]
We propose a simple yet effective online learning approach for few-shot online adaptation without requiring offline training.
It allows an in-built memory retention mechanism for the model to remember the knowledge about the object seen before.
We evaluate our approach based on two networks in the online learning families for tracking, i.e., multi-layer perceptrons in RT-MDNet and convolutional neural networks in DiMP.
arXiv Detail & Related papers (2021-12-28T06:51:18Z)
- Online Adversarial Distillation for Graph Neural Networks [40.746598033413086]
Knowledge distillation is a technique to improve the model generalization ability on convolutional neural networks.
In this paper, we propose an online adversarial distillation approach to train a group of graph neural networks.
arXiv Detail & Related papers (2021-12-28T02:30:11Z)
- TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z)
- Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data [101.6195176510611]
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online; a minimal test-then-train sketch follows this entry.
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
arXiv Detail & Related papers (2021-08-20T06:17:20Z)
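A minimal sketch of the test-then-train protocol described in the entry above: every incoming batch is first evaluated with the current model and only then used for a training step, so the reported online accuracy always reflects unseen data. The model, data stream, and metric here are placeholders, not the benchmark itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(32, 5)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

correct = total = 0
for _ in range(10):                                  # stream of incoming batches
    x, y = torch.randn(16, 32), torch.randint(0, 5, (16,))

    with torch.no_grad():                            # 1) test on the batch first
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()

    loss = F.cross_entropy(model(x), y)              # 2) then train on the same batch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"online accuracy: {correct / total:.3f}")
```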
- Domain Adaptive Knowledge Distillation for Driving Scene Semantic Segmentation [9.203485172547824]
We present a novel approach to learn domain adaptive knowledge in models with limited memory.
We propose a multi-level distillation strategy to effectively distil knowledge at different levels.
We carry out extensive experiments and ablation studies on real-to-real as well as synthetic-to-real scenarios.
arXiv Detail & Related papers (2020-11-03T03:01:09Z)
- Understanding the Role of Training Regimes in Continual Learning [51.32945003239048]
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially.
We study the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima.
arXiv Detail & Related papers (2020-06-12T06:00:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.