Teacher Agent: A Knowledge Distillation-Free Framework for
Rehearsal-based Video Incremental Learning
- URL: http://arxiv.org/abs/2306.00393v3
- Date: Mon, 11 Dec 2023 03:11:12 GMT
- Title: Teacher Agent: A Knowledge Distillation-Free Framework for
Rehearsal-based Video Incremental Learning
- Authors: Shengqin Jiang, Yaoyu Fang, Haokui Zhang, Qingshan Liu, Yuankai Qi,
Yang Yang, Peng Wang
- Abstract summary: Rehearsal-based video incremental learning often employs knowledge distillation to mitigate catastrophic forgetting of previously learned data.
We propose a knowledge distillation-free framework for rehearsal-based video incremental learning called \textit{Teacher Agent}.
- Score: 29.52218286906986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rehearsal-based video incremental learning often employs knowledge
distillation to mitigate catastrophic forgetting of previously learned data.
However, this method faces two major challenges for video tasks: the substantial
computing resources required to load the teacher model and the limited replay
capability of a performance-limited teacher model. To address these problems, we first
propose a knowledge distillation-free framework for rehearsal-based video
incremental learning called \textit{Teacher Agent}. Instead of loading
parameter-heavy teacher networks, we introduce an agent generator that is
either parameter-free or uses only a few parameters to obtain accurate and
reliable soft labels. This method not only greatly reduces the computing
requirements but also circumvents the problem of misleading knowledge caused by
inaccurate predictions of the teacher model. Moreover, we put forward a
self-correction loss that provides an effective regularization signal for
reviewing old knowledge, which in turn alleviates the problem of catastrophic
forgetting. Further, to ensure that the samples in the memory buffer are
memory-efficient and representative, we introduce a unified sampler for
rehearsal-based video incremental learning to mine fixed-length key video
frames. Interestingly, based on the proposed strategies, the network exhibits a
high level of robustness against spatial resolution reduction when compared to
the baseline. Extensive experiments demonstrate the advantages of our method,
yielding significant performance improvements while utilizing only half the
spatial resolution of video clips as network inputs in the incremental phases.
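
As a rough illustration of the idea in the abstract, the sketch below (PyTorch; names such as AgentGenerator, self_correction_loss, and incremental_step are hypothetical, not the authors' code) replaces the teacher network with a parameter-free generator that turns the ground-truth labels of rehearsed samples into smoothed soft labels, and uses a soft cross-entropy term as a stand-in for the paper's self-correction loss; the unified key-frame sampler is not shown.

```python
# Hypothetical sketch of a distillation-free rehearsal step (PyTorch).
# AgentGenerator, self_correction_loss, and incremental_step are
# illustrative stand-ins, not the implementation from the paper.
import torch
import torch.nn.functional as F


class AgentGenerator:
    """Parameter-free stand-in for a teacher network: maps the ground-truth
    labels of rehearsed (old-class) samples to smoothed soft labels, so no
    teacher model has to be loaded or queried."""

    def __init__(self, num_classes: int, smoothing: float = 0.1):
        self.num_classes = num_classes
        self.smoothing = smoothing

    def __call__(self, labels: torch.Tensor) -> torch.Tensor:
        # One-hot targets softened uniformly over the remaining classes.
        soft = torch.full(
            (labels.size(0), self.num_classes),
            self.smoothing / (self.num_classes - 1),
            device=labels.device,
        )
        soft.scatter_(1, labels.unsqueeze(1), 1.0 - self.smoothing)
        return soft


def self_correction_loss(logits: torch.Tensor, soft_labels: torch.Tensor) -> torch.Tensor:
    """Soft cross-entropy against the agent's labels; acts as a regularizer
    when reviewing old-class samples drawn from the memory buffer."""
    log_probs = F.log_softmax(logits, dim=1)
    return -(soft_labels * log_probs).sum(dim=1).mean()


def incremental_step(model, new_batch, buffer_batch, agent, optimizer):
    """One training step: standard cross-entropy on new-task data plus the
    agent-supervised term on rehearsed old-task data."""
    x_new, y_new = new_batch
    x_old, y_old = buffer_batch

    loss_new = F.cross_entropy(model(x_new), y_new)
    loss_old = self_correction_loss(model(x_old), agent(y_old))

    loss = loss_new + loss_old
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the soft labels are derived from stored ground-truth labels rather than a teacher's forward pass, no teacher parameters need to be loaded and the rehearsal supervision cannot be corrupted by inaccurate teacher predictions, which is the motivation stated in the abstract.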
Related papers
- Learning from Stochastic Teacher Representations Using Student-Guided Knowledge Distillation [64.15918654558816]
A self-distillation (SSD) training strategy is introduced for filtering and weighting teacher representations so that the model distills only from task-relevant representations.
Experimental results on real-world affective computing, wearable/biosignal datasets from the UCR Archive, the HAR dataset, and image classification datasets show that the proposed SSD method can outperform state-of-the-art methods.
arXiv Detail & Related papers (2025-04-19T14:08:56Z) - Reducing Catastrophic Forgetting in Online Class Incremental Learning Using Self-Distillation [3.8506666685467343]
In continual learning, previous knowledge is forgotten when a model learns new tasks.
In this paper, we address this problem by acquiring transferable knowledge through self-distillation.
The proposed method outperforms conventional methods in experiments on the CIFAR10, CIFAR100, and MiniImageNet datasets.
arXiv Detail & Related papers (2024-09-17T16:26:33Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for
Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, this is often impractical due to memory constraints or data privacy issues.
As a replacement, data-free data replay methods have been proposed, which invert samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - The Staged Knowledge Distillation in Video Classification: Harmonizing
Student Progress by a Complementary Weakly Supervised Framework [21.494759678807686]
We propose a new weakly supervised learning framework for knowledge distillation in video classification.
Our approach leverages the concept of substage-based learning to distill knowledge based on the combination of student substages and the correlation of corresponding substages.
Our proposed substage-based distillation approach has the potential to inform future research on label-efficient learning for video data.
arXiv Detail & Related papers (2023-07-11T12:10:42Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - Contrastive Losses Are Natural Criteria for Unsupervised Video
Summarization [27.312423653997087]
Video summarization aims to select the most informative subset of frames in a video to facilitate efficient video browsing.
We propose three metrics featuring a desirable key frame: local dissimilarity, global consistency, and uniqueness.
We show that by refining the pre-trained features with a lightweight contrastively learned projection module, the frame-level importance scores can be further improved.
arXiv Detail & Related papers (2022-11-18T07:01:28Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Representation Memorization for Fast Learning New Knowledge without
Forgetting [36.55736909586313]
The ability to quickly learn new knowledge is a big step towards human-level intelligence.
We consider scenarios that require learning new classes or data distributions quickly and incrementally over time.
We propose "Memory-based Hebbian Adaptation" to tackle the two major challenges.
arXiv Detail & Related papers (2021-08-28T07:54:53Z) - Always Be Dreaming: A New Approach for Data-Free Class-Incremental
Learning [73.24988226158497]
We consider the high-impact problem of Data-Free Class-Incremental Learning (DFCIL).
We propose a novel incremental distillation strategy for DFCIL, contributing a modified cross-entropy training and importance-weighted feature distillation.
Our method results in up to a 25.1% increase in final task accuracy (absolute difference) compared to SOTA DFCIL methods for common class-incremental benchmarks.
arXiv Detail & Related papers (2021-06-17T17:56:08Z) - IB-DRR: Incremental Learning with Information-Back Discrete
Representation Replay [4.8666876477091865]
Incremental learning aims to enable machine learning models to continuously acquire new knowledge given new classes.
Saving a subset of training samples of previously seen classes in the memory and replaying them during new training phases is proven to be an efficient and effective way to fulfil this aim.
However, finding a trade-off between the model performance and the number of samples to save for each class is still an open problem for replay-based incremental learning.
arXiv Detail & Related papers (2021-04-21T15:32:11Z) - Remembering for the Right Reasons: Explanations Reduce Catastrophic
Forgetting [100.75479161884935]
We propose a novel training paradigm called Remembering for the Right Reasons (RRR).
RRR stores visual model explanations for each example in the buffer and ensures the model has "the right reasons" for its predictions.
We demonstrate how RRR can be easily added to any memory or regularization-based approach and results in reduced forgetting.
arXiv Detail & Related papers (2020-10-04T10:05:27Z) - Deep Semi-supervised Knowledge Distillation for Overlapping Cervical
Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)