Iterative Network Pruning with Uncertainty Regularization for Lifelong
Sentiment Classification
- URL: http://arxiv.org/abs/2106.11197v1
- Date: Mon, 21 Jun 2021 15:34:13 GMT
- Title: Iterative Network Pruning with Uncertainty Regularization for Lifelong
Sentiment Classification
- Authors: Binzong Geng, Min Yang, Fajie Yuan, Shupeng Wang, Xiang Ao, Ruifeng Xu
- Abstract summary: Lifelong learning is non-trivial for deep neural networks.
We propose a novel iterative network pruning with uncertainty regularization method for lifelong sentiment classification.
- Score: 25.13885692629219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lifelong learning capabilities are crucial for sentiment classifiers to
process continuous streams of opinioned information on the Web. However,
performing lifelong learning is non-trivial for deep neural networks as
continually training of incrementally available information inevitably results
in catastrophic forgetting or interference. In this paper, we propose a novel
iterative network pruning with uncertainty regularization method for lifelong
sentiment classification (IPRLS), which leverages the principles of network
pruning and weight regularization. By performing network pruning with
uncertainty regularization in an iterative manner, IPRLS can adapt a single BERT
model to work with continuously arriving data from multiple domains while
avoiding catastrophic forgetting and interference. Specifically, we leverage an
iterative pruning method to remove redundant parameters in large deep networks
so that the freed-up space can then be employed to learn new tasks, tackling
the catastrophic forgetting problem. Instead of keeping the old tasks' weights fixed
when learning new tasks, we also use an uncertainty regularization based on the
Bayesian online learning framework to constrain the updates of old-task weights
in BERT, which enables positive backward transfer, i.e., learning new tasks
improves performance on past tasks while protecting old knowledge from being
lost. In addition, we propose a task-specific low-dimensional residual function
in parallel to each layer of BERT, which makes IPRLS less prone to losing the
knowledge saved in the base BERT network when learning a new task. Extensive
experiments on 16 popular review corpora demonstrate that the proposed IPRLS
method significantly outperforms strong baselines for lifelong sentiment
classification. For reproducibility, we submit the code and data
at: https://github.com/siat-nlp/IPRLS.
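The abstract only sketches the two core mechanisms, so below is a minimal PyTorch-style illustration of how they might look: magnitude-based iterative pruning that frees the least important weights for future tasks, and a quadratic, precision-weighted penalty standing in for the Bayesian uncertainty regularization on old-task weights. All function names, shapes, and hyper-parameters (prune_fraction, precision, strength) are illustrative assumptions, not details taken from the paper or its released code.

```python
# Minimal sketch (not the authors' released code): magnitude-based iterative
# pruning plus an uncertainty-weighted quadratic penalty, the two ideas the
# abstract describes. Names and hyper-parameters below are illustrative.
import torch


def magnitude_prune(weight: torch.Tensor, free_mask: torch.Tensor,
                    prune_fraction: float) -> torch.Tensor:
    """Release the lowest-magnitude fraction of the currently claimed weights
    so the freed positions can be reused when learning the next task."""
    claimed = ~free_mask                       # positions still held by old tasks
    magnitudes = weight[claimed].abs()
    k = int(prune_fraction * magnitudes.numel())
    if k == 0:
        return free_mask
    threshold = magnitudes.kthvalue(k).values
    newly_freed = claimed & (weight.abs() <= threshold)
    return free_mask | newly_freed


def uncertainty_penalty(weight: torch.Tensor, old_weight: torch.Tensor,
                        precision: torch.Tensor, strength: float = 1.0) -> torch.Tensor:
    """Quadratic penalty that lets old-task weights move, but less so where the
    (approximate) posterior precision marks them as important."""
    return strength * (precision * (weight - old_weight) ** 2).sum()


# Toy usage for a single linear layer (shapes and numbers are placeholders):
layer = torch.nn.Linear(768, 768)
free_mask = torch.zeros_like(layer.weight, dtype=torch.bool)
old_weight = layer.weight.detach().clone()
precision = torch.ones_like(layer.weight)      # stand-in for learned uncertainty

task_loss = layer(torch.randn(4, 768)).pow(2).mean()   # placeholder task loss
loss = task_loss + uncertainty_penalty(layer.weight, old_weight,
                                       precision, strength=1e-3)
loss.backward()

# After training on the current task, free low-magnitude weights for the next one.
free_mask = magnitude_prune(layer.weight.data, free_mask, prune_fraction=0.5)
```

In the paper this procedure is applied iteratively, layer by layer in BERT, once per new domain; the sketch above only shows the per-layer bookkeeping under those simplifying assumptions.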
Related papers
- Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning [22.13331870720021]
We propose a beyond prompt learning approach to the RFCL task, called Continual Adapter (C-ADA)
C-ADA flexibly extends specific weights in CAL to learn new knowledge for each task and freezes old weights to preserve prior knowledge.
Our approach achieves significantly improved performance and training speed, outperforming the current state-of-the-art (SOTA) method.
arXiv Detail & Related papers (2024-07-14T17:40:40Z)
- Fine-Grained Knowledge Selection and Restoration for Non-Exemplar Class Incremental Learning [64.14254712331116]
Non-exemplar class incremental learning aims to learn both the new and old tasks without accessing any training data from the past.
We propose a novel framework of fine-grained knowledge selection and restoration.
arXiv Detail & Related papers (2023-12-20T02:34:11Z) - Negotiated Representations to Prevent Forgetting in Machine Learning
Applications [0.0]
Catastrophic forgetting is a significant challenge in the field of machine learning.
We propose a novel method for preventing catastrophic forgetting in machine learning applications.
arXiv Detail & Related papers (2023-11-30T22:43:50Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- CLR: Channel-wise Lightweight Reprogramming for Continual Learning [63.94773340278971]
Continual learning aims to emulate the human ability to continually accumulate knowledge over sequential tasks.
The main challenge is to maintain performance on previously learned tasks after learning new tasks.
We propose a Channel-wise Lightweight Reprogramming approach that helps convolutional neural networks overcome catastrophic forgetting.
arXiv Detail & Related papers (2023-07-21T06:56:21Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- OER: Offline Experience Replay for Continual Offline Reinforcement Learning [25.985985377992034]
It is desirable for an agent to continually learn new skills from a sequence of pre-collected offline datasets.
In this paper, we formulate a new setting, continual offline reinforcement learning (CORL), where an agent learns a sequence of offline reinforcement learning tasks.
We propose a new model-based experience selection scheme to build the replay buffer, where a transition model is learned to approximate the state distribution.
arXiv Detail & Related papers (2023-05-23T08:16:44Z)
- Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study [68.75670223005716]
We find that pre-trained language models like BERT have a potential ability to learn sequentially, even without any sparse memory replay.
Our experiments reveal that BERT can actually generate high quality representations for previously learned tasks in a long term, under extremely sparse replay or even no replay.
arXiv Detail & Related papers (2023-03-02T09:03:43Z)
- Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer [39.99577526417276]
In continual learning (CL), an agent can improve the learning performance of both new and old tasks.
Most existing CL methods focus on addressing catastrophic forgetting in neural networks by minimizing the modification of the learnt model for old tasks.
We propose a new CL method with Backward knowlEdge tRansfer (CUBER) for a fixed capacity neural network without data replay.
arXiv Detail & Related papers (2022-11-01T23:55:51Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Semantic Drift Compensation for Class-Incremental Learning [48.749630494026086]
Class-incremental learning of deep networks sequentially increases the number of classes to be classified.
We propose a new method to estimate the drift, called semantic drift, of features and compensate for it without the need of any exemplars.
arXiv Detail & Related papers (2020-04-01T13:31:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.