Enhancing Continual Relation Extraction via Classifier Decomposition
- URL: http://arxiv.org/abs/2305.04636v1
- Date: Mon, 8 May 2023 11:29:33 GMT
- Title: Enhancing Continual Relation Extraction via Classifier Decomposition
- Authors: Heming Xia, Peiyi Wang, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang
Sui
- Abstract summary: Continual relation extraction models aim at handling emerging new relations while avoiding forgetting old ones in the streaming data.
Most models adopt only a vanilla training strategy when first learning representations of new relations.
We propose a simple yet effective classifier decomposition framework that splits the last FFN layer into separate previous and current classifiers.
- Score: 30.88081408988638
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual relation extraction (CRE) models aim at handling emerging new
relations while avoiding catastrophically forgetting old ones in the streaming
data. Though previous CRE studies have shown improvements, most of them adopt
only a vanilla training strategy when the model first learns representations
of new relations. In this work, we point out that two typical biases exist
after training with this vanilla strategy: classifier bias and representation
bias, which cause the knowledge the model previously learned to be
overshadowed. To alleviate these biases, we propose a simple yet effective
classifier decomposition framework that splits the last FFN layer into
separate previous and current classifiers, so as to maintain previous
knowledge and encourage the model to learn more robust representations at this
training stage. Experimental results on two standard benchmarks show that our
framework consistently outperforms state-of-the-art CRE models, indicating
that the importance of the first training stage to CRE models may be
underestimated.
Our code is available at https://github.com/hemingkx/CDec.
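The decomposition is easy to picture in code. Below is a minimal PyTorch sketch of a decomposed classifier head (an illustration of the idea only, with made-up tensor sizes; the authors' actual implementation lives in the repository above): the head over previously seen relations is frozen while a new head is trained for the current task, and their logits are concatenated.

```python
import torch
import torch.nn as nn

class DecomposedClassifier(nn.Module):
    """Minimal sketch: a frozen head for old relations plus a trainable
    head for the current task's new relations. Not the authors' code;
    see https://github.com/hemingkx/CDec for the real implementation."""

    def __init__(self, hidden_dim: int, num_old: int, num_new: int):
        super().__init__()
        self.prev_head = nn.Linear(hidden_dim, num_old)  # old-relation weights
        self.curr_head = nn.Linear(hidden_dim, num_new)  # learns new relations
        # Freeze the previous classifier so first-stage training on new
        # relations cannot overwrite old decision boundaries.
        for p in self.prev_head.parameters():
            p.requires_grad = False

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Logits over [old relations | new relations].
        return torch.cat([self.prev_head(h), self.curr_head(h)], dim=-1)

# Usage: 768-dim encoder features, 8 old relations, 4 new ones.
clf = DecomposedClassifier(768, num_old=8, num_new=4)
logits = clf(torch.randn(16, 768))   # shape (16, 12)
```

Freezing the previous head directly targets the classifier bias: gradients from the new relations only flow into the current head during the first training stage.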
Related papers
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
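As a rough sketch of the random-projection idea (hedged: the dimensions are made up, and a simple nearest-prototype readout stands in for the paper's full pipeline):

```python
import torch

torch.manual_seed(0)
feat_dim, proj_dim, num_classes = 768, 4096, 10
W = torch.randn(feat_dim, proj_dim) / feat_dim ** 0.5  # fixed, never trained

def project(features: torch.Tensor) -> torch.Tensor:
    # Nonlinear random projection of frozen pre-trained features.
    return torch.relu(features @ W)

prototypes = torch.zeros(num_classes, proj_dim)
counts = torch.zeros(num_classes)

def update(features: torch.Tensor, labels: torch.Tensor) -> None:
    # Accumulate class prototypes from the stream; no gradient updates,
    # so earlier classes are never overwritten.
    z = project(features)
    for c in labels.unique():
        mask = labels == c
        prototypes[c] += z[mask].sum(0)
        counts[c] += mask.sum()

def predict(features: torch.Tensor) -> torch.Tensor:
    means = prototypes / counts.clamp(min=1).unsqueeze(1)
    return (project(features) @ means.T).argmax(dim=1)

update(torch.randn(32, feat_dim), torch.randint(0, num_classes, (32,)))
print(predict(torch.randn(4, feat_dim)))
```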
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Mitigating Catastrophic Forgetting in Task-Incremental Continual Learning with Adaptive Classification Criterion [50.03041373044267]
We propose a Supervised Contrastive learning framework with adaptive classification criterion for Continual Learning.
Experiments show that CFL achieves state-of-the-art performance and has a stronger ability to overcome catastrophic forgetting than the classification baselines.
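For context, a generic supervised contrastive objective of the kind such a framework builds on looks like the following (a sketch of the standard SupCon loss, not the paper's exact adaptive criterion):

```python
import torch
import torch.nn.functional as F

def supcon_loss(features: torch.Tensor, labels: torch.Tensor, tau: float = 0.1):
    # Pull same-label embeddings together, push different-label ones apart.
    z = F.normalize(features, dim=1)
    sim = z @ z.T / tau                                # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool)
    sim.masked_fill_(self_mask, float('-inf'))         # exclude self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    has_pos = pos.any(dim=1)                           # anchors with positives
    loss = -(log_prob * pos)[has_pos].sum(1) / pos[has_pos].sum(1)
    return loss.mean()

loss = supcon_loss(torch.randn(32, 128), torch.randint(0, 4, (32,)))
```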
arXiv Detail & Related papers (2023-05-20T19:22:40Z)
- Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We conduct empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
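The distillation step can be sketched as follows, with random tensors standing in for CLIP image and text embeddings (real feature extraction from a CLIP encoder is omitted, and the KL-matching form is an assumption about the general recipe, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

# CLIP's zero-shot class probabilities on target images serve as a
# parameter-free teacher. The embeddings below are random placeholders;
# in practice they come from a CLIP image/text encoder.
img_emb = F.normalize(torch.randn(16, 512), dim=1)   # CLIP image features
txt_emb = F.normalize(torch.randn(10, 512), dim=1)   # one prompt per class

teacher_logits = img_emb @ txt_emb.T / 0.01          # CLIP zero-shot scores
teacher_probs = teacher_logits.softmax(dim=1)

student_logits = torch.randn(16, 10, requires_grad=True)  # task model output
# KL distillation: match the student's distribution to CLIP's.
loss = F.kl_div(student_logits.log_softmax(dim=1), teacher_probs,
                reduction='batchmean')
loss.backward()
```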
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
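One reading of the "two wing" normalization idea is a pair of batch-norm branches, one frozen at the pre-trained statistics and one that adapts during fine-tuning; the sketch below illustrates that reading only and is not the paper's exact design:

```python
import torch
import torch.nn as nn

class TwoWingNorm(nn.Module):
    """Loose sketch of a dual-batch-norm 'two wing' module: one wing keeps
    the frozen pre-trained normalization statistics, the other adapts to
    the fine-tuning data."""

    def __init__(self, channels: int):
        super().__init__()
        self.frozen_wing = nn.BatchNorm2d(channels)
        self.adaptive_wing = nn.BatchNorm2d(channels)
        for p in self.frozen_wing.parameters():
            p.requires_grad = False

    def train(self, mode: bool = True):
        super().train(mode)
        self.frozen_wing.eval()   # never update the frozen wing's stats
        return self

    def forward(self, x: torch.Tensor):
        # Return both views; a fine-tuning loss can supervise each wing.
        return self.frozen_wing(x), self.adaptive_wing(x)

norm = TwoWingNorm(64).train()
fixed, adapted = norm(torch.randn(8, 64, 16, 16))
```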
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation [45.87125587600661]
Continual relation extraction (CRE) aims to continually learn new relations from a class-incremental data stream.
CRE models usually suffer from the catastrophic forgetting problem, i.e., performance on old relations degrades seriously when the model learns new relations.
To address this issue, we encourage the model to learn more precise and robust representations through a simple yet effective adversarial class augmentation mechanism.
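A generic way to realize adversarial augmentation is shown below as a stand-in (FGSM-style perturbation of input embeddings; plainly a substitute illustration, not necessarily the paper's class-augmentation mechanism):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def adversarial_augment_loss(encoder, classifier, emb, labels, eps=0.05):
    # Perturb input embeddings in the loss-increasing (FGSM) direction and
    # train on the harder examples under the same labels, encouraging more
    # robust representations.
    emb = emb.detach().requires_grad_(True)
    loss = F.cross_entropy(classifier(encoder(emb)), labels)
    grad, = torch.autograd.grad(loss, emb)
    emb_adv = (emb + eps * grad.sign()).detach()
    return F.cross_entropy(classifier(encoder(emb_adv)), labels)

encoder = nn.Sequential(nn.Linear(128, 128), nn.ReLU())
classifier = nn.Linear(128, 8)
loss = adversarial_augment_loss(encoder, classifier,
                                torch.randn(16, 128),
                                torch.randint(0, 8, (16,)))
loss.backward()
```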
arXiv Detail & Related papers (2022-10-10T08:50:48Z)
- Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning [141.35105358670316]
We study the difference between a naïvely-trained initial-phase model and the oracle model.
We propose Class-wise Decorrelation (CwD) that effectively regularizes representations of each class to scatter more uniformly.
Our CwD is simple to implement and easy to plug into existing methods.
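The regularizer can be sketched directly (a minimal version that reads "scatter more uniformly" as penalizing off-diagonal correlations within each class; not the paper's exact formulation):

```python
import torch

def cwd_penalty(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # For each class, standardize its representations and penalize the
    # off-diagonal entries of their correlation matrix, pushing features
    # to spread uniformly across dimensions.
    penalty = features.new_zeros(())
    classes = labels.unique()
    for c in classes:
        z = features[labels == c]
        z = (z - z.mean(0)) / (z.std(0) + 1e-6)   # standardize per dimension
        corr = z.T @ z / z.shape[0]               # correlation matrix
        off_diag = corr - torch.diag(torch.diagonal(corr))
        penalty = penalty + off_diag.pow(2).mean()
    return penalty / len(classes)

# Added to the usual classification loss as an auxiliary term.
reg = cwd_penalty(torch.randn(64, 32), torch.randint(0, 4, (64,)))
```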
arXiv Detail & Related papers (2021-12-09T07:20:32Z)
- End-to-End Weak Supervision [15.125993628007972]
We propose an end-to-end approach for directly learning the downstream model.
We show improved end-model performance over prior work on downstream test sets.
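A minimal sketch of the end-to-end idea (assumed form: learnable per-source reliabilities aggregate weak-label votes into soft targets for the end model, with everything trained jointly; not the paper's exact parameterization):

```python
import torch
import torch.nn.functional as F

num_sources, num_classes, dim = 5, 3, 64
votes = torch.randint(0, num_classes, (128, num_sources))  # weak labels
x = torch.randn(128, dim)                                  # input features

end_model = torch.nn.Linear(dim, num_classes)
source_logit_w = torch.zeros(num_sources, requires_grad=True)  # reliabilities
opt = torch.optim.Adam(list(end_model.parameters()) + [source_logit_w],
                       lr=1e-2)

for _ in range(100):
    w = source_logit_w.softmax(0)                          # weights sum to 1
    one_hot = F.one_hot(votes, num_classes).float()        # (N, S, C)
    soft_labels = (one_hot * w[None, :, None]).sum(1)      # weighted votes
    log_p = end_model(x).log_softmax(1)
    loss = -(soft_labels * log_p).sum(1).mean()            # soft cross-entropy
    opt.zero_grad(); loss.backward(); opt.step()
```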
arXiv Detail & Related papers (2021-07-05T19:10:11Z)
- Conterfactual Generative Zero-Shot Semantic Segmentation [16.684570608930983]
A popular family of zero-shot semantic segmentation methods is based on generative models.
In this work, we consider counterfactual methods to avoid the confounder in the original model.
Our model is compared with baseline models on two real-world datasets.
arXiv Detail & Related papers (2021-06-11T13:01:03Z)
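The generative baseline this paper starts from can be sketched as follows (a generic generative zero-shot recipe with made-up dimensions; the paper's counterfactual correction is not shown):

```python
import torch
import torch.nn as nn

# A generator maps class semantic embeddings (plus noise) to visual
# features, so a classifier can be trained for unseen classes from
# generated features alone.
sem_dim, noise_dim, feat_dim = 300, 32, 256
G = nn.Sequential(nn.Linear(sem_dim + noise_dim, 512), nn.ReLU(),
                  nn.Linear(512, feat_dim))

def fake_features(class_embedding: torch.Tensor, n: int) -> torch.Tensor:
    z = torch.randn(n, noise_dim)
    s = class_embedding.expand(n, -1)         # repeat the class embedding
    return G(torch.cat([s, z], dim=1))        # features for an unseen class

unseen_feats = fake_features(torch.randn(sem_dim), n=64)   # (64, 256)
```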
This list is automatically generated from the titles and abstracts of the papers on this site.