Distilling Causal Effect from Miscellaneous Other-Class for Continual
Named Entity Recognition
- URL: http://arxiv.org/abs/2210.03980v1
- Date: Sat, 8 Oct 2022 09:37:06 GMT
- Title: Distilling Causal Effect from Miscellaneous Other-Class for Continual
Named Entity Recognition
- Authors: Junhao Zheng, Zhanxian Liang, Haibin Chen, Qianli Ma
- Abstract summary: Learning Other-Class in the same way as new entity types amplifies the catastrophic forgetting and leads to a substantial performance drop.
We propose a unified causal framework to retrieve the causality from both new entity types and Other-Class.
Experimental results on three benchmark datasets show that our method outperforms the state-of-the-art method by a large margin.
- Score: 23.25929285468311
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual Learning for Named Entity Recognition (CL-NER) aims to learn a
growing number of entity types over time from a stream of data. However, simply
learning Other-Class in the same way as new entity types amplifies the
catastrophic forgetting and leads to a substantial performance drop. The main
cause behind this is that Other-Class samples usually contain old entity types,
and the old knowledge in these Other-Class samples is not preserved properly.
Through causal inference, we identify that the forgetting is caused by
the missing causal effect from the old data. To this end, we propose a unified
causal framework to retrieve the causality from both new entity types and
Other-Class. Furthermore, we apply curriculum learning to mitigate the impact
of label noise and introduce a self-adaptive weight for balancing the causal
effects between new entity types and Other-Class. Experimental results on three
benchmark datasets show that our method outperforms the state-of-the-art method
by a large margin. Moreover, our method can be combined with existing
state-of-the-art methods to further improve performance in CL-NER.
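To make the abstract's three ingredients concrete, here is a minimal PyTorch-style sketch of one training step: a cross-entropy term for new entity types, a distillation term that preserves the old model's effect on Other-Class tokens, and a self-adaptive weight between the two. This is an illustration under stated assumptions, not the authors' implementation; in particular, the confidence-based choice of the weight is only one plausible scheme.

```python
import torch
import torch.nn.functional as F

def cl_ner_step(new_model, old_model, tokens, labels, other_class_id=0):
    # Assumption: new_model predicts old + new entity types, old_model only
    # the old ones, and old types keep their label indices across tasks.
    logits_new = new_model(tokens)                    # [batch, seq, n_all]
    with torch.no_grad():
        logits_old = old_model(tokens)                # [batch, seq, n_old]

    # Supervised loss on the current task's annotations.
    ce = F.cross_entropy(logits_new.flatten(0, 1), labels.flatten())

    # Other-Class tokens may conceal old entity types, so distill the old
    # model's predictions on them instead of trusting the "O" label.
    is_other = labels.flatten().eq(other_class_id)
    if not is_other.any():
        return ce
    n_old = logits_old.size(-1)
    log_p_new = F.log_softmax(logits_new.flatten(0, 1)[is_other, :n_old], dim=-1)
    p_old = F.softmax(logits_old.flatten(0, 1)[is_other], dim=-1)
    kd = F.kl_div(log_p_new, p_old, reduction="batchmean")

    # Self-adaptive weight (an assumed scheme, not the paper's formula):
    # trust distillation more when the old model is confident.
    alpha = p_old.max(dim=-1).values.mean().detach()
    return ce + alpha * kd
```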
Related papers
- Balancing the Causal Effects in Class-Incremental Learning [23.35478989162079]
Class-Incremental Learning (CIL) is a practical and challenging problem for achieving general artificial intelligence.
We show that the crux lies in the imbalanced causal effects between new and old data.
We propose Balancing the Causal Effects (BaCE) in CIL to alleviate this problem.
arXiv Detail & Related papers (2024-02-15T16:30:45Z)
- Understanding the Detrimental Class-level Effects of Data Augmentation [63.1733767714073]
Achieving optimal average accuracy comes at the cost of significantly hurting individual class accuracy, by as much as 20% on ImageNet.
We present a framework for understanding how DA interacts with class-level learning dynamics.
We show that simple class-conditional augmentation strategies improve performance on the negatively affected classes.
arXiv Detail & Related papers (2023-12-07T18:37:43Z)
- Continual Named Entity Recognition without Catastrophic Forgetting [37.316700599440935]
We introduce a pooled feature distillation loss that skillfully navigates the trade-off between retaining knowledge of old entity types and acquiring new ones.
We develop a confidence-based pseudo-labeling for the non-entity type.
We suggest an adaptive re-weighting type-balanced learning strategy to handle the issue of biased type distribution.
arXiv Detail & Related papers (2023-10-23T03:45:30Z)
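The confidence-based pseudo-labeling named in the entry above admits a short sketch: tokens annotated as the non-entity type are re-labeled with the old model's prediction whenever that prediction is confident. This is an illustrative reading of the entry, not the paper's code; the threshold tau and the helper names are assumptions.

```python
import torch
import torch.nn.functional as F

def pseudo_label_other(old_model, tokens, labels, other_id=0, tau=0.9):
    with torch.no_grad():
        probs = F.softmax(old_model(tokens), dim=-1)  # [batch, seq, n_old]
    conf, pred = probs.max(dim=-1)
    # Only overwrite "O" tokens, and only when the old model confidently
    # assigns them to a previously learned entity type.
    mask = labels.eq(other_id) & conf.ge(tau) & pred.ne(other_id)
    return torch.where(mask, pred, labels)
```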
- Class-Incremental Learning using Diffusion Model for Distillation and Replay [5.0977390531431634]
Class-incremental learning aims to learn new classes in an incremental fashion without forgetting the previously learned ones.
We propose the use of a pretrained Stable Diffusion model as a source of additional data for class-incremental learning.
arXiv Detail & Related papers (2023-06-30T11:23:49Z)
- Learning "O" Helps for Learning More: Handling the Concealed Entity Problem for Class-incremental NER [23.625741716498037]
"Unlabeled Entity Problem" leads to severe confusion between "O" and entities.
We propose an entity-aware contrastive learning method that adaptively detects entity clusters in "O".
We introduce a more realistic and challenging benchmark for class-incremental NER.
arXiv Detail & Related papers (2022-10-10T13:26:45Z)
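A hedged sketch of the entity-aware contrastive idea in the entry above: "O" tokens assigned to the same cluster act as positives, so entities concealed inside "O" are pulled into separable groups. The source of the cluster ids (k-means over features, or an old model's predictions) is an assumption here.

```python
import torch
import torch.nn.functional as F

def o_token_contrastive(feats, cluster_ids, temperature=0.1):
    # feats: [n, dim] representations of tokens labeled "O";
    # cluster_ids: [n] assumed cluster assignment for each token.
    z = F.normalize(feats, dim=-1)
    sim = z @ z.t() / temperature                     # pairwise similarity
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = cluster_ids.unsqueeze(0).eq(cluster_ids.unsqueeze(1)) & ~eye
    # Softmax over all other tokens; average log-likelihood of positives.
    logp = F.log_softmax(sim.masked_fill(eye, float("-inf")), dim=-1)
    per_anchor = logp.masked_fill(~pos, 0.0).sum(-1) / pos.sum(-1).clamp(min=1)
    return -per_anchor.mean()
```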
- Bridging Non Co-occurrence with Unlabeled In-the-wild Data for Incremental Object Detection [56.22467011292147]
Several incremental learning methods are proposed to mitigate catastrophic forgetting for object detection.
Despite their effectiveness, these methods require co-occurrence of the unlabeled base classes in the training data of the novel classes.
We propose the use of unlabeled in-the-wild data to bridge the non-co-occurrence caused by the missing base classes during the training of additional novel classes.
arXiv Detail & Related papers (2021-10-28T10:57:25Z)
- Neighborhood Contrastive Learning for Novel Class Discovery [79.14767688903028]
We build a new framework, named Neighborhood Contrastive Learning, to learn discriminative representations that are important to clustering performance.
We experimentally demonstrate that these two ingredients significantly contribute to clustering performance and lead our model to outperform state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-06-20T17:34:55Z)
- Distilling Causal Effect of Data in Class-Incremental Learning [109.680987556265]
We propose a causal framework to explain the catastrophic forgetting in Class-Incremental Learning (CIL).
We derive a novel distillation method that is orthogonal to the existing anti-forgetting techniques, such as data replay and feature/label distillation.
arXiv Detail & Related papers (2021-03-02T14:14:10Z)
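One way to picture distilling the causal effect of the absent old data, as in the entry above: the distillation target for a new sample is a similarity-weighted mixture of the old model's predictions on its K nearest stored exemplars. The hooks `features` and `classify` are hypothetical; this is an assumption-laden paraphrase, not the released implementation.

```python
import torch
import torch.nn.functional as F

def colliding_effect_targets(old_model, x, exemplars, k=3):
    # `features` and `classify` are hypothetical methods on the old model.
    with torch.no_grad():
        f_x = old_model.features(x)                   # [batch, dim]
        f_e = old_model.features(exemplars)           # [n_ex, dim]
        logits_e = old_model.classify(f_e)            # [n_ex, n_old]
    dist, idx = torch.cdist(f_x, f_e).topk(k, largest=False)
    w = F.softmax(-dist, dim=-1)                      # closer -> heavier
    # Similarity-weighted mixture of old predictions on the K neighbors.
    return (w.unsqueeze(-1) * logits_e[idx]).sum(dim=1)
```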
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
We propose a Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
- Class-incremental Learning with Rectified Feature-Graph Preservation [24.098892115785066]
A central theme of this paper is to learn new classes that arrive in sequential phases over time.
We propose a weighted-Euclidean regularization for old knowledge preservation.
We show how it can work with binary cross-entropy to increase class separation for effective learning of new classes.
arXiv Detail & Related papers (2020-12-15T07:26:04Z)
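The weighted-Euclidean regularization named in the entry above admits a very small sketch: feature drift between the old and new model is penalized per dimension, more heavily on dimensions the old classifier relies on. The weighting rule shown is illustrative, not the paper's exact definition.

```python
import torch

def weighted_euclidean_reg(feat_new, feat_old, weights):
    # feat_new, feat_old: [batch, dim]; weights: [dim], non-negative.
    return (weights * (feat_new - feat_old).pow(2)).sum(dim=-1).mean()

# One plausible weighting (an assumption for illustration): the magnitude
# of the old classifier's weights per feature dimension.
def importance_from_classifier(old_fc_weight):        # [n_classes, dim]
    return old_fc_weight.abs().mean(dim=0)
```

In the entry's setting, a term like this would be paired with a binary cross-entropy objective so new classes stay separable while old features stay put.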
- Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect [95.37587481952487]
Long-tailed classification is the key to deep learning at scale.
Existing methods are mainly based on re-weighting/re-sampling heuristics that lack a fundamental theory.
In this paper, we establish a causal inference framework, which not only unravels the whys of previous methods, but also derives a new principled solution.
arXiv Detail & Related papers (2020-09-28T00:32:11Z)
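The principled solution in this last entry can be compressed into a sketch: track a momentum (moving-average) feature direction during training, then subtract its contribution from the logits at test time as a counterfactual correction. Constants, shapes, and the exact normalization are assumptions here, not the paper's full method.

```python
import torch
import torch.nn.functional as F

class DebiasedClassifier(torch.nn.Module):
    def __init__(self, dim, n_classes, momentum=0.9, alpha=1.0):
        super().__init__()
        self.fc = torch.nn.Linear(dim, n_classes, bias=False)
        self.register_buffer("d", torch.zeros(dim))   # running feature mean
        self.momentum, self.alpha = momentum, alpha

    def forward(self, x):
        if self.training:
            with torch.no_grad():                     # update momentum stat
                self.d.mul_(self.momentum).add_(x.mean(0), alpha=1 - self.momentum)
            return self.fc(x)
        # Counterfactual at test time: remove the part of the logits
        # explained by the accumulated momentum direction (the "bad" effect).
        d_hat = F.normalize(self.d, dim=0)
        x_bad = (x @ d_hat).unsqueeze(-1) * d_hat     # projection on d_hat
        return self.fc(x) - self.alpha * self.fc(x_bad)
```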