Causal Interventions-based Few-Shot Named Entity Recognition
- URL: http://arxiv.org/abs/2305.01914v1
- Date: Wed, 3 May 2023 06:11:39 GMT
- Title: Causal Interventions-based Few-Shot Named Entity Recognition
- Authors: Zhen Yang, Yongbin Liu, Chunping Ouyang
- Abstract summary: Few-shot named entity recognition (NER) systems aim at recognizing new classes of entities based on a few labeled samples.
The heavy overfitting in few-shot learning is mainly caused by spurious correlations arising from selection bias in the few samples.
We propose a causal intervention-based few-shot NER method to alleviate the problem of spurious correlation.
- Score: 5.961427870758681
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot named entity recognition (NER) systems aim at recognizing new
classes of entities based on a few labeled samples. A significant challenge in
the few-shot regime is that models are more prone to overfitting than in tasks
with abundant samples. This heavy overfitting is mainly caused by spurious
correlations arising from selection bias in the few samples. To alleviate the
problem of spurious correlation in few-shot NER, we propose a
causal intervention-based few-shot NER method. Based on the prototypical
network, the method intervenes in the context and prototype via backdoor
adjustment during training. In particular, intervening in the context of the
one-shot scenario is very difficult, so we intervene in the prototype via
incremental learning, which can also avoid catastrophic forgetting. Our
experiments on different benchmarks show that our approach achieves new
state-of-the-art results (achieving up to 29% absolute improvement and 12% on
average for all tasks).
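The abstract builds on the prototypical network. As a minimal sketch of that base component only (not the authors' implementation, and with no causal intervention or backdoor adjustment), the following assumes token embeddings are already available as NumPy arrays. The function names `class_prototypes` and `classify` and the toy 2-D embeddings are hypothetical, chosen for illustration.

```python
import numpy as np


def class_prototypes(support_emb, support_labels):
    """Average the support embeddings of each class into one prototype."""
    labels = np.unique(support_labels)  # sorted unique class labels
    protos = np.stack(
        [support_emb[support_labels == c].mean(axis=0) for c in labels]
    )
    return labels, protos


def classify(query_emb, labels, protos):
    """Assign each query embedding the label of its nearest prototype."""
    # Squared Euclidean distance between every query and every prototype.
    diff = query_emb[:, None, :] - protos[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    return labels[dist.argmin(axis=1)]


# Toy 2-D "embeddings" (illustrative values, two support tokens per class).
support_emb = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]])
support_labels = np.array(["O", "O", "PER", "PER"])

labels, protos = class_prototypes(support_emb, support_labels)
query_emb = np.array([[0.1, 0.1], [0.9, 1.0]])
predictions = classify(query_emb, labels, protos)
print(predictions)
```

With only one or two support tokens per class, each prototype is an average over very few points, which is where the selection bias described in the abstract can enter.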
Related papers
- Towards Fast and Stable Federated Learning: Confronting Heterogeneity via Knowledge Anchor [18.696420390977863]
This paper systematically analyzes the forgetting degree of each class during local training across different communication rounds.
Motivated by these findings, we propose a novel and straightforward algorithm called Federated Knowledge Anchor (FedKA).
arXiv Detail & Related papers (2023-12-05T01:12:56Z)
- Fast Hierarchical Learning for Few-Shot Object Detection [57.024072600597464]
Transfer learning approaches have recently achieved promising results on the few-shot detection task.
These approaches suffer from the "catastrophic forgetting" issue due to fine-tuning of the base detector.
We tackle the aforementioned issues in this work.
arXiv Detail & Related papers (2022-10-10T20:31:19Z)
- Few-shot Forgery Detection via Guided Adversarial Interpolation [56.59499187594308]
Existing forgery detection methods suffer from significant performance drops when applied to unseen novel forgery approaches.
We propose Guided Adversarial Interpolation (GAI) to overcome the few-shot forgery detection problem.
Our method is validated to be robust to choices of majority and minority forgery approaches.
arXiv Detail & Related papers (2022-04-12T16:05:10Z)
- An Investigation of Replay-based Approaches for Continual Learning [79.0660895390689]
Continual learning (CL) is a major challenge of machine learning (ML) and describes the ability to learn several tasks sequentially without catastrophic forgetting (CF).
Several solution classes have been proposed, of which so-called replay-based approaches seem very promising due to their simplicity and robustness.
We empirically investigate replay-based approaches of continual learning and assess their potential for applications.
arXiv Detail & Related papers (2021-08-15T15:05:02Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
The Prototype-centered Attentive Learning (PAL) model is composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
- Multi-Scale Positive Sample Refinement for Few-Shot Object Detection [61.60255654558682]
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances.
We propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD.
MPSR generates multi-scale positive samples as object pyramids and refines the prediction at various scales.
arXiv Detail & Related papers (2020-07-18T09:48:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.