Behind the Scenes: An Exploration of Trigger Biases Problem in Few-Shot
Event Classification
- URL: http://arxiv.org/abs/2108.12844v1
- Date: Sun, 29 Aug 2021 13:46:42 GMT
- Title: Behind the Scenes: An Exploration of Trigger Biases Problem in Few-Shot
Event Classification
- Authors: Peiyi Wang, Runxin Xu, Tianyu Liu, Damai Dai, Baobao Chang, and
Zhifang Sui
- Abstract summary: Few-Shot Event Classification (FSEC) aims at developing a model for event prediction that can generalize to new event types with a limited amount of annotated data.
We find existing FSEC models suffer from trigger biases that signify the statistical homogeneity between some trigger words and target event types.
To cope with the context-bypassing problem in FSEC models, we introduce adversarial training and trigger reconstruction techniques.
- Score: 24.598938900747186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-Shot Event Classification (FSEC) aims at developing a model for event
prediction that can generalize to new event types with a limited amount of
annotated data. Existing FSEC studies have achieved high accuracy on different
benchmarks. However, we find they suffer from trigger biases that signify the
statistical homogeneity between some trigger words and target event types,
which we summarize as trigger overlapping and trigger separability. These biases
can result in a context-bypassing problem, i.e., correct classifications can be
obtained by looking only at the trigger words while ignoring the entire context.
As a result, existing models can be weak in generalizing to unseen data in real
scenarios. To further uncover the trigger biases and assess the generalization
ability of the models, we propose two new sampling methods, Trigger-Uniform
Sampling (TUS) and COnfusion Sampling (COS), for meta-task construction
during evaluation. Besides, to cope with the context-bypassing problem in FSEC
models, we introduce adversarial training and trigger reconstruction
techniques. Experiments show that these techniques not only improve
performance but also enhance the generalization ability of the models.
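The abstract does not spell out how Trigger-Uniform Sampling builds its evaluation episodes, but the idea of spreading an event type's instances across its trigger words can be made concrete. Below is a minimal sketch, assuming a simple list-of-dicts data format; the function name, round-robin strategy, and episode sizes are illustrative assumptions, not the authors' released implementation.

```python
# A minimal sketch of trigger-uniform episode construction for FSEC evaluation,
# based only on the abstract's description of Trigger-Uniform Sampling (TUS).
import random
from collections import defaultdict

def sample_tus_episode(examples, n_way=5, k_shot=5, n_query=5, seed=0):
    """examples: dicts like {"text": ..., "trigger": ..., "event_type": ...}."""
    rng = random.Random(seed)

    # Group instances by event type, and within each type by trigger word.
    by_type = defaultdict(lambda: defaultdict(list))
    for ex in examples:
        by_type[ex["event_type"]][ex["trigger"]].append(ex)

    event_types = rng.sample(sorted(by_type), n_way)
    support, query = [], []
    for etype in event_types:
        triggers = sorted(by_type[etype])
        rng.shuffle(triggers)
        picked = []
        # Draw round-robin over triggers so no single frequent trigger
        # dominates this event type's support/query instances.
        while len(picked) < k_shot + n_query:
            progressed = False
            for trig in triggers:
                pool = by_type[etype][trig]
                if pool and len(picked) < k_shot + n_query:
                    picked.append(pool.pop(rng.randrange(len(pool))))
                    progressed = True
            if not progressed:  # not enough annotated instances for this type
                break
        rng.shuffle(picked)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:k_shot + n_query])
    return support, query
```

Because instances are drawn trigger-by-trigger, such episodes no longer reward the trigger-label shortcut, which is exactly the context-bypassing behavior TUS is meant to expose during evaluation.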
Related papers
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- MsPrompt: Multi-step Prompt Learning for Debiasing Few-shot Event Detection [16.98619925632727]
Event detection (ED) aims to identify the key trigger words in unstructured text and predict the event types accordingly.
Traditional ED models are too data-hungry to accommodate real applications with scarce labeled data.
We propose a multi-step prompt learning model (MsPrompt) for debiasing few-shot event detection.
arXiv Detail & Related papers (2023-05-16T10:19:12Z)
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose a lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events.
In practice, the next-event prediction models are trained with sequential data collected at one time.
We propose a framework with hierarchical branching structures for learning context-specific representations.
arXiv Detail & Related papers (2022-10-24T07:54:13Z)
- HCL-TAT: A Hybrid Contrastive Learning Method for Few-shot Event Detection with Task-Adaptive Threshold [18.165302114575212]
In this paper, we propose a novel Hybrid Contrastive Learning method with a Task-Adaptive Threshold (abbreviated as HCLTAT), which enables discriminative representation learning with a two-view contrastive loss.
Experiments on the benchmark dataset FewEvent demonstrate the superiority of our method to achieve better results compared to the state-of-the-arts.
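For readers unfamiliar with two-view contrastive objectives, the following is a generic supervised contrastive loss over two views of the same support instances, given only to make the summary concrete. Producing the two views (e.g., via two dropout passes) and this exact loss form are assumptions; the paper's hybrid loss and task-adaptive threshold are not reproduced here.

```python
# A generic two-view supervised contrastive loss (SupCon-style sketch).
import torch
import torch.nn.functional as F

def two_view_supcon_loss(z1, z2, labels, temperature=0.1):
    """z1, z2: [N, d] embeddings of the same N instances under two views."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=-1)       # [2N, d]
    y = torch.cat([labels, labels], dim=0)                    # [2N]
    sim = z @ z.t() / temperature                             # [2N, 2N]
    self_mask = torch.eye(len(y), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))           # drop self-pairs
    # Positives: other embeddings carrying the same label (incl. the other view).
    pos_mask = (y.unsqueeze(0) == y.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts
    return loss[pos_mask.any(dim=1)].mean()
```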
arXiv Detail & Related papers (2022-10-17T07:37:38Z)
- CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
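The summary only names the bound family, so the sketch below uses a Monte Carlo estimate with a one-sided Hoeffding bound as a simpler stand-in for the paper's Chernoff-Cramer certificates; the function and its arguments are illustrative assumptions.

```python
# Estimate how often a classifier's prediction survives random semantic
# transformations and report a high-confidence upper bound on the failure
# probability (Hoeffding here, in place of the paper's Chernoff-Cramer bounds).
import math

def certify_failure_bound(predict, transform_sampler, x, n_samples=1000, delta=0.001):
    """predict(x) -> label; transform_sampler(x) -> randomly transformed input."""
    base = predict(x)
    failures = sum(predict(transform_sampler(x)) != base for _ in range(n_samples))
    p_hat = failures / n_samples
    # With probability >= 1 - delta over the samples, the true failure
    # probability is at most p_hat + sqrt(ln(1/delta) / (2n)).
    return p_hat + math.sqrt(math.log(1.0 / delta) / (2 * n_samples))
```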
arXiv Detail & Related papers (2021-09-22T12:46:04Z)
- Counterfactual Adversarial Learning with Representation Interpolation [11.843735677432166]
We introduce a Counterfactual Adversarial Training (CAT) framework to tackle the problem from a causality perspective.
Experiments demonstrate that CAT achieves substantial performance improvement over SOTA across different downstream tasks.
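As a rough illustration of what "representation interpolation" can look like, here is a manifold-mixup-style loss that interpolates hidden representations of paired examples. The pairing with counterfactual examples, the Beta distribution, and all names are assumptions rather than the paper's CAT procedure.

```python
# Interpolate hidden representations of a batch and its (assumed) counterfactual
# counterparts, and mix the targets with the same coefficient.
import torch
import torch.nn.functional as F

def interpolated_loss(encoder, classifier, x, x_cf, y, y_cf, alpha=0.4):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    h = encoder(x)            # [B, d] hidden representations
    h_cf = encoder(x_cf)
    h_mix = lam * h + (1.0 - lam) * h_cf
    logits = classifier(h_mix)
    return lam * F.cross_entropy(logits, y) + (1.0 - lam) * F.cross_entropy(logits, y_cf)
```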
arXiv Detail & Related papers (2021-09-10T09:23:08Z)
- Few-Shot Event Detection with Prototypical Amortized Conditional Random Field [8.782210889586837]
Event Detection tends to struggle when it needs to recognize novel event types with a few samples.
We present a novel unified joint model which converts the task to a few-shot tagging problem with a double-part tagging scheme.
We conduct experiments on the benchmark dataset FewEvent and the experimental results show that the tagging based methods are better than existing pipeline and joint learning methods.
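The double-part tagging scheme can be illustrated with a small helper that combines a position part (B/I/O) with an event-type part; the exact tag inventory used by the paper is an assumption here.

```python
# Convert a sentence with an annotated trigger span into double-part tags,
# where each tag pairs a position part (B/I/O) with an event-type part.
def to_double_part_tags(tokens, trigger_start, trigger_end, event_type):
    """Tag tokens[trigger_start:trigger_end] as the trigger of `event_type`."""
    tags = ["O"] * len(tokens)
    for i in range(trigger_start, trigger_end):
        position_part = "B" if i == trigger_start else "I"
        tags[i] = f"{position_part}-{event_type}"   # e.g. "B-Attack", "I-Attack"
    return tags

# Example: to_double_part_tags(["Troops", "stormed", "the", "city"], 1, 2, "Attack")
# -> ["O", "B-Attack", "O", "O"]
```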
arXiv Detail & Related papers (2020-12-04T01:11:13Z)
- Asymptotic Behavior of Adversarial Training in Binary Classification [41.7567932118769]
Adversarial training is considered to be the state-of-the-art method for defense against adversarial attacks.
Despite being successful in practice, several problems in understanding performance of adversarial training remain open.
We derive precise theoretical predictions for the minimization of the adversarial training objective in binary classification.
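For context, the objective being minimized in adversarial training for binary classification is usually written as the min-max problem below; for a linear predictor the inner maximization has a well-known closed form, with $\ell$ a non-increasing margin loss (e.g., logistic) and $\|\cdot\|_q$ the dual norm of $\|\cdot\|_p$. This is the standard formulation, not necessarily the exact setting analyzed in the paper.

```latex
% Adversarial training with labels y in {-1,+1}; second expression gives the
% closed-form inner maximization for a linear predictor f(x) = \theta^\top x.
\min_{\theta}\ \mathbb{E}_{(x,y)}\Big[\max_{\|\delta\|_p \le \varepsilon}
    \ell\big(y\,\theta^{\top}(x+\delta)\big)\Big],
\qquad
\max_{\|\delta\|_p \le \varepsilon} \ell\big(y\,\theta^{\top}(x+\delta)\big)
    = \ell\big(y\,\theta^{\top}x - \varepsilon\,\|\theta\|_q\big).
```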
arXiv Detail & Related papers (2020-10-26T01:44:20Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
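A bare-bones version of confidence-weighted transductive prototype refinement is sketched below; the confidence here is simply a softmax over negative distances, whereas the paper meta-learns it, and the function names and temperature are assumptions.

```python
# Refine class prototypes with unlabeled queries, weighting each query by a
# per-query, per-class confidence (softmax over negative distances).
import torch

def refine_prototypes(prototypes, support, support_labels, queries, temperature=1.0):
    """prototypes: [C, d]; support: [S, d]; support_labels: [S]; queries: [Q, d]."""
    dists = torch.cdist(queries, prototypes)              # [Q, C]
    conf = torch.softmax(-dists / temperature, dim=1)     # [Q, C] soft assignments
    new_protos = []
    for c in range(prototypes.size(0)):
        sup_c = support[support_labels == c]              # labeled examples of class c
        w = conf[:, c].unsqueeze(1)                       # query weights for class c
        num = sup_c.sum(dim=0) + (w * queries).sum(dim=0)
        den = sup_c.size(0) + w.sum()
        new_protos.append(num / den)
    return torch.stack(new_protos)
```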
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.