ZJUKLAB at SemEval-2021 Task 4: Negative Augmentation with Language
Model for Reading Comprehension of Abstract Meaning
- URL: http://arxiv.org/abs/2102.12828v1
- Date: Thu, 25 Feb 2021 13:03:05 GMT
- Title: ZJUKLAB at SemEval-2021 Task 4: Negative Augmentation with Language
Model for Reading Comprehension of Abstract Meaning
- Authors: Xin Xie, Xiangnan Chen, Xiang Chen, Yong Wang, Ningyu Zhang, Shumin
Deng, Huajun Chen
- Abstract summary: We explain the algorithms used to learn our models and the process of tuning the algorithms and selecting the best model.
Inspired by the similarity between the ReCAM task and language pre-training, we propose a simple yet effective technique, namely, negative augmentation with a language model.
Our models rank 4th on the official test sets of both Subtask 1 and Subtask 2, with accuracies of 87.9% and 92.8%, respectively.
- Score: 16.151203366447962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents our systems for the three Subtasks of SemEval Task4:
Reading Comprehension of Abstract Meaning (ReCAM). We explain the algorithms
used to learn our models and the process of tuning the algorithms and selecting
the best model. Inspired by the similarity between the ReCAM task and language
pre-training, we propose a simple yet effective technique, namely, negative
augmentation with a language model. Evaluation results demonstrate the
effectiveness of our proposed approach. Our models achieve the 4th rank on the
official test sets of both Subtask 1 and Subtask 2, with accuracies of 87.9%
and 92.8%, respectively. We further conduct a comprehensive model analysis and
observe interesting error cases, which may inform future research.
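To make the idea concrete, below is a minimal sketch of how negative augmentation with a language model could be realized for a ReCAM-style cloze question: an off-the-shelf masked language model proposes plausible but incorrect words at the @placeholder position, which could then be added as extra negative options during training. The checkpoint (roberta-base), the number of proposed negatives, and the helper name are illustrative assumptions, not details taken from the paper.

    # Hedged sketch: one way "negative augmentation with a language model" could
    # look for a ReCAM-style cloze question. Checkpoint and option counts are
    # illustrative assumptions, not the authors' exact setup.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModelForMaskedLM.from_pretrained("roberta-base")
    model.eval()

    def propose_negative_options(question, gold, options, k=5):
        # Replace the ReCAM placeholder with the model's mask token.
        text = question.replace("@placeholder", tokenizer.mask_token)
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        # Position of the mask token in the input sequence.
        mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0][0]
        top_ids = logits[0, mask_pos].topk(k + len(options)).indices.tolist()
        candidates = [tokenizer.decode([i]).strip() for i in top_ids]
        # Keep only predictions that are neither the gold answer nor an existing option.
        negatives = [c for c in candidates
                     if c.lower() != gold.lower() and c not in options]
        return negatives[:k]

    # Example: augment a 5-option question with LM-proposed distractors.
    extra = propose_negative_options(
        "The scientists were surprised by the @placeholder of the results.",
        gold="significance",
        options=["significance", "color", "weight", "speed", "height"],
    )
    print(extra)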
Related papers
- The Surprising Effectiveness of Test-Time Training for Abstract Reasoning [64.36534512742736]
We investigate the effectiveness of test-time training (TTT) as a mechanism for improving models' reasoning capabilities.
TTT significantly improves performance on ARC tasks, achieving up to 6x improvement in accuracy compared to base fine-tuned models.
Our findings suggest that explicit symbolic search is not the only path to improved abstract reasoning in neural language models.
arXiv Detail & Related papers (2024-11-11T18:59:45Z) - Large Language Models in the Workplace: A Case Study on Prompt
Engineering for Job Type Classification [58.720142291102135]
This case study investigates the task of job classification in a real-world setting.
The goal is to determine whether an English-language job posting is appropriate for a graduate or entry-level position.
arXiv Detail & Related papers (2023-03-13T14:09:53Z) - Toward Efficient Language Model Pretraining and Downstream Adaptation
via Self-Evolution: A Case Study on SuperGLUE [203.65227947509933]
This report describes our JDExplore d-team's Vega v2 submission on the SuperGLUE leaderboard.
SuperGLUE is more challenging than the widely used general language understanding evaluation (GLUE) benchmark, containing eight difficult language understanding tasks.
arXiv Detail & Related papers (2022-12-04T15:36:18Z) - Effective Cross-Task Transfer Learning for Explainable Natural Language
Inference with T5 [50.574918785575655]
We compare sequential fine-tuning with a multi-task learning model in the context of boosting performance on two tasks.
Our results show that while sequential multi-task learning can be tuned to be good at the first of two target tasks, it performs less well on the second and additionally struggles with overfitting.
arXiv Detail & Related papers (2022-10-31T13:26:08Z) - A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margin in both few-shot and full-shot settings.
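As a rough illustration of this generation-based reformulation, the sketch below linearizes a review into a prompt and lets a unidirectional language model (GPT-2 here, as a stand-in) generate the aspect/category/polarity fields; the prompt template and checkpoint are assumptions for illustration, not the authors' setup.

    # Hedged sketch: casting aspect-based sentiment analysis as text generation
    # with a unidirectional LM. Template and checkpoint are assumptions; a real
    # system would fine-tune the model to emit the structured target.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def linearize(review):
        # A fine-tuned model would continue this prompt with something like
        # "aspect: battery life | category: laptop#battery | polarity: positive".
        return f"review: {review}\nanalysis:"

    prompt = linearize("The battery life is great but the keyboard feels cheap.")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False,
                             pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))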
arXiv Detail & Related papers (2022-04-11T18:31:53Z) - Few-shot Learning with Multilingual Language Models [66.49496434282564]
We train multilingual autoregressive language models on a balanced corpus covering a diverse set of languages.
Our largest model sets new state of the art in few-shot learning in more than 20 representative languages.
We present a detailed analysis of where the model succeeds and fails, showing in particular that it enables cross-lingual in-context learning.
arXiv Detail & Related papers (2021-12-20T16:52:35Z) - ReCAM@IITK at SemEval-2021 Task 4: BERT and ALBERT based Ensemble for
Abstract Word Prediction [2.482368922343792]
We fine-tuned pre-trained masked language models, namely BERT and ALBERT.
We tried multiple approaches and found that a Masked Language Modeling (MLM) based approach works best.
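A minimal sketch of such an MLM-based approach, under simplifying assumptions (a generic BERT checkpoint and scoring only the first sub-token of each option): each candidate is scored by the masked LM's log-probability at the @placeholder position, and the highest-scoring option wins.

    # Hedged sketch of MLM-based option scoring for a fill-in-the-blank question.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    def best_option(question, options):
        text = question.replace("@placeholder", tokenizer.mask_token)
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            log_probs = model(**inputs).logits.log_softmax(dim=-1)
        mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0][0]
        scores = []
        for opt in options:
            ids = tokenizer(opt, add_special_tokens=False)["input_ids"]
            # Simplification: score only the option's first sub-token.
            scores.append(log_probs[0, mask_pos, ids[0]].item())
        return options[max(range(len(options)), key=lambda i: scores[i])]

    print(best_option("The committee praised the @placeholder of the proposal.",
                      ["clarity", "table", "window", "river", "shoe"]))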
arXiv Detail & Related papers (2021-04-04T08:22:19Z) - LRG at SemEval-2021 Task 4: Improving Reading Comprehension with
Abstract Words using Augmentation, Linguistic Features and Voting [0.6850683267295249]
Given a fill-in-the-blank-type question, the task is to predict the most suitable word from a list of 5 options.
We use encoders of transformers-based models pre-trained on the masked language modelling (MLM) task to build our Fill-in-the-blank (FitB) models.
We propose variants, namely Chunk Voting and Max Context, to handle the input length restrictions of BERT and similar models.
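The sketch below shows one way a Chunk Voting scheme could work around the 512-token input limit; the chunk length, stride, and callback interface are assumptions, and the per-chunk scorer is left abstract rather than tied to the paper's models.

    # Hedged sketch of chunk-and-vote inference for long passages.
    from collections import Counter

    def chunk_vote(passage_tokens, question, options, score_fn,
                   chunk_len=400, stride=200):
        """score_fn(passage_chunk, question, options) -> chosen option string."""
        votes = Counter()
        for start in range(0, max(len(passage_tokens) - chunk_len, 0) + 1, stride):
            chunk = " ".join(passage_tokens[start:start + chunk_len])
            # Each chunk casts one vote for its preferred option.
            votes[score_fn(chunk, question, options)] += 1
        return votes.most_common(1)[0][0]

    # Toy usage with a dummy scorer that always picks the first option.
    dummy = lambda chunk, q, opts: opts[0]
    passage = ("a long passage " * 300).split()
    print(chunk_vote(passage, "The @placeholder was clear.", ["idea", "sky"], dummy))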
arXiv Detail & Related papers (2021-02-24T12:33:12Z) - QiaoNing at SemEval-2020 Task 4: Commonsense Validation and Explanation
system based on ensemble of language model [2.728575246952532]
In this paper, we present the language model system submitted to the SemEval-2020 Task 4 competition: "Commonsense Validation and Explanation".
We implemented transfer learning using pretrained language models (BERT, XLNet, RoBERTa, and ALBERT) and fine-tuned them on this task.
The ensembled model solves this problem better, reaching an accuracy of 95.9% on subtask A, only 3% below human accuracy.
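For illustration, here is a minimal sketch of probability-averaging ensembling over several transformer classifiers; the checkpoints shown are un-fine-tuned base models used as placeholders (their classification heads are randomly initialized), whereas the actual system fine-tunes each model on the task before ensembling.

    # Hedged sketch of ensembling by averaging per-model probabilities.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    CHECKPOINTS = ["bert-base-uncased", "roberta-base"]  # e.g. also XLNet, ALBERT

    def ensemble_predict(sent_a, sent_b):
        """Subtask A style: return 0 or 1 for the statement the ensemble flags
        as violating common sense (heads here are untrained placeholders)."""
        probs = []
        for name in CHECKPOINTS:
            tok = AutoTokenizer.from_pretrained(name)
            mdl = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
            mdl.eval()
            inputs = tok(sent_a, sent_b, return_tensors="pt", truncation=True)
            with torch.no_grad():
                probs.append(mdl(**inputs).logits.softmax(dim=-1))
        avg = torch.stack(probs).mean(dim=0)
        return int(avg.argmax(dim=-1))

    print(ensemble_predict("He put the turkey in the oven.",
                           "He put the oven in the turkey."))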
arXiv Detail & Related papers (2020-09-06T05:12:50Z) - BUT-FIT at SemEval-2020 Task 4: Multilingual commonsense [1.433758865948252]
This paper describes the work of the BUT-FIT team at SemEval-2020 Task 4 - Commonsense Validation and Explanation.
In subtasks A and B, our submissions are based on pretrained language representation models (namely ALBERT) and data augmentation.
We experimented with solving the task for another language, Czech, by means of multilingual models and a machine-translated dataset.
We show that with a strong machine translation system, our system can be used in another language with a small accuracy loss.
arXiv Detail & Related papers (2020-08-17T12:45:39Z) - KaLM at SemEval-2020 Task 4: Knowledge-aware Language Models for
Comprehension And Generation [4.94950858749529]
We propose a novel way to search for evidence and choose different large-scale pre-trained models as the backbones for the three subtasks.
The results show that our evidence-searching approach improves model performance on the commonsense explanation task.
arXiv Detail & Related papers (2020-05-24T15:09:21Z)