LMVE at SemEval-2020 Task 4: Commonsense Validation and Explanation
using Pretraining Language Model
- URL: http://arxiv.org/abs/2007.02540v1
- Date: Mon, 6 Jul 2020 05:51:10 GMT
- Title: LMVE at SemEval-2020 Task 4: Commonsense Validation and Explanation
using Pretraining Language Model
- Authors: Shilei Liu, Yu Guo, Bochao Li and Feiliang Ren
- Abstract summary: This paper describes our submission to subtasks A and B of SemEval-2020 Task 4.
For subtask A, we use an ALBERT-based model with an improved input form to pick out the commonsense statement from two candidate statements.
For subtask B, we use a multiple-choice model enhanced by a hint sentence mechanism to select, from the given options, the reason why a statement is against common sense.
- Score: 5.428461405329692
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes our submission to subtasks A and B of SemEval-2020
Task 4. For subtask A, we use an ALBERT-based model with an improved input form to
pick out the commonsense statement from two candidate statements. For subtask B, we
use a multiple-choice model enhanced by a hint sentence mechanism to select, from
the given options, the reason why a statement is against common sense. In addition,
we propose a novel transfer learning strategy between the subtasks, which helps
improve performance. The accuracy scores of our system are 95.6 / 94.9 on the
official test sets, ranking 7$^{th}$ / 2$^{nd}$ on the Post-Evaluation leaderboard.
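As a concrete illustration of the subtask A setup, here is a minimal sketch that scores the two candidate statements with ALBERT as a two-way multiple choice. The plain input form, checkpoint name, and example sentences are assumptions; the paper's "improved input form" and fine-tuned weights are not reproduced here.

```python
# Minimal sketch: subtask A as two-way multiple choice with ALBERT.
# Assumes an off-the-shelf checkpoint; in the paper the model is
# fine-tuned and the input form is modified.
import torch
from transformers import AutoTokenizer, AutoModelForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
model = AutoModelForMultipleChoice.from_pretrained("albert-base-v2")

def pick_sensible(statement_a: str, statement_b: str) -> int:
    """Return 0 if statement_a looks like the commonsense one, else 1."""
    enc = tokenizer([statement_a, statement_b],
                    return_tensors="pt", padding=True)
    # The multiple-choice head expects (batch, num_choices, seq_len).
    inputs = {k: v.unsqueeze(0) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**inputs).logits  # shape (1, 2)
    return int(logits.argmax(dim=-1))

print(pick_sensible("He put a turkey into the fridge.",
                    "He put an elephant into the fridge."))
```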
Related papers
- Prompt Algebra for Task Composition [131.97623832435812]
We consider Visual Language Models with prompt tuning as our base classifier.
We propose constrained prompt tuning to improve the performance of the composite classifier.
On UTZappos it improves classification accuracy over the best base model by 8.45% on average.
arXiv Detail & Related papers (2023-06-01T03:20:54Z)
- Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE [203.65227947509933]
This report describes our JDExplore d-team's Vega v2 submission on the SuperGLUE leaderboard.
SuperGLUE is more challenging than the widely used general language understanding evaluation (GLUE) benchmark, containing eight difficult language understanding tasks.
arXiv Detail & Related papers (2022-12-04T15:36:18Z)
- Findings of the WMT 2022 Shared Task on Translation Suggestion [63.457874930232926]
We report the result of the first edition of the WMT shared task on Translation Suggestion.
The task aims to provide alternatives for specific words or phrases, given entire documents generated by machine translation (MT).
It consists of two sub-tasks: naive translation suggestion and translation suggestion with hints.
arXiv Detail & Related papers (2022-11-30T03:48:36Z)
- ISCAS at SemEval-2020 Task 5: Pre-trained Transformers for Counterfactual Statement Modeling [48.3669727720486]
ISCAS participated in two subtasks of SemEval 2020 Task 5: detecting counterfactual statements and detecting antecedent and consequence.
This paper describes our system, which is based on pre-trained transformers.
arXiv Detail & Related papers (2020-09-17T09:28:07Z)
- QiaoNing at SemEval-2020 Task 4: Commonsense Validation and Explanation system based on ensemble of language model [2.728575246952532]
In this paper, we present a language model system submitted to the SemEval-2020 Task 4 competition: "Commonsense Validation and Explanation".
We applied transfer learning with pretrained language models (BERT, XLNet, RoBERTa, and ALBERT), fine-tuning them on this task.
The ensembled model solves this problem better, reaching an accuracy of 95.9% on subtask A, only 3% below human accuracy.
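A minimal sketch of one plausible ensembling scheme under these assumptions: each checkpoint name stands in for a model already fine-tuned on the task, and predictions are combined by averaging class probabilities (the combination rule here is an assumption, not the paper's stated recipe).

```python
# Hedged sketch: probability-averaging ensemble over several LMs.
# Each name below stands in for a checkpoint fine-tuned on the task;
# off-the-shelf classification heads would be randomly initialized.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINTS = ["bert-base-uncased", "xlnet-base-cased",
               "roberta-base", "albert-base-v2"]

def ensemble_predict(text: str) -> int:
    probs = []
    for name in CHECKPOINTS:
        tok = AutoTokenizer.from_pretrained(name)
        mdl = AutoModelForSequenceClassification.from_pretrained(name)
        with torch.no_grad():
            logits = mdl(**tok(text, return_tensors="pt")).logits
        probs.append(logits.softmax(dim=-1))
    # Average the per-model class probabilities, then take the argmax.
    return int(torch.stack(probs).mean(dim=0).argmax(dim=-1))
```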
arXiv Detail & Related papers (2020-09-06T05:12:50Z)
- CS-NET at SemEval-2020 Task 4: Siamese BERT for ComVE [2.0491741153610334]
This paper describes a system for distinguishing between statements that conform to common sense and those that do not.
We use a parallel instance of transformers (a Siamese arrangement), which is responsible for a boost in performance.
We achieved an accuracy of 94.8% in subtask A and 89% in subtask B on the test set.
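A hedged sketch of such a parallel arrangement: one shared BERT encoder is applied to both statements, and a linear head compares the two [CLS] vectors. The pooling and head are illustrative assumptions, not the paper's exact design.

```python
# Sketch of a Siamese setup: a shared BERT encoder runs over two
# statements in parallel; a linear head compares their [CLS] vectors.
import torch
import torch.nn as nn
from transformers import AutoModel

class SiameseBert(nn.Module):
    def __init__(self, name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(name)  # shared weights
        self.head = nn.Linear(self.encoder.config.hidden_size * 2, 2)

    def forward(self, enc_a: dict, enc_b: dict) -> torch.Tensor:
        cls_a = self.encoder(**enc_a).last_hidden_state[:, 0]
        cls_b = self.encoder(**enc_b).last_hidden_state[:, 0]
        return self.head(torch.cat([cls_a, cls_b], dim=-1))
```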
arXiv Detail & Related papers (2020-07-21T14:08:02Z)
- IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE [13.334749848189826]
We formalize the subtasks into the multiple-choice question answering format and construct the input with the prompt templates.
Experimental results show that our approaches achieve a significant performance improvement over the baseline systems.
Our approaches secure third rank on the official test sets of the first two subtasks, with accuracies of 96.4 and 94.3, respectively.
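In this spirit, a small sketch of prompt-template construction for subtask B; the template wording below is illustrative, not the paper's exact prompt.

```python
# Hypothetical prompt template turning subtask B into multiple choice:
# each option is appended to a natural-language prompt about the
# nonsensical statement, then scored by a pretrained LM.
def build_choice_inputs(nonsense: str, options: list[str]) -> list[str]:
    prompt = f"{nonsense} This is against common sense because"
    return [f"{prompt} {opt}" for opt in options]

inputs = build_choice_inputs(
    "He put an elephant into the fridge.",
    ["an elephant is much bigger than a fridge.",
     "elephants are grey.",
     "fridges are cold."])
# The highest-scoring completed string is predicted as the reason.
```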
arXiv Detail & Related papers (2020-07-02T06:59:53Z)
- CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and Prediction with Multi-task Learning [22.534520584497503]
This paper describes our system submitted to Task 4 of SemEval-2020: Commonsense Validation and Explanation (ComVE).
The task is to validate whether a given sentence makes sense and to require the model to explain why.
Based on the BERT architecture with a multi-task setting, we propose an effective and interpretable "Explain, Reason and Predict" (ERP) system.
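A hedged sketch of a multi-task layout in this spirit: one shared BERT encoder with separate heads for validation and reason scoring. The head shapes are assumptions, and the ERP system's explanation generation component is omitted.

```python
# Sketch: shared encoder, per-task heads for a multi-task setting.
import torch.nn as nn
from transformers import AutoModel

class MultiTaskComVE(nn.Module):
    def __init__(self, name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(name)
        hidden = self.encoder.config.hidden_size
        self.validate_head = nn.Linear(hidden, 2)  # sense vs. nonsense
        self.reason_head = nn.Linear(hidden, 1)    # score per option

    def forward(self, enc: dict, task: str):
        cls = self.encoder(**enc).last_hidden_state[:, 0]
        head = self.validate_head if task == "validate" else self.reason_head
        return head(cls)
```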
arXiv Detail & Related papers (2020-06-12T13:51:12Z)
- L2R$^2$: Leveraging Ranking for Abductive Reasoning [65.40375542988416]
The abductive natural language inference task ($\alpha$NLI) is proposed to evaluate the abductive reasoning ability of a learning system.
A novel L2R$^2$ approach is proposed under the learning-to-rank framework.
Experiments on the ART dataset reach the state of the art on the public leaderboard.
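For intuition, the simplest learning-to-rank ingredient is a pairwise margin loss over hypothesis scores; L2R$^2$ explores pairwise and listwise losses more systematically, so the snippet below is only an assumed stand-in.

```python
# Toy pairwise ranking step: push the score of a more plausible
# hypothesis above a less plausible one by a fixed margin.
import torch
import torch.nn as nn

scores_pos = torch.tensor([2.1, 0.7], requires_grad=True)  # plausible
scores_neg = torch.tensor([1.3, 0.9], requires_grad=True)  # implausible
loss = nn.MarginRankingLoss(margin=1.0)(
    scores_pos, scores_neg, torch.ones_like(scores_pos))
loss.backward()  # gradients would flow back into the scoring model
```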
arXiv Detail & Related papers (2020-05-22T15:01:23Z)
- CS-NLP team at SemEval-2020 Task 4: Evaluation of State-of-the-art NLP Deep Learning Architectures on Commonsense Reasoning Task [3.058685580689605]
We describe our attempt at the SemEval-2020 Task 4 competition: the Commonsense Validation and Explanation (ComVE) challenge.
Our system uses prepared labeled textual datasets that were manually curated for three different natural language inference subtasks.
For the second subtask, which is to select the reason why a statement does not make sense, we rank within the first six teams (93.7%) among 27 participants, with very competitive results.
arXiv Detail & Related papers (2020-05-17T13:20:10Z)
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
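A rough sketch of such full-text scoring under masked-LM assumptions: the premise and a candidate are concatenated and ranked by pseudo-log-likelihood, masking one token at a time. The checkpoint and exact scoring span are assumptions, not the paper's exact recipe.

```python
# Hedged sketch: rank candidates by masked-LM pseudo-log-likelihood
# of the full premise+candidate text.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("roberta-base")
mlm = AutoModelForMaskedLM.from_pretrained("roberta-base")

def pseudo_log_likelihood(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    total = 0.0
    for i in range(1, ids.size(1) - 1):   # skip <s> and </s> tokens
        masked = ids.clone()
        masked[0, i] = tok.mask_token_id
        with torch.no_grad():
            logits = mlm(input_ids=masked).logits
        total += float(logits[0, i].log_softmax(-1)[ids[0, i]])
    return total

premise = "He poured orange juice on his cereal."
options = ["He is a strange person.", "He was out of milk."]
best = max(options, key=lambda o: pseudo_log_likelihood(f"{premise} {o}"))
```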
arXiv Detail & Related papers (2020-04-29T10:54:40Z)