CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and
Prediction with Multi-task Learning
- URL: http://arxiv.org/abs/2006.09161v2
- Date: Tue, 28 Jul 2020 00:34:47 GMT
- Title: CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and
Prediction with Multi-task Learning
- Authors: Hongru Wang and Xiangru Tang and Sunny Lai and Kwong Sak Leung and Jia
Zhu and Gabriel Pui Cheong Fung and Kam-Fai Wong
- Abstract summary: This paper describes our system submitted to task 4 of SemEval 2020: Commonsense Validation and Explanation (ComVE).
The task is to directly validate whether a given sentence makes sense and to require the model to explain it.
Based on the BERT architecture with a multi-task setting, we propose an effective and interpretable "Explain, Reason and Predict" (ERP) system.
- Score: 22.534520584497503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes our system submitted to task 4 of SemEval 2020:
Commonsense Validation and Explanation (ComVE), which consists of three
sub-tasks. The task is to directly validate whether a given sentence makes
sense and to require the model to explain it. Based on the BERT architecture
with a multi-task setting, we propose an effective and interpretable "Explain,
Reason and Predict" (ERP) system to solve the three sub-tasks about
commonsense: (a) Validation, (b) Reasoning, and (c) Explanation. Inspired by
cognitive studies of common sense, our system first generates a reason or
understanding of the sentences and then chooses which statement makes sense;
this is achieved by multi-task learning. During the post-evaluation, our
system reached 92.9% accuracy in subtask A (rank 11), 89.7% accuracy in
subtask B (rank 9), and a BLEU score of 12.9 in subtask C (rank 8).
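To make the multi-task setting concrete, below is a minimal sketch of a shared BERT encoder with a "reason" head and a "predict" head. The head names, loss weighting, and decoding scheme are our assumptions for illustration, not the authors' exact architecture.

```python
# Minimal sketch: one shared BERT encoder, two task heads (assumed names).
import torch
import torch.nn as nn
from transformers import BertModel

class ERPSketch(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)  # shared by all sub-tasks
        hidden = self.encoder.config.hidden_size
        # "Reason/Explain" head: token-level vocabulary logits, trained to
        # produce an explanation of the statement.
        self.reason_head = nn.Linear(hidden, self.encoder.config.vocab_size)
        # "Predict" head: a plausibility score per statement; the two
        # candidates' scores are compared to pick the sensible one.
        self.predict_head = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        reason_logits = self.reason_head(out.last_hidden_state)  # (B, T, vocab)
        sense_score = self.predict_head(out.pooler_output)       # (B, 1)
        return reason_logits, sense_score

# Multi-task training would combine the two losses, e.g.
# loss = ce_validation + lambda_ * ce_explanation (weighting assumed).
```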
Related papers
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z) - Findings of the WMT 2022 Shared Task on Translation Suggestion [63.457874930232926]
We report the result of the first edition of the WMT shared task on Translation Suggestion.
The task aims to provide alternatives for specific words or phrases, given an entire document generated by machine translation (MT).
It consists of two sub-tasks, namely naive translation suggestion and translation suggestion with hints.
arXiv Detail & Related papers (2022-11-30T03:48:36Z) - Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
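One way to picture the latent-skill idea is a layer whose weight is a task-specific mixture of a small inventory of skill matrices; the names and the soft relaxation below are illustrative assumptions, not the paper's exact parameterization.

```python
# Hedged sketch of a task-to-skill allocation layer; illustrative only.
import torch
import torch.nn as nn

class SkillLayer(nn.Module):
    def __init__(self, n_tasks, n_skills, dim_in, dim_out):
        super().__init__()
        # Inventory of skill-specific weight matrices, shared across tasks.
        self.skills = nn.Parameter(torch.randn(n_skills, dim_in, dim_out) * 0.02)
        # Learnable task-to-skill allocation (relaxed binary matrix).
        self.alloc_logits = nn.Parameter(torch.zeros(n_tasks, n_skills))

    def forward(self, x, task_id):
        alloc = torch.sigmoid(self.alloc_logits[task_id])       # soft skill subset
        weight = torch.einsum("s,sio->io", alloc, self.skills)  # mix selected skills
        return x @ weight
```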
arXiv Detail & Related papers (2022-02-28T16:07:19Z) - SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning [47.49596196559958]
This paper introduces the SemEval-2021 shared task 4: Reading Comprehension of Abstract Meaning (ReCAM).
Given a passage and the corresponding question, a participating system is expected to choose the correct answer from five candidates of abstract concepts.
Subtask 1 aims to evaluate how well a system can model concepts that cannot be directly perceived in the physical world.
Subtask 2 focuses on models' ability to comprehend nonspecific concepts located high in a hypernym hierarchy.
Subtask 3 aims to provide some insights into models' generalizability over the two types of abstractness.
arXiv Detail & Related papers (2021-05-31T11:04:17Z) - QiaoNing at SemEval-2020 Task 4: Commonsense Validation and Explanation
system based on ensemble of language model [2.728575246952532]
In this paper, we present the language model system submitted to the SemEval-2020 Task 4 competition: "Commonsense Validation and Explanation".
We applied transfer learning with pretrained language models (BERT, XLNet, RoBERTa, and ALBERT) and fine-tuned them on this task.
The ensembled model solves this problem better, reaching 95.9% accuracy on subtask A, only about 3% below human accuracy.
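A common way to realize such an ensemble is to average class probabilities across the independently fine-tuned models; whether the authors used averaging, voting, or a weighted combination is not stated, so the sketch below is one plausible variant.

```python
# Hedged sketch: average softmax probabilities from several fine-tuned
# sequence classifiers (e.g. BERT, XLNet, RoBERTa, ALBERT checkpoints).
import torch

@torch.no_grad()
def ensemble_predict(models, encoded_inputs):
    """models: fine-tuned classifiers; encoded_inputs: one encoded batch per
    model (each model family needs its own tokenizer's input_ids)."""
    probs = [m(**enc).logits.softmax(-1) for m, enc in zip(models, encoded_inputs)]
    return torch.stack(probs).mean(0).argmax(-1)  # ensembled label per example
```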
arXiv Detail & Related papers (2020-09-06T05:12:50Z) - CS-NET at SemEval-2020 Task 4: Siamese BERT for ComVE [2.0491741153610334]
This paper describes a system for distinguishing between statements that conform to common sense and those that do not.
We use a parallel instance of transformers, which is responsible for a boost in performance.
We achieved an accuracy of 94.8% in subtask A and 89% in subtask B on the test set.
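A "parallel instance of transformers" can be read as a Siamese setup: the same encoder weights score both statements and the scores are compared. The sketch below is our assumption-laden reading of that idea, not the authors' published code.

```python
# Hedged Siamese sketch: one encoder, shared by both statement branches.
import torch
import torch.nn as nn
from transformers import BertModel

class SiameseScorer(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)  # shared weights
        self.scorer = nn.Linear(self.encoder.config.hidden_size, 1)

    def score(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return self.scorer(out.pooler_output).squeeze(-1)  # plausibility score

    def forward(self, ids_a, mask_a, ids_b, mask_b):
        # Both statements pass through the same encoder; a softmax over the
        # two scores decides which statement conforms to common sense.
        return torch.stack([self.score(ids_a, mask_a),
                            self.score(ids_b, mask_b)], dim=-1)
```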
arXiv Detail & Related papers (2020-07-21T14:08:02Z) - LMVE at SemEval-2020 Task 4: Commonsense Validation and Explanation
using Pretraining Language Model [5.428461405329692]
This paper describes our submission to subtasks A and B of SemEval-2020 Task 4.
For subtask A, we use an ALBERT-based model with an improved input form to pick out the common-sense statement from two candidate statements.
For subtask B, we use a multiple-choice model enhanced by a hint-sentence mechanism to select, from the given options, the reason why a statement is against common sense.
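One plausible form of the hint-sentence mechanism (the exact template is our assumption) is to append the sensible counterpart statement from the subtask A data as extra context when scoring each candidate reason:

```python
# Hedged sketch of "hint sentence" input construction for subtask B.
def build_subtask_b_inputs(false_statement, hint_statement, options):
    """One (context, option) pair per candidate reason, multiple-choice style."""
    context = f"{false_statement} [SEP] {hint_statement}"
    return [(context, option) for option in options]

# Canonical ComVE example; the hint is the sensible paired statement.
pairs = build_subtask_b_inputs(
    "He put an elephant into the fridge.",
    "He put a turkey into the fridge.",
    ["An elephant is much bigger than a fridge.",
     "Elephants are usually gray.",
     "An elephant cannot eat a fridge."],
)
```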
arXiv Detail & Related papers (2020-07-06T05:51:10Z) - IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template
Reconstruction Strategy for ComVE [13.334749848189826]
We formalize the subtasks into the multiple-choice question answering format and construct the input with the prompt templates.
Experimental results show that our approaches achieve significant improvements over the baseline systems.
Our approaches secured third rank on both official test sets of the first two subtasks, with accuracies of 96.4 and 94.3, respectively.
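The reconstruction can be pictured as wrapping each candidate in a natural-language template before feeding a multiple-choice model; the template wordings below are illustrative assumptions, not the paper's exact prompts.

```python
# Hedged sketch of prompt-template construction for the ComVE subtasks.
def subtask_a_prompts(statement_1, statement_2):
    template = "Which statement of the two is against common sense? {}"
    return [template.format(s) for s in (statement_1, statement_2)]

def subtask_b_prompts(false_statement, reasons):
    template = "{} is against common sense because {}"
    return [template.format(false_statement, r) for r in reasons]
```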
arXiv Detail & Related papers (2020-07-02T06:59:53Z) - SemEval-2020 Task 4: Commonsense Validation and Explanation [24.389998904122244]
SemEval-2020 Task 4, Commonsense Validation and Explanation (ComVE), includes three subtasks.
We aim to evaluate whether a system can distinguish a natural language statement that makes sense to humans from one that does not.
For Subtask A and Subtask B, the performances of top-ranked systems are close to that of humans.
arXiv Detail & Related papers (2020-07-01T04:41:05Z) - A Simple Language Model for Task-Oriented Dialogue [61.84084939472287]
SimpleTOD is a simple approach to task-oriented dialogue that uses a single, causal language model trained on all sub-tasks recast as a single sequence prediction problem.
This allows SimpleTOD to fully leverage transfer learning from pre-trained, open domain, causal language models such as GPT-2.
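The single-sequence recast can be sketched as concatenating every sub-task annotation with delimiter tokens; the delimiters below are placeholders, as SimpleTOD defines its own special tokens.

```python
# Hedged sketch of SimpleTOD-style training-sequence construction: dialogue
# context, belief state, database result, and response in one string that a
# causal language model learns to continue left to right.
def build_training_sequence(history, belief_state, db_result, response):
    return (f"<context> {history} <belief> {belief_state} "
            f"<db> {db_result} <response> {response} <end>")

seq = build_training_sequence(
    "user: I need a cheap hotel in the north.",
    "hotel price=cheap area=north",
    "3 matches",
    "I found 3 cheap hotels in the north. Any preference?",
)
```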
arXiv Detail & Related papers (2020-05-02T11:09:27Z) - Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for
Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves an F1 score of 91.51% in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)