XRJL-HKUST at SemEval-2021 Task 4: WordNet-Enhanced Dual Multi-head
Co-Attention for Reading Comprehension of Abstract Meaning
- URL: http://arxiv.org/abs/2103.16102v1
- Date: Tue, 30 Mar 2021 06:22:58 GMT
- Title: XRJL-HKUST at SemEval-2021 Task 4: WordNet-Enhanced Dual Multi-head
Co-Attention for Reading Comprehension of Abstract Meaning
- Authors: Yuxin Jiang, Ziyi Shou, Qijun Wang, Hao Wu and Fangzhen Lin
- Abstract summary: This paper presents our submitted system to SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning.
Our system uses a large pre-trained language model as the encoder and an additional dual multi-head co-attention layer to strengthen the relationship between passages and question-answer pairs.
Our system, called WordNet-enhanced DUal Multi-head Co-Attention (WN-DUMA), achieves 86.67% and 89.99% accuracy on the official blind test set of subtask 1 and subtask 2 respectively.
- Score: 6.55600662108243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents our submitted system to SemEval 2021 Task 4: Reading
Comprehension of Abstract Meaning. Our system uses a large pre-trained language
model as the encoder and an additional dual multi-head co-attention layer to
strengthen the relationship between passages and question-answer pairs,
following the current state-of-the-art model DUMA. The main difference is that
we stack the passage-question and question-passage attention modules, rather
than computing them in parallel, to simulate the process of re-considering the
passage. We also add a layer normalization module to improve the performance of
our model. Furthermore, to incorporate external knowledge about abstract
concepts, we retrieve the
definitions of candidate answers from WordNet and feed them to the model as
extra inputs. Our system, called WordNet-enhanced DUal Multi-head Co-Attention
(WN-DUMA), achieves 86.67% and 89.99% accuracy on the official blind test set
of subtask 1 and subtask 2 respectively.
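To make the architecture concrete, below is a minimal, hypothetical PyTorch sketch of the stacked dual multi-head co-attention with layer normalization, plus a small NLTK helper for retrieving WordNet glosses of candidate answers. Class names, pooling choices, and hyperparameters are illustrative assumptions, not the authors' released implementation.
```python
# Minimal sketch of a WN-DUMA-style fusion layer. Class names, pooling,
# and hyperparameters are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")


def candidate_gloss(word: str) -> str:
    """Fetch the first WordNet definition of a candidate answer, if any."""
    synsets = wn.synsets(word)
    return synsets[0].definition() if synsets else ""


class StackedDualCoAttention(nn.Module):
    """Stacked (sequential) passage / question-answer co-attention with
    layer normalization, rather than two parallel attention branches."""

    def __init__(self, hidden_size: int = 1024, num_heads: int = 8):
        super().__init__()
        self.qa_to_passage = nn.MultiheadAttention(
            hidden_size, num_heads, batch_first=True)
        self.passage_to_qa = nn.MultiheadAttention(
            hidden_size, num_heads, batch_first=True)
        self.norm_qa = nn.LayerNorm(hidden_size)
        self.norm_passage = nn.LayerNorm(hidden_size)

    def forward(self, passage: torch.Tensor, qa: torch.Tensor) -> torch.Tensor:
        # passage: (batch, p_len, hidden); qa: (batch, qa_len, hidden),
        # where qa encodes question + candidate answer + WordNet gloss.
        # Stage 1: the question-answer pair attends to the passage.
        qa_ctx, _ = self.qa_to_passage(query=qa, key=passage, value=passage)
        qa_ctx = self.norm_qa(qa_ctx + qa)  # residual + layer norm
        # Stage 2 (stacked, not parallel): the passage attends to the
        # *updated* question-answer representation, simulating re-reading.
        p_ctx, _ = self.passage_to_qa(query=passage, key=qa_ctx, value=qa_ctx)
        p_ctx = self.norm_passage(p_ctx + passage)
        # Mean-pool both views and concatenate for answer scoring.
        return torch.cat([p_ctx.mean(dim=1), qa_ctx.mean(dim=1)], dim=-1)
```
A linear classification head over the fused vector would then score each candidate answer; the gloss returned by candidate_gloss would simply be appended to the question-answer text before encoding, mirroring the paper's extra-input idea.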
Related papers
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation tasks on NYU Depth V2 and KITTI, and in the semantic segmentation task on Cityscapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- Co-guiding for Multi-intent Spoken Language Understanding [53.30511968323911]
We propose a novel model termed Co-guiding Net, which implements a two-stage framework achieving mutual guidance between the two tasks.
For the first stage, we propose single-task supervised contrastive learning, and for the second stage, we propose co-guiding supervised contrastive learning.
Experiment results on multi-intent SLU show that our model outperforms existing models by a large margin.
arXiv Detail & Related papers (2023-11-22T08:06:22Z)
- DOMINO: A Dual-System for Multi-step Visual Language Reasoning [76.69157235928594]
We propose a dual-system for multi-step multimodal reasoning, which consists of a "System-1" step for visual information extraction and a "System-2" step for deliberate reasoning.
Our method with a pre-trained System-2 module performs competitively compared to prior work on in- and out-of-distribution data.
arXiv Detail & Related papers (2023-10-04T13:29:47Z)
- MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks [59.09343552273045]
We propose a decoder-only model for multimodal tasks, which is surprisingly effective at jointly learning these disparate vision-language tasks.
We demonstrate that joint learning of these diverse objectives is simple, effective, and maximizes the weight-sharing of the model across these tasks.
Our model achieves the state of the art on image-text and text-image retrieval, video question answering and open-vocabulary detection tasks, outperforming much larger and more extensively trained foundational models.
arXiv Detail & Related papers (2023-03-29T16:42:30Z)
- Neural Coreference Resolution based on Reinforcement Learning [53.73316523766183]
Coreference resolution systems need to solve two subtasks: detecting all potential mentions in the text, and linking each mention to its antecedent.
We propose a reinforcement learning actor-critic-based neural coreference resolution system.
arXiv Detail & Related papers (2022-12-18T07:36:35Z)
- JOIST: A Joint Speech and Text Streaming Model For ASR [63.15848310748753]
We present JOIST, an algorithm for training a streaming, cascaded-encoder end-to-end (E2E) model on both paired speech-text inputs and unpaired text-only inputs.
We find that the best text representation for JOIST improves WER across a variety of search and rare-word test sets by 4-14% relative, compared to a model not trained with text.
arXiv Detail & Related papers (2022-10-13T20:59:22Z)
- MoCA: Incorporating Multi-stage Domain Pretraining and Cross-guided Multimodal Attention for Textbook Question Answering [7.367945534481411]
We propose a novel model named MoCA, which incorporates multi-stage domain pretraining and multimodal cross attention for the Textbook Question Answering task.
The experimental results show the superiority of our model, which outperforms the state-of-the-art methods by 2.21% and 2.43% on the validation and test splits, respectively.
arXiv Detail & Related papers (2021-12-06T07:58:53Z)
- OCHADAI-KYODAI at SemEval-2021 Task 1: Enhancing Model Generalization and Robustness for Lexical Complexity Prediction [8.066349353140819]
We propose an ensemble model for predicting the lexical complexity of words and multiword expressions.
The model receives as input a sentence with a target word or MWE and outputs its complexity score.
Our model achieved competitive results and ranked among the top-10 systems in both sub-tasks.
arXiv Detail & Related papers (2021-05-12T09:27:46Z)
- IIE-NLP-Eyas at SemEval-2021 Task 4: Enhancing PLM for ReCAM with Special Tokens, Re-Ranking, Siamese Encoders and Back Translation [8.971288666318719]
This paper introduces our systems for all three subtasks of SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning.
We design many simple and effective approaches adapted to the backbone model (RoBERTa).
Experimental results show that our approaches achieve significant improvements over the baseline systems.
arXiv Detail & Related papers (2021-02-25T10:51:48Z)
- LRG at SemEval-2021 Task 4: Improving Reading Comprehension with Abstract Words using Augmentation, Linguistic Features and Voting [0.6850683267295249]
Given a fill-in-the-blank-type question, the task is to predict the most suitable word from a list of 5 options.
We use encoders of transformer-based models pre-trained on the masked language modelling (MLM) task to build our Fill-in-the-blank (FitB) models.
We propose variants, namely Chunk Voting and Max Context, to handle the input-length restrictions of models such as BERT; a minimal sketch of this MLM-based option scoring follows.
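As a concrete illustration (not the LRG authors' code), the hypothetical Hugging Face snippet below scores each option by the masked-LM logit at the blank position; it assumes single-token options and the @placeholder convention of the ReCAM data.
```python
# Hypothetical sketch of masked-LM fill-in-the-blank scoring; model choice
# and the single-token-option assumption are simplifications, not LRG's code.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()


def best_option(question: str, options: list[str]) -> str:
    """Pick the option with the highest masked-LM logit at the blank."""
    text = question.replace("@placeholder", tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos[0].item()]
    # Assumes each option maps to a single vocabulary token.
    option_ids = tokenizer.convert_tokens_to_ids(options)
    return options[int(logits[option_ids].argmax())]
```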
arXiv Detail & Related papers (2021-02-24T12:33:12Z)
- Hierarchical Multi Task Learning with Subword Contextual Embeddings for Languages with Rich Morphology [5.5217350574838875]
Morphological information is important for many sequence labeling tasks in Natural Language Processing (NLP).
We propose using subword contextual embeddings to capture morphological information for languages with rich morphology.
Our model outperforms previous state-of-the-art models on both tasks for the Turkish language.
arXiv Detail & Related papers (2020-04-25T22:55:56Z)