Qtrade AI at SemEval-2022 Task 11: An Unified Framework for Multilingual
NER Task
- URL: http://arxiv.org/abs/2204.07459v1
- Date: Thu, 14 Apr 2022 07:51:36 GMT
- Title: Qtrade AI at SemEval-2022 Task 11: An Unified Framework for Multilingual
NER Task
- Authors: Weichao Gan, Yuanping Lin, Guangbo Yu, Guimin Chen and Qian Ye
- Abstract summary: This paper describes our system, which placed third in the Multilingual Track (subtask 11), fourth in the Code-Mixed Track (subtask 12), and seventh in the Chinese Track (subtask 9).
Our system's key contributions are as follows: 1) for multilingual NER tasks, we offer a unified framework with which one can easily run single-language or multilingual NER tasks; 2) for the low-resource code-mixed NER task, one can easily enhance the dataset by applying several simple data augmentation methods; and 3) for the Chinese task, we propose a model that captures Chinese lexical semantics, lexical boundaries, and lexical graph-structural information.
- Score: 10.167123492952694
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes our system, which placed third in the Multilingual Track
(subtask 11), fourth in the Code-Mixed Track (subtask 12), and seventh in the
Chinese Track (subtask 9) of SemEval-2022 Task 11: MultiCoNER Multilingual
Complex Named Entity Recognition. Our system's key contributions are as
follows: 1) for multilingual NER tasks, we offer a unified framework with
which one can easily run single-language or multilingual NER tasks; 2) for the
low-resource code-mixed NER task, one can easily enhance the dataset by
applying several simple data augmentation methods; and 3) for the Chinese
task, we propose a model that captures Chinese lexical semantics, lexical
boundaries, and lexical graph-structural information. Finally, our system
achieves macro-F1 scores of 77.66, 84.35, and 74.00 on subtasks 11, 12, and 9,
respectively, during the testing phase.
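To make the first contribution concrete, the following is a minimal sketch of what a unified multilingual NER setup of this kind typically looks like: a single multilingual encoder (XLM-RoBERTa here, chosen as an assumption) with a token-classification head, fine-tuned on data pooled from any mix of languages. The label set, toy corpus, and hyperparameters are placeholders, not the authors' released code.

```python
# Minimal sketch of a "unified" multilingual NER setup: one multilingual
# encoder (XLM-R) with a token-classification head, fine-tuned on sentences
# pooled from any mix of languages.  Labels and data below are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

LABELS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-CORP", "I-CORP"]
label2id = {label: i for i, label in enumerate(LABELS)}

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=len(LABELS)
)

def encode(tokens, tags):
    """Align word-level BIO tags with subword tokens (-100 = ignored)."""
    enc = tokenizer(tokens, is_split_into_words=True,
                    truncation=True, return_tensors="pt")
    word_ids = enc.word_ids(0)
    labels, prev = [], None
    for wid in word_ids:
        if wid is None:
            labels.append(-100)               # special tokens
        elif wid != prev:
            labels.append(label2id[tags[wid]])
        else:
            labels.append(-100)               # only label the first subword
        prev = wid
    enc["labels"] = torch.tensor([labels])
    return enc

# Toy "pooled" corpus: the same model consumes sentences from any language.
corpus = [
    (["Angela", "Merkel", "visited", "Paris"],
     ["B-PER", "I-PER", "O", "B-LOC"]),
    (["小米", "发布", "新", "手机"],
     ["B-CORP", "O", "O", "O"]),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(2):
    for tokens, tags in corpus:
        batch = encode(tokens, tags)
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

Because nothing in this pipeline is language-specific, the same script covers single-language or multilingual training simply by changing which sentences are pooled into the corpus.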
Related papers
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z) - PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text? [4.463184061618504]
We present our submission to SemEval-2024 Task 8, "Multigenerator, Multidomain, and Black-Box Machine-Generated Text Detection".
Our approach combines embeddings from RoBERTa-base with diversity features and uses a resampled training set.
Our results show that our approach is generalizable across unseen models and domains, achieving an accuracy of 0.91.
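As a rough illustration of this style of detector, the sketch below concatenates a mean-pooled RoBERTa-base sentence embedding with a couple of hand-crafted lexical-diversity features and trains a lightweight classifier on top; the particular features, pooling, and classifier are assumptions for illustration, not the authors' exact pipeline.

```python
# Illustrative sketch: sentence embedding from RoBERTa-base concatenated with
# simple lexical-diversity features, fed to a linear classifier.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

def diversity_features(text: str) -> np.ndarray:
    tokens = text.lower().split()
    ttr = len(set(tokens)) / max(len(tokens), 1)        # type-token ratio
    avg_len = sum(len(t) for t in tokens) / max(len(tokens), 1)
    return np.array([ttr, avg_len], dtype=np.float32)

@torch.no_grad()
def embed(text: str) -> np.ndarray:
    enc = tokenizer(text, truncation=True, return_tensors="pt")
    hidden = encoder(**enc).last_hidden_state            # (1, T, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()         # mean pooling

def featurize(text: str) -> np.ndarray:
    return np.concatenate([embed(text), diversity_features(text)])

# Toy labels: 0 = human-written, 1 = machine-generated.
texts = ["I jotted this down on the train home.",
         "As an AI language model, I can certainly help with that."]
labels = [0, 1]
clf = LogisticRegression(max_iter=1000).fit([featurize(t) for t in texts], labels)
print(clf.predict([featurize("Sure, here is a detailed overview.")]))
```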
arXiv Detail & Related papers (2024-04-08T13:05:02Z) - MasonTigers at SemEval-2024 Task 1: An Ensemble Approach for Semantic Textual Relatedness [5.91695168183101]
This paper presents the MasonTigers entry to the SemEval-2024 Task 1 - Semantic Textual Relatedness.
The task encompasses supervised (Track A), unsupervised (Track B), and cross-lingual (Track C) approaches across 14 different languages.
Our approaches achieved rankings ranging from 11th to 21st in Track A, from 1st to 8th in Track B, and from 5th to 12th in Track C.
arXiv Detail & Related papers (2024-03-22T06:47:42Z) - UMBCLU at SemEval-2024 Task 1A and 1C: Semantic Textual Relatedness with and without machine translation [0.09208007322096534]
The aim of SemEval-2024 Task 1 is to develop models for identifying semantic textual relatedness between two sentences.
We develop two STR models, TranSem and FineSem, for the supervised and cross-lingual settings.
arXiv Detail & Related papers (2024-02-20T05:46:29Z) - UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions [64.50935101415776]
We build a single model that jointly performs various spoken language understanding (SLU) tasks.
We demonstrate the efficacy of our single multi-task learning model "UniverSLU" for 12 speech classification and sequence generation task types spanning 17 datasets and 9 languages.
arXiv Detail & Related papers (2023-10-04T17:10:23Z) - Bridging Cross-Lingual Gaps During Leveraging the Multilingual
Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with extra code-switching restore task to bridge the gap between the pretrain and finetune stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z) - Efficient Dialogue State Tracking by Masked Hierarchical Transformer [0.3441021278275805]
We build a cross-lingual dialog state tracker with a training set in a rich-resource language and a test set in a low-resource language.
We formulate a method for joint learning of slot operation classification task and state tracking task.
arXiv Detail & Related papers (2021-06-28T07:35:49Z) - N-LTP: An Open-source Neural Language Technology Platform for Chinese [68.58732970171747]
N-LTP is an open-source neural language technology platform supporting six fundamental Chinese NLP tasks.
N-LTP adopts the multi-task framework by using a shared pre-trained model, which has the advantage of capturing the shared knowledge across relevant Chinese tasks.
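The shared-model multi-task pattern described here (one pretrained encoder whose representations feed several lightweight task heads) can be sketched generically as follows; this illustrates the general idea only and is not N-LTP's actual code or API, with the encoder name and label counts chosen as placeholders.

```python
# Generic sketch of the shared-encoder multi-task pattern: one pretrained
# encoder whose token representations feed several task-specific heads
# (e.g. word segmentation, POS tagging, NER).  Not N-LTP's actual code.
import torch.nn as nn
from transformers import AutoModel

class SharedEncoderMultiTask(nn.Module):
    def __init__(self, encoder_name: str, task_label_sizes: dict):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # One linear head per task, all reading the same shared encoder.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, n) for task, n in task_label_sizes.items()}
        )

    def forward(self, task: str, **enc_inputs):
        hidden_states = self.encoder(**enc_inputs).last_hidden_state
        return self.heads[task](hidden_states)   # per-token logits for `task`

model = SharedEncoderMultiTask(
    "bert-base-chinese",
    {"segmentation": 4, "pos": 30, "ner": 9},     # illustrative label counts
)
```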
arXiv Detail & Related papers (2020-09-24T11:45:39Z) - CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot
Cross-Lingual NLP [68.2650714613869]
We propose a data augmentation framework to generate multi-lingual code-switching data to fine-tune mBERT.
Compared with the existing work, our method does not rely on bilingual sentences for training, and requires only one training process for multiple target languages.
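A minimal sketch of dictionary-based code-switching augmentation in this spirit is shown below (not the CoSDA-ML reference implementation; the tiny lexicon and substitution rate are placeholders). Because words are replaced one-for-one, word-level labels such as NER tags stay aligned, which is what makes this style of augmentation attractive for low-resource code-mixed tasks like subtask 12 above.

```python
# Minimal sketch of dictionary-based code-switching augmentation: randomly
# replace source-language words with translations from a bilingual lexicon.
# The tiny dictionary and the default 30% substitution rate are illustrative.
import random

EN_ZH = {"company": "公司", "founded": "成立", "new": "新的", "city": "城市"}

def code_switch(tokens, lexicon, rate=0.3, seed=None):
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        trans = lexicon.get(tok.lower())
        if trans is not None and rng.random() < rate:
            out.append(trans)      # switch this word into the target language
        else:
            out.append(tok)        # keep the original word
    return out

sentence = "The company founded a new office in the city".split()
print(code_switch(sentence, EN_ZH, rate=0.5, seed=0))
```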
arXiv Detail & Related papers (2020-06-11T13:15:59Z) - XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training,
Understanding and Generation [100.09099800591822]
XGLUE is a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models.
XGLUE provides 11 diversified tasks that cover both natural language understanding and generation scenarios.
arXiv Detail & Related papers (2020-04-03T07:03:12Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.