X-PuDu at SemEval-2022 Task 6: Multilingual Learning for English and
Arabic Sarcasm Detection
- URL: http://arxiv.org/abs/2211.16883v1
- Date: Wed, 30 Nov 2022 10:34:08 GMT
- Title: X-PuDu at SemEval-2022 Task 6: Multilingual Learning for English and
Arabic Sarcasm Detection
- Authors: Yaqian Han, Yekun Chai, Shuohuan Wang, Yu Sun, Hongyi Huang, Guanghao
Chen, Yitong Xu, Yang Yang
- Abstract summary: This paper describes the X-PuDu system that participated in SemEval-2022 Task 6, iSarcasmEval - Intended Sarcasm Detection in English and Arabic.
Our solution fine-tunes pre-trained language models, such as ERNIE-M and DeBERTa, in a multilingual setting to recognize irony in Arabic and English texts.
- Score: 12.241656852060606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting sarcasm and verbal irony from people's subjective statements is
crucial to understanding their intended meanings and real sentiments and
positions in social scenarios. This paper describes the X-PuDu system that
participated in SemEval-2022 Task 6, iSarcasmEval - Intended Sarcasm Detection
in English and Arabic, which aims at detecting intended sarcasm in various
settings of natural language understanding. Our solution fine-tunes pre-trained
language models, such as ERNIE-M and DeBERTa, in a multilingual setting
to recognize irony in Arabic and English texts. Our system ranked second
out of 43, and ninth out of 32 in Task A: one-sentence detection in English and
Arabic; fifth out of 22 in Task B: binary multi-label classification in
English; first out of 16, and fifth out of 13 in Task C: sentence-pair
detection in English and Arabic.
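As a rough illustration of the three subtask formats named in the abstract, the sketch below expresses each one as a small Python function over model scores. It is not the X-PuDu implementation. The label names in `TASK_B_LABELS`, the 0.5 threshold, the reading of Task C as "pick the more sarcastic text of the pair", and all function names are assumptions made for illustration; the scores stand in for the output of a fine-tuned model such as ERNIE-M or DeBERTa.

```python
from typing import Dict, List

# Assumed label set for Task B; the abstract does not list the categories.
TASK_B_LABELS = ["sarcasm", "irony", "satire",
                 "understatement", "overstatement", "rhetorical_question"]


def task_a_predict(score: float, threshold: float = 0.5) -> int:
    """Task A: one text in, one binary label out (1 = sarcastic)."""
    return int(score >= threshold)


def task_b_predict(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Task B: binary multi-label; each label is decided independently."""
    return [label for label in TASK_B_LABELS
            if scores.get(label, 0.0) >= threshold]


def task_c_predict(score_first: float, score_second: float) -> int:
    """Task C (assumed framing): return the index (0 or 1) of the text
    in the pair that received the higher sarcasm score."""
    return 0 if score_first >= score_second else 1
```

For example, `task_b_predict({"sarcasm": 0.9, "irony": 0.6, "satire": 0.1})` keeps only the labels whose score reaches the threshold, in the order given by `TASK_B_LABELS`.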
Related papers
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z)
- SemEval 2024 -- Task 10: Emotion Discovery and Reasoning its Flip in
Conversation (EDiReF) [61.49972925493912]
SemEval-2024 Task 10 is a shared task centred on identifying emotions in code-mixed dialogues.
This task comprises three distinct subtasks - emotion recognition in conversation for code-mixed dialogues, emotion flip reasoning for code-mixed dialogues, and emotion flip reasoning for English dialogues.
A total of 84 participants engaged in this task, with the most adept systems attaining F1-scores of 0.70, 0.79, and 0.76 for the respective subtasks.
arXiv Detail & Related papers (2024-02-29T08:20:06Z)
- ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic [51.922112625469836]
We present ArabicMMLU, the first multi-task language understanding benchmark for the Arabic language.
Our data comprises 40 tasks and 14,575 multiple-choice questions in Modern Standard Arabic (MSA) and is carefully constructed by collaborating with native speakers in the region.
Our evaluations of 35 models reveal substantial room for improvement, particularly among the best open-source models.
arXiv Detail & Related papers (2024-02-20T09:07:41Z)
- Overview of Abusive and Threatening Language Detection in Urdu at FIRE
2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, the m-BERT-based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z)
- MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question
Answering for 16 Diverse Languages [54.002969723086075]
We evaluate cross-lingual open-retrieval question answering systems in 16 typologically diverse languages.
The best system leveraging iteratively mined diverse negative examples achieves 32.2 F1, outperforming our baseline by 4.5 points.
The second best system uses entity-aware contextualized representations for document retrieval, and achieves significant improvements in Tamil (20.8 F1), whereas most of the other systems yield nearly zero scores.
arXiv Detail & Related papers (2022-07-02T06:54:10Z)
- CS-UM6P at SemEval-2022 Task 6: Transformer-based Models for Intended
Sarcasm Detection in English and Arabic [6.221019624345408]
Sarcasm is a form of figurative language where the intended meaning of a sentence differs from its literal meaning.
In this paper, we present our system for the intended sarcasm detection task in English and Arabic.
arXiv Detail & Related papers (2022-06-16T19:14:54Z)
- BFCAI at SemEval-2022 Task 6: Multi-Layer Perceptron for Sarcasm
Detection in Arabic Texts [0.0]
This paper describes the systems submitted to the iSarcasm shared task.
The aim of iSarcasm is to identify the sarcastic contents in Arabic and English text.
A multi-Layer machine learning based model has been submitted for Arabic sarcasm detection.
arXiv Detail & Related papers (2022-05-18T11:33:07Z)
- Multilingual AMR Parsing with Noisy Knowledge Distillation [68.01173640691094]
We study multilingual AMR parsing from the perspective of knowledge distillation, where the aim is to learn and improve a multilingual AMR parser by using an existing English parser as its teacher.
We identify that noisy input and precise output are the key to successful distillation.
arXiv Detail & Related papers (2021-09-30T15:13:48Z)
- Sarcasm Detection and Quantification in Arabic Tweets [7.173484352846755]
This paper intends to create a new human-annotated Arabic corpus for sarcasm detection collected from tweets.
The proposed approach tackles the problem as a regression problem instead of classification.
arXiv Detail & Related papers (2021-08-03T11:48:27Z)
- Combining Context-Free and Contextualized Representations for Arabic
Sarcasm Detection and Sentiment Identification [0.0]
This paper proffers team SPPU-AASM's submission for the WANLP ArSarcasm shared-task 2021, which centers around the sarcasm and sentiment polarity detection of Arabic tweets.
The proposed system achieves an F1-sarcastic score of 0.62 and an F-PN score of 0.715 for the sarcasm and sentiment detection tasks, respectively.
arXiv Detail & Related papers (2021-03-09T19:39:43Z)
- AraBERT and Farasa Segmentation Based Approach For Sarcasm and Sentiment
Detection in Arabic Tweets [0.0]
One of the subtasks aims at developing a system that identifies whether a given Arabic tweet is sarcastic in nature or not.
The other aims to identify the sentiment of the Arabic tweet.
Our final approach was ranked seventh and fourth in the Sarcasm and Sentiment Detection subtasks respectively.
arXiv Detail & Related papers (2021-03-02T12:33:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.