SemEval-2020 Task 12: Multilingual Offensive Language Identification in
Social Media (OffensEval 2020)
- URL: http://arxiv.org/abs/2006.07235v2
- Date: Wed, 30 Sep 2020 15:46:44 GMT
- Title: SemEval-2020 Task 12: Multilingual Offensive Language Identification in
Social Media (OffensEval 2020)
- Authors: Marcos Zampieri, Preslav Nakov, Sara Rosenthal, Pepa Atanasova, Georgi
Karadzhov, Hamdy Mubarak, Leon Derczynski, Zeses Pitenis, Çağrı Çöltekin
- Abstract summary: We present the results and main findings of SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2020)
OffensEval 2020 was one of the most popular tasks at SemEval-2020 attracting a large number of participants across all subtasks and also across all languages.
A total of 528 teams signed up to participate in the task, 145 teams submitted systems during the evaluation period, and 70 submitted system description papers.
- Score: 33.66689662526814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present the results and main findings of SemEval-2020 Task 12 on
Multilingual Offensive Language Identification in Social Media (OffensEval
2020). The task involves three subtasks corresponding to the hierarchical
taxonomy of the OLID schema (Zampieri et al., 2019a) from OffensEval 2019. The
task featured five languages: English, Arabic, Danish, Greek, and Turkish for
Subtask A. In addition, English also featured Subtasks B and C. OffensEval 2020
was one of the most popular tasks at SemEval-2020 attracting a large number of
participants across all subtasks and also across all languages. A total of 528
teams signed up to participate in the task, 145 teams submitted systems during
the evaluation period, and 70 submitted system description papers.
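The three subtasks map directly onto the nested OLID labels: level B applies only to tweets marked offensive at level A, and level C only to tweets marked targeted at level B. Below is a minimal Python sketch of that hierarchy using the standard OLID label names from Zampieri et al. (2019a); the cascade helper is illustrative, not the organisers' reference code.

from typing import Dict, Optional

# Label names follow the OLID schema (Zampieri et al., 2019a).
OLID_TAXONOMY = {
    "A": {"NOT", "OFF"},         # offensive or not (all five languages)
    "B": {"TIN", "UNT"},         # targeted insult vs. untargeted (English only)
    "C": {"IND", "GRP", "OTH"},  # target: individual, group, or other (English only)
}

def hierarchical_labels(level_a: str,
                        level_b: Optional[str] = None,
                        level_c: Optional[str] = None) -> Dict[str, str]:
    """Return the labels that apply to a tweet, respecting the hierarchy:
    B is defined only when A = OFF, and C only when B = TIN."""
    labels = {"A": level_a}
    if level_a == "OFF" and level_b is not None:
        labels["B"] = level_b
        if level_b == "TIN" and level_c is not None:
            labels["C"] = level_c
    return labels

# Example: an offensive tweet that insults an individual.
print(hierarchical_labels("OFF", "TIN", "IND"))  # {'A': 'OFF', 'B': 'TIN', 'C': 'IND'}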
Related papers
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z)
- SemEval 2024 -- Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF) [61.49972925493912]
SemEval-2024 Task 10 is a shared task centred on identifying emotions in code-mixed dialogues.
This task comprises three distinct subtasks - emotion recognition in conversation for code-mixed dialogues, emotion flip reasoning for code-mixed dialogues, and emotion flip reasoning for English dialogues.
A total of 84 participants engaged in this task, with the most adept systems attaining F1-scores of 0.70, 0.79, and 0.76 for the respective subtasks.
arXiv Detail & Related papers (2024-02-29T08:20:06Z)
- ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z)
- Overview of Abusive and Threatening Language Detection in Urdu at FIRE 2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, the m-BERT-based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z)
- UPB at SemEval-2020 Task 12: Multilingual Offensive Language Detection on Social Media by Fine-tuning a Variety of BERT-based Models [0.0]
This paper describes our Transformer-based solutions for identifying offensive language on Twitter in five languages, as employed in Subtask A of the OffensEval 2020 shared task.
arXiv Detail & Related papers (2020-10-26T14:28:29Z)
- BRUMS at SemEval-2020 Task 12: Transformer based Multilingual Offensive Language Identification in Social Media [9.710464466895521]
We present a multilingual deep learning model to identify offensive language in social media.
The approach achieves acceptable evaluation scores, while maintaining flexibility between languages.
arXiv Detail & Related papers (2020-10-13T10:39:14Z)
- WOLI at SemEval-2020 Task 12: Arabic Offensive Language Identification on Different Twitter Datasets [0.0]
A key to fighting offensive language on social media is the existence of an automatic offensive language detection system.
In this paper, we describe the system submitted by WideBot AI Lab for the shared task, which ranked 10th out of 52 participants with a macro-F1 of 86.9%.
We also introduce a neural network approach that enhances the predictive ability of our system, combining CNN, highway network, Bi-LSTM, and attention layers.
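A minimal sketch of how a classifier stacking those components (CNN, highway, Bi-LSTM, attention) could be wired together; layer sizes and ordering are illustrative assumptions rather than the submitted system's exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Highway(nn.Module):
    """y = g * relu(W1 x) + (1 - g) * x, with gate g = sigmoid(W2 x)."""
    def __init__(self, dim: int):
        super().__init__()
        self.transform = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):
        gate = torch.sigmoid(self.gate(x))
        return gate * F.relu(self.transform(x)) + (1 - gate) * x

class CnnHighwayBiLstmAttention(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 128,
                 conv_channels: int = 128, lstm_hidden: int = 128,
                 num_classes: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel_size=3, padding=1)
        self.highway = Highway(conv_channels)
        self.bilstm = nn.LSTM(conv_channels, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * lstm_hidden, 1)
        self.out = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embed(token_ids)                      # (batch, seq, emb)
        x = F.relu(self.conv(x.transpose(1, 2)))       # (batch, channels, seq)
        x = self.highway(x.transpose(1, 2))            # (batch, seq, channels)
        h, _ = self.bilstm(x)                          # (batch, seq, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # attention over tokens
        context = (weights * h).sum(dim=1)             # (batch, 2*hidden)
        return self.out(context)                       # (batch, num_classes)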
arXiv Detail & Related papers (2020-09-11T14:10:03Z)
- Garain at SemEval-2020 Task 12: Sequence based Deep Learning for Categorizing Offensive Language in Social Media [3.236217153362305]
SemEval-2020 Task 12 was OffensEval: Multilingual Offensive Language Identification in Social Media.
Trained on 25% of the whole dataset, my system achieved a macro-averaged F1 score of 47.763%.
arXiv Detail & Related papers (2020-09-02T17:09:29Z)
- LIIR at SemEval-2020 Task 12: A Cross-Lingual Augmentation Approach for Multilingual Offensive Language Identification [19.23116755449024]
We adapt and fine-tune the BERT and Multilingual BERT models made available by Google AI for English and the non-English languages, respectively.
For the English language, we use a combination of two fine-tuned BERT models.
For the other languages, we propose a cross-lingual augmentation approach to enrich the training data, and we use Multilingual BERT to obtain sentence representations.
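A minimal sketch of the general shape of such an approach, assuming a hypothetical translate() helper for the augmentation step and mean-pooled Multilingual BERT outputs for the sentence representations; the paper's actual pipeline may differ.

from typing import Callable, List, Tuple

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")

def augment(english_data: List[Tuple[str, int]],
            target_data: List[Tuple[str, int]],
            translate: Callable[[str], str]) -> List[Tuple[str, int]]:
    """Enrich a small target-language training set with translated English
    tweets, keeping the original offensive/not-offensive labels.
    The translate() callable is a placeholder for any MT component."""
    translated = [(translate(text), label) for text, label in english_data]
    return target_data + translated

def sentence_embedding(text: str) -> torch.Tensor:
    """Mean-pooled Multilingual BERT representation of a single tweet."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)              # (768,)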
arXiv Detail & Related papers (2020-05-07T18:45:48Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves a 91.51% F1 score on English Subtask A, which is comparable to the first-place result.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
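A minimal sketch of a multi-task setup of the kind described in the Kungfupanda entry above: a shared BERT encoder with one classification head per OLID subtask. Head sizes and the loss-masking note are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
from transformers import AutoModel

class MultiTaskOffenseModel(nn.Module):
    def __init__(self, encoder_name: str = "bert-base-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.head_a = nn.Linear(hidden, 2)  # Subtask A: OFF vs NOT
        self.head_b = nn.Linear(hidden, 2)  # Subtask B: TIN vs UNT
        self.head_c = nn.Linear(hidden, 3)  # Subtask C: IND, GRP, OTH

    def forward(self, input_ids, attention_mask):
        # Use the [CLS] token representation as a shared sentence encoding.
        cls = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state[:, 0]
        return self.head_a(cls), self.head_b(cls), self.head_c(cls)

# Training would sum per-subtask cross-entropy losses, masking out subtasks
# whose labels are undefined for a given tweet (e.g. B and C when A = NOT).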