Ghmerti at SemEval-2019 Task 6: A Deep Word- and Character-based
Approach to Offensive Language Identification
- URL: http://arxiv.org/abs/2009.10792v1
- Date: Tue, 22 Sep 2020 20:13:48 GMT
- Title: Ghmerti at SemEval-2019 Task 6: A Deep Word- and Character-based
Approach to Offensive Language Identification
- Authors: Ehsan Doostmohammadi, Hossein Sameti, Ali Saffar
- Abstract summary: OffensEval addresses the problem of identifying and categorizing offensive language in social media.
The proposed approach includes a character-level Convolutional Neural Network, a word-level Recurrent Neural Network, and some preprocessing.
The proposed model achieves a macro-averaged F1-score of 77.93% on subtask A.
- Score: 1.192436948211501
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents the models submitted by the Ghmerti team for subtasks A and B
of the OffensEval shared task at SemEval 2019. OffensEval addresses the problem
of identifying and categorizing offensive language in social media in three
subtasks: whether or not a piece of content is offensive (subtask A), whether it
is targeted (subtask B), and whether the target is an individual, a group, or
another entity (subtask C). The proposed approach includes a character-level
Convolutional Neural Network, a word-level Recurrent Neural Network, and some
preprocessing. The proposed model achieves a macro-averaged F1-score of 77.93%
on subtask A.
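The macro-averaged F1-score reported above is the unweighted mean of the per-class F1 scores, so the minority (offensive) class counts as much as the majority class. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the toy labels are illustrative assumptions.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    # Count true positives, false positives and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: "OFF" = offensive, "NOT" = not offensive.
y_true = ["OFF", "NOT", "NOT", "OFF", "NOT", "NOT"]
y_pred = ["OFF", "NOT", "OFF", "NOT", "NOT", "NOT"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

In this toy example the per-class F1 scores are 0.5 for "OFF" and 0.75 for "NOT", and the macro average weights them equally regardless of how many examples each class has.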
Related papers
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z)
- Exploiting Contextual Target Attributes for Target Sentiment Classification [53.30511968323911]
Existing PTLM-based models for TSC can be categorized into two groups: 1) fine-tuning-based models that adopt PTLM as the context encoder; 2) prompting-based models that transfer the classification task to the text/word generation task.
We present a new perspective of leveraging PTLM for TSC: simultaneously leveraging the merits of both language modeling and explicit target-context interactions via contextual target attributes.
arXiv Detail & Related papers (2023-12-21T11:45:28Z)
- CL-UZH at SemEval-2023 Task 10: Sexism Detection through Incremental Fine-Tuning and Multi-Task Learning with Label Descriptions [0.0]
The SemEval shared task Towards Explainable Detection of Online Sexism (EDOS 2023) is to detect sexism in English social media posts.
We present our submitted systems for all three subtasks, based on a multi-task model that has been fine-tuned on a range of related tasks.
We implement multi-task learning by formulating each task as binary pairwise text classification, where the dataset and label descriptions are given along with the input text.
arXiv Detail & Related papers (2023-06-06T17:59:49Z)
- Subsidiary Prototype Alignment for Universal Domain Adaptation [58.431124236254]
A major problem in Universal Domain Adaptation (UniDA) is misalignment of "known" and "unknown" classes.
We propose a novel word-histogram-related pretext task to enable closed-set SPA, operating in conjunction with goal task UniDA.
We demonstrate the efficacy of our approach on top of existing UniDA techniques, yielding state-of-the-art performance across three standard UniDA and Open-Set DA object recognition benchmarks.
arXiv Detail & Related papers (2022-10-28T05:32:14Z)
- Identifying and Categorizing Offensive Language in Social Media [0.0]
This study provides a description of a classification system built for SemEval 2019 Task 6: OffensEval.
We trained machine learning and deep learning models along with data preprocessing and sampling techniques to come up with the best results.
arXiv Detail & Related papers (2021-04-10T22:53:43Z)
- LRG at SemEval-2021 Task 4: Improving Reading Comprehension with Abstract Words using Augmentation, Linguistic Features and Voting [0.6850683267295249]
Given a fill-in-the-blank-type question, the task is to predict the most suitable word from a list of 5 options.
We use encoders of transformers-based models pre-trained on the masked language modelling (MLM) task to build our Fill-in-the-blank (FitB) models.
We propose variants, namely Chunk Voting and Max Context, to take care of input length restrictions for BERT, etc.
arXiv Detail & Related papers (2021-02-24T12:33:12Z)
- PUM at SemEval-2020 Task 12: Aggregation of Transformer-based models' features for offensive language recognition [0.0]
Our team was ranked 7th out of 40 in Sub-task C (offense target identification) with a 64.727% macro F1-score, and 64th out of 85 in Sub-task A (offensive language identification) with an 89.726% F1-score.
arXiv Detail & Related papers (2020-10-05T10:25:29Z)
- Meta-Learning with Context-Agnostic Initialisations [86.47040878540139]
We introduce a context-adversarial component into the meta-learning process.
This produces an initialisation for fine-tuning to target which is context-agnostic and task-generalised.
We evaluate our approach on three commonly used meta-learning algorithms and two problems.
arXiv Detail & Related papers (2020-07-29T08:08:38Z)
- GUIR at SemEval-2020 Task 12: Domain-Tuned Contextualized Models for Offensive Language Detection [27.45642971636561]
The OffensEval 2020 task includes three English sub-tasks: identifying the presence of offensive language (Sub-task A), identifying the presence of a target in offensive language (Sub-task B), and identifying the categories of the target (Sub-task C).
Our submissions achieve F1 scores of 91.7% in Sub-task A, 66.5% in Sub-task B, and 63.2% in Sub-task C.
arXiv Detail & Related papers (2020-07-28T20:45:43Z)
- Boundary-assisted Region Proposal Networks for Nucleus Segmentation [89.69059532088129]
Machine learning models cannot perform well because of the large number of crowded nuclei.
We devise a Boundary-assisted Region Proposal Network (BRP-Net) that achieves robust instance-level nucleus segmentation.
arXiv Detail & Related papers (2020-06-04T08:26:38Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)