HausaNLP at SemEval-2023 Task 10: Transfer Learning, Synthetic Data and
Side-Information for Multi-Level Sexism Classification
- URL: http://arxiv.org/abs/2305.00076v1
- Date: Fri, 28 Apr 2023 20:03:46 GMT
- Title: HausaNLP at SemEval-2023 Task 10: Transfer Learning, Synthetic Data and
Side-Information for Multi-Level Sexism Classification
- Authors: Saminu Mohammad Aliyu, Idris Abdulmumin, Shamsuddeen Hassan Muhammad,
Ibrahim Said Ahmad, Saheed Abdullahi Salahudeen, Aliyu Yusuf, Falalu Ibrahim
Lawan
- Abstract summary: We present the findings of our participation in SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS).
We investigated the effects of transferring two language models, XLM-T (sentiment classification) and HateBERT (same domain, Reddit), for multi-level classification into Sexist or not Sexist.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the findings of our participation in SemEval-2023 Task 10:
Explainable Detection of Online Sexism (EDOS), a shared task on offensive
language (sexism) detection in English Gab and Reddit datasets. We investigated
the effects of transferring two language models, XLM-T (sentiment
classification) and HateBERT (same domain, Reddit), for multi-level
classification into Sexist or not Sexist, and for the subsequent
sub-classifications of the sexist data. We also used synthetic classification of
an unlabelled dataset and intermediary class information to maximize the
performance of our models. We submitted a system for Task A, which ranked 49th
with an F1-score of 0.82. This result proved competitive, underperforming the
best system by only 0.052 F1-score.
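The synthetic-classification step described in the abstract can be sketched as a confidence-thresholded pseudo-labelling loop: a classifier already fine-tuned on the labelled EDOS data labels the unlabelled pool, and only high-confidence predictions are kept as extra training examples. This is a minimal sketch under that assumption, not the authors' actual code; `toy_predict_proba` is a hypothetical stand-in for a fine-tuned XLM-T or HateBERT classifier's probability output.

```python
# Pseudo-labelling sketch: keep only high-confidence model predictions
# on unlabelled text as synthetic training data.

def pseudo_label(texts, predict_proba, threshold=0.9):
    """Return (text, label) pairs whose top-class probability meets
    `threshold`; low-confidence texts are discarded."""
    synthetic = []
    for text in texts:
        probs = predict_proba(text)  # e.g. {"sexist": 0.95, "not sexist": 0.05}
        label, p = max(probs.items(), key=lambda kv: kv[1])
        if p >= threshold:
            synthetic.append((text, label))
    return synthetic

# Hypothetical stand-in for a fine-tuned classifier's probabilities.
def toy_predict_proba(text):
    p = 0.95 if "insult" in text else 0.55
    return {"sexist": p, "not sexist": 1.0 - p}

labelled = pseudo_label(["an insult here", "neutral text"], toy_predict_proba)
# Only the high-confidence example survives: [("an insult here", "sexist")]
```

In practice the threshold trades label noise against the amount of synthetic data added; the value 0.9 here is illustrative, not taken from the paper.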
Related papers
- GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.
GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z) - GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z) - ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
arXiv Detail & Related papers (2024-04-30T17:06:20Z) - The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide-range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z) - LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for
Sexism Detection and Classification [0.0]
SemEval-2023 Task 10 on Explainable Detection of Online Sexism aims at increasing the explainability of sexism detection.
Our system is based on further domain-adaptive pre-training.
In experiments, multi-task learning performs on par with standard fine-tuning for sexism detection.
arXiv Detail & Related papers (2023-06-08T09:56:57Z) - Attention at SemEval-2023 Task 10: Explainable Detection of Online
Sexism (EDOS) [15.52876591707497]
We have worked on interpretability, trust, and understanding of the decisions made by models in the form of classification tasks.
The first task consists of determining Binary Sexism Detection.
The second task describes the Category of Sexism.
The third task describes a more Fine-grained Category of Sexism.
arXiv Detail & Related papers (2023-04-10T14:24:52Z) - SSS at SemEval-2023 Task 10: Explainable Detection of Online Sexism
using Majority Voted Fine-Tuned Transformers [0.0]
This paper describes our submission to Task 10 at SemEval 2023-Explainable Detection of Online Sexism (EDOS)
The recent rise of social media platforms has been accompanied by a disproportionate increase in the sexism experienced by women on them.
Our approach consists of experimenting with and fine-tuning BERT-based models, and using a majority-voting ensemble that outperforms the individual baseline model scores.
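The majority-voting ensemble mentioned in this entry is a standard technique; a minimal sketch is shown below. The label strings and the three-model setup are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """predictions_per_model: one label list per model, all aligned on
    the same examples. Returns the per-example majority label (ties
    broken by first-seen order, per Counter.most_common)."""
    votes = zip(*predictions_per_model)
    return [Counter(v).most_common(1)[0][0] for v in votes]

preds = [
    ["sexist", "not sexist", "sexist"],      # model A
    ["sexist", "sexist", "not sexist"],      # model B
    ["sexist", "not sexist", "not sexist"],  # model C
]
final = majority_vote(preds)  # ["sexist", "not sexist", "not sexist"]
```

An odd number of voters avoids two-way ties on binary labels, which is one reason three-model ensembles are a common choice.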
arXiv Detail & Related papers (2023-04-07T07:24:32Z) - Towards Understanding Gender-Seniority Compound Bias in Natural Language
Generation [64.65911758042914]
We investigate how seniority impacts the degree of gender bias exhibited in pretrained neural generation models.
Our results show that GPT-2 amplifies bias by considering women as junior and men as senior more often than the ground truth in both domains.
These results suggest that NLP applications built using GPT-2 may harm women in professional capacities.
arXiv Detail & Related papers (2022-05-19T20:05:02Z) - Sexism Prediction in Spanish and English Tweets Using Monolingual and
Multilingual BERT and Ensemble Models [0.0]
This work proposes a system that uses multilingual and monolingual BERT, data-point translation, and ensemble strategies for sexism identification and classification in English and Spanish.
arXiv Detail & Related papers (2021-11-08T15:01:06Z) - Improving Gender Fairness of Pre-Trained Language Models without
Catastrophic Forgetting [88.83117372793737]
Forgetting information in the original training data may damage the model's downstream performance by a large margin.
We propose GEnder Equality Prompt (GEEP) to improve gender fairness of pre-trained models with less forgetting.
arXiv Detail & Related papers (2021-10-11T15:52:16Z) - Automatic Sexism Detection with Multilingual Transformer Models [0.0]
This paper presents the contribution of the AIT_FHSTP team at the EXIST 2021 benchmark for two sEXism Identification in Social neTworks tasks.
To solve the tasks we applied two multilingual transformer models, one based on multilingual BERT and one based on XLM-R.
Our approach uses two different strategies to adapt the transformers to the detection of sexist content: first, unsupervised pre-training with additional data and second, supervised fine-tuning with additional and augmented data.
For both tasks, our best model is XLM-R with unsupervised pre-training on the EXIST data and additional datasets.
arXiv Detail & Related papers (2021-06-09T08:45:51Z)