Automatic Detection of Sexist Statements Commonly Used at the Workplace
- URL: http://arxiv.org/abs/2007.04181v1
- Date: Wed, 8 Jul 2020 15:14:29 GMT
- Title: Automatic Detection of Sexist Statements Commonly Used at the Workplace
- Authors: Dylan Grosz, Patricia Conde-Cespedes
- Abstract summary: We present a dataset of sexist statements that are more likely to be said in the workplace.
We also present a deep learning model that can achieve state-of-the-art results.
- Score: 0.9790236766474201
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting hate speech in the workplace is a unique classification task, as
the underlying social context implies a subtler version of conventional hate
speech. Applications of a state-of-the-art workplace sexism detection
model include aids for Human Resources departments, AI chatbots, and sentiment
analysis. Most existing hate speech detection methods, although robust and
accurate, focus on hate speech found on social media, specifically Twitter. The
context of social media is much more anonymous than that of the workplace, so it
tends to lend itself to more aggressive and "hostile" versions of sexism.
Therefore, datasets with large amounts of "hostile" sexism have a slightly
easier detection task, since "hostile" sexist statements can hinge on a couple of
words that, regardless of context, tip the model off that a statement is
sexist. In this paper we present a dataset of sexist statements that are more
likely to be said in the workplace as well as a deep learning model that can
achieve state-of-the-art results. Previous research has created
state-of-the-art models to distinguish "hostile" and "benevolent" sexism based
simply on aggregated Twitter data. Our deep learning methods, initialized with
GloVe or random word embeddings, use LSTMs with attention mechanisms to
outperform those models on a more diverse, filtered dataset that is more
targeted towards workplace sexism, leading to an F1 score of 0.88.
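As a minimal sketch of the attention mechanism the abstract describes (the function and variable names here are illustrative, not the paper's actual implementation), attention over a sequence of LSTM hidden states can be computed as a softmax-weighted sum, collapsing the sequence into a single vector for classification:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array of scores
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(hidden_states, w):
    """Collapse a (seq_len, hidden_dim) matrix of LSTM outputs into a
    single (hidden_dim,) context vector via attention weights w."""
    scores = hidden_states @ w    # (seq_len,) alignment scores
    alpha = softmax(scores)       # attention distribution over timesteps
    return alpha @ hidden_states  # weighted sum of hidden states

# toy example: 4 timesteps, hidden size 3
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))   # stand-in for LSTM outputs
w = rng.normal(size=3)        # learned attention parameter
context = attention_pool(H, w)
print(context.shape)  # (3,)
```

In a full model, the context vector would be fed to a dense classification layer, and `w` would be learned jointly with the LSTM and the (GloVe-initialized or random) embeddings.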
Related papers
- Bilingual Sexism Classification: Fine-Tuned XLM-RoBERTa and GPT-3.5 Few-Shot Learning [0.8192907805418581]
This study aims to improve sexism identification in bilingual contexts (English and Spanish) by leveraging natural language processing models.
We fine-tuned the XLM-RoBERTa model and separately used GPT-3.5 with few-shot learning prompts to classify sexist content.
arXiv Detail & Related papers (2024-06-11T14:15:33Z)
- Anti-Sexism Alert System: Identification of Sexist Comments on Social Media Using AI Techniques [0.0]
Sexist comments that are publicly posted in social media (newspaper comments, social networks, etc.) usually obtain a lot of attention and become viral, with consequent damage to the persons involved.
In this paper, we introduce an anti-sexism alert system, based on natural language processing (NLP) and artificial intelligence (AI).
This system analyzes any public post, and decides if it could be considered a sexist comment or not.
arXiv Detail & Related papers (2023-11-28T19:48:46Z)
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
- Discrimination through Image Selection by Job Advertisers on Facebook [79.21648699199648]
We propose and investigate the prevalence of a new means for discrimination in job advertising.
It combines both targeting and delivery -- through the disproportionate representation or exclusion of people of certain demographics in job ad images.
We use the Facebook Ad Library to demonstrate the prevalence of this practice.
arXiv Detail & Related papers (2023-06-13T03:43:58Z)
- SemEval-2023 Task 10: Explainable Detection of Online Sexism [5.542286527528687]
We introduce SemEval-2023 Task 10 on the Explainable Detection of Online Sexism (EDOS).
We make three main contributions: i) a novel hierarchical taxonomy of sexist content, which includes granular vectors of sexism to aid explainability; ii) a new dataset of 20,000 social media comments with fine-grained labels, along with larger unlabelled datasets for model adaptation; and iii) baseline models as well as an analysis of the methods, results and errors for participant submissions to our task.
arXiv Detail & Related papers (2023-03-07T20:28:39Z)
- Auditing Gender Presentation Differences in Text-to-Image Models [54.16959473093973]
We study how gender is presented differently in text-to-image models.
By probing gender indicators in the input text, we quantify the frequency differences of presentation-centric attributes.
We propose an automatic method to estimate such differences.
arXiv Detail & Related papers (2023-02-07T18:52:22Z)
- SexWEs: Domain-Aware Word Embeddings via Cross-lingual Semantic Specialisation for Chinese Sexism Detection in Social Media [23.246615034191553]
We develop a cross-lingual domain-aware semantic specialisation system for sexism detection.
We leverage semantic resources for sexism from a high-resource language (English) to specialise pre-trained word vectors in the target language (Chinese) to inject domain knowledge.
Compared with other specialisation approaches and Chinese baseline word vectors, our SexWEs shows average score improvements of 0.033 and 0.064 in intrinsic and extrinsic evaluations, respectively.
arXiv Detail & Related papers (2022-11-15T19:00:20Z)
- Rumor Detection with Self-supervised Learning on Texts and Social Graph [101.94546286960642]
We propose contrastive self-supervised learning on heterogeneous information sources, so as to reveal their relations and characterize rumors better.
We term this framework Self-supervised Rumor Detection (SRD).
Extensive experiments on three real-world datasets validate the effectiveness of SRD for automatic rumor detection on social media.
arXiv Detail & Related papers (2022-04-19T12:10:03Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
- "Call me sexist, but...": Revisiting Sexism Detection Using Psychological Scales and Adversarial Samples [2.029924828197095]
We outline the different dimensions of sexism by grounding them in their implementation in psychological scales.
From the scales, we derive a codebook for sexism in social media, which we use to annotate existing and novel datasets.
Results indicate that current machine learning models pick up on a very narrow set of linguistic markers of sexism and do not generalize well to out-of-domain examples.
arXiv Detail & Related papers (2020-04-27T13:07:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.