A Unified Multi-Task Learning Architecture for Hate Detection Leveraging User-Based Information
- URL: http://arxiv.org/abs/2411.06855v2
- Date: Tue, 12 Nov 2024 16:03:24 GMT
- Title: A Unified Multi-Task Learning Architecture for Hate Detection Leveraging User-Based Information
- Authors: Prashant Kapil, Asif Ekbal,
- Abstract summary: Hate speech, offensive language, aggression, racism, sexism, and other abusive language are common phenomena in social media.
There is a need for Artificial Intelligence(AI)based intervention which can filter hate content at scale.
This paper introduces a unique model that improves hate speech identification for the English language by utilising intra-user and inter-user-based information.
- Score: 23.017068553977982
- License:
- Abstract: Hate speech, offensive language, aggression, racism, sexism, and other abusive language are common phenomena in social media. There is a need for Artificial Intelligence(AI)based intervention which can filter hate content at scale. Most existing hate speech detection solutions have utilized the features by treating each post as an isolated input instance for the classification. This paper addresses this issue by introducing a unique model that improves hate speech identification for the English language by utilising intra-user and inter-user-based information. The experiment is conducted over single-task learning (STL) and multi-task learning (MTL) paradigms that use deep neural networks, such as convolutional neural networks (CNN), gated recurrent unit (GRU), bidirectional encoder representations from the transformer (BERT), and A Lite BERT (ALBERT). We use three benchmark datasets and conclude that combining certain user features with textual features gives significant improvements in macro-F1 and weighted-F1.
Related papers
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
arXiv Detail & Related papers (2024-11-11T16:33:25Z) - Hierarchical Sentiment Analysis Framework for Hate Speech Detection: Implementing Binary and Multiclass Classification Strategy [0.0]
We propose a new multitask model integrated with shared emotional representations to detect hate speech across the English language.
We conclude that utilizing sentiment analysis and a Transformer-based trained model considerably improves hate speech detection across multiple datasets.
arXiv Detail & Related papers (2024-11-03T04:11:33Z) - Explaining Interactions Between Text Spans [50.70253702800355]
Reasoning over spans of tokens from different parts of the input is essential for natural language understanding.
We introduce SpanEx, a dataset of human span interaction explanations for two NLU tasks: NLI and FC.
We then investigate the decision-making processes of multiple fine-tuned large language models in terms of the employed connections between spans.
arXiv Detail & Related papers (2023-10-20T13:52:37Z) - Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts [91.3755431537592]
The massive collection of user posts across social media platforms is primarily untapped for artificial intelligence (AI) use cases.
Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding.
This study demonstrates that the applied results of unsupervised analysis allow a computer to predict either negative, positive, or neutral user sentiment towards plastic surgery.
arXiv Detail & Related papers (2023-07-05T20:16:20Z) - Hate Speech and Offensive Language Detection using an Emotion-aware
Shared Encoder [1.8734449181723825]
Existing works on hate speech and offensive language detection produce promising results based on pre-trained transformer models.
This paper addresses a multi-task joint learning approach which combines external emotional features extracted from another corpora.
Our findings demonstrate that emotional knowledge helps to more reliably identify hate speech and offensive language across datasets.
arXiv Detail & Related papers (2023-02-17T09:31:06Z) - Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z) - Towards Language Modelling in the Speech Domain Using Sub-word
Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z) - Offensive Language and Hate Speech Detection with Deep Learning and
Transfer Learning [1.77356577919977]
We propose an approach to automatically classify tweets into three classes: Hate, offensive and Neither.
We create a class module which contains main functionality including text classification, sentiment checking and text data augmentation.
arXiv Detail & Related papers (2021-08-06T20:59:47Z) - Leveraging Multi-domain, Heterogeneous Data using Deep Multitask
Learning for Hate Speech Detection [21.410160004193916]
We propose a Convolution Neural Network based multi-task learning models (MTLs)footnotecode to leverage information from multiple sources.
Empirical analysis performed on three benchmark datasets shows the efficacy of the proposed approach.
arXiv Detail & Related papers (2021-03-23T09:31:01Z) - A study of text representations in Hate Speech Detection [0.0]
Current EU and US legislation against hateful language has led to automatic tools being a necessary component of the Hate Speech detection task and pipeline.
In this study, we examine the performance of several, diverse text representation techniques paired with multiple classification algorithms, on the automatic Hate Speech detection task.
arXiv Detail & Related papers (2021-02-08T20:39:17Z) - NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching
language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.