Leveraging Multi-domain, Heterogeneous Data using Deep Multitask
Learning for Hate Speech Detection
- URL: http://arxiv.org/abs/2103.12412v1
- Date: Tue, 23 Mar 2021 09:31:01 GMT
- Authors: Prashant Kapil, Asif Ekbal
- Abstract summary: We propose Convolutional Neural Network based multi-task learning models (MTLs) to leverage information from multiple sources.
Empirical analysis performed on three benchmark datasets shows the efficacy of the proposed approach.
- Score: 21.410160004193916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the exponential rise in user-generated web content on social media, the
proliferation of abusive language towards individuals or groups across
different sections of the internet is also rapidly increasing. It is very
challenging for human moderators to identify offensive content and filter
it out. Deep neural networks have shown promise, with reasonable accuracy, for
hate speech detection and allied applications. However, such classifiers are
heavily dependent on the size and quality of the training data, and a
high-quality, large dataset is not easy to obtain. Moreover, the existing
datasets that have emerged in recent times were not created following the same
annotation guidelines and are often concerned with different types and
sub-types of hate. To address this data sparsity problem, and to obtain
more globally representative features, we propose Convolutional Neural Network
(CNN) based multi-task learning models (MTLs)\footnote{code is available at
https://github.com/imprasshant/STL-MTL} to leverage information from multiple
sources. Empirical analysis performed on three benchmark datasets shows the
efficacy of the proposed approach, with significant improvements in accuracy
and F-score, obtaining state-of-the-art performance with respect to existing
systems.
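The core idea of the abstract, hard parameter sharing across datasets, can be sketched as a shared CNN text encoder feeding one classification head per source dataset. The sketch below is a minimal NumPy forward pass; all names, dimensions, and the two-task setup are illustrative assumptions, not the paper's actual architecture or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
VOCAB, EMB, FILTERS, WIDTH, SEQ = 100, 16, 8, 3, 10
NUM_CLASSES = {"dataset_a": 2, "dataset_b": 3}  # one head per source dataset

# Shared parameters (hard parameter sharing across tasks).
embedding = rng.normal(size=(VOCAB, EMB))
conv_w = rng.normal(size=(FILTERS, WIDTH, EMB))
# Task-specific classification heads.
heads = {t: rng.normal(size=(FILTERS, c)) for t, c in NUM_CLASSES.items()}

def forward(token_ids, task):
    """Shared CNN encoder followed by a task-specific softmax head."""
    x = embedding[token_ids]                      # (SEQ, EMB)
    # 1-D convolution over time: one feature map per filter.
    conv = np.stack([
        [np.sum(x[i:i + WIDTH] * f) for i in range(SEQ - WIDTH + 1)]
        for f in conv_w
    ])                                            # (FILTERS, SEQ-WIDTH+1)
    pooled = np.maximum(conv, 0).max(axis=1)      # ReLU + max-over-time pooling
    logits = pooled @ heads[task]                 # (num_classes,)
    e = np.exp(logits - logits.max())
    return e / e.sum()                            # softmax probabilities

tokens = rng.integers(0, VOCAB, size=SEQ)
probs_a = forward(tokens, "dataset_a")
probs_b = forward(tokens, "dataset_b")
print(probs_a.shape, probs_b.shape)
```

During multi-task training, batches from each dataset would update the shared embedding and convolution weights while only that dataset's head is updated, which is how information from multiple heterogeneous sources is pooled into the shared representation.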
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z) - Mutual Information Learned Classifiers: an Information-theoretic
Viewpoint of Training Deep Learning Classification Systems [9.660129425150926]
Cross-entropy loss can easily lead to models that exhibit severe overfitting behavior.
In this paper, we prove that the existing cross entropy loss minimization for training DNN classifiers essentially learns the conditional entropy of the underlying data distribution.
We propose a mutual information learning framework where we train DNN classifiers via learning the mutual information between the label and input.
arXiv Detail & Related papers (2022-10-03T15:09:19Z) - Detect Hate Speech in Unseen Domains using Multi-Task Learning: A Case
Study of Political Public Figures [7.52579126252489]
We propose a new Multi-task Learning (MTL) pipeline that trains simultaneously across multiple hate speech datasets.
We show strong results when examining generalization error in train-test splits and substantial improvements when predicting on previously unseen datasets.
We also assemble a novel dataset, dubbed PubFigs, focusing on the problematic speech of American Public Political Figures.
arXiv Detail & Related papers (2022-08-22T21:13:38Z) - Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z) - Character-level HyperNetworks for Hate Speech Detection [3.50640918825436]
Automated methods for hate speech detection typically employ state-of-the-art deep learning (DL)-based text classifiers.
We present HyperNetworks for hate speech detection, a special class of DL networks whose weights are regulated by a small-scale auxiliary network.
We achieve performance comparable to or better than state-of-the-art language models, which are pre-trained and orders of magnitude larger.
arXiv Detail & Related papers (2021-11-11T17:48:31Z) - Improving Classifier Training Efficiency for Automatic Cyberbullying
Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z) - A study of text representations in Hate Speech Detection [0.0]
Current EU and US legislation against hateful language has led to automatic tools being a necessary component of the Hate Speech detection task and pipeline.
In this study, we examine the performance of several, diverse text representation techniques paired with multiple classification algorithms, on the automatic Hate Speech detection task.
arXiv Detail & Related papers (2021-02-08T20:39:17Z) - Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG)
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z) - Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.