Comparative Analysis of Machine Learning and Deep Learning Algorithms for Detection of Online Hate Speech
- URL: http://arxiv.org/abs/2108.01063v1
- Date: Fri, 23 Apr 2021 04:19:15 GMT
- Title: Comparative Analysis of Machine Learning and Deep Learning Algorithms for Detection of Online Hate Speech
- Authors: Tashvik Dhamija, Anjum, Rahul Katarya
- Abstract summary: Several attempts have been made to classify hate speech using machine learning, but the state-of-the-art models are not robust enough for practical applications.
In this paper, we explored various feature engineering techniques ranging from different embeddings to conventional NLP algorithms.
We conclude that BERT-based embeddings give the most useful features for this problem and have the potential to be developed into a practical, robust model.
- Score: 5.543220407902113
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the age of social media, users are increasingly exposed to online hate speech. Several attempts have been made to classify hate speech using machine learning, but the state-of-the-art models are not robust enough for practical applications, which is attributed to the use of primitive NLP feature engineering techniques. In this paper, we explored various feature engineering techniques, ranging from different embeddings to conventional NLP algorithms, and also experimented with combinations of different features. From our experimentation, we found that RoBERTa (Robustly Optimized BERT Approach) sentence embeddings classified using decision trees give the best result, an F1 score of 0.9998. We conclude that BERT-based embeddings provide the most useful features for this problem and have the potential to be developed into a practical, robust model.
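The paper does not ship code, but the winning pipeline, RoBERTa sentence embeddings classified by a decision tree, can be sketched briefly. In the sketch below, the checkpoint name ("roberta-base"), the mean-pooling step, and the toy texts and labels are illustrative assumptions rather than details reported in the paper:

```python
# Minimal sketch of the reported pipeline: RoBERTa sentence embeddings
# classified with a decision tree. Checkpoint, pooling, and data are
# placeholder assumptions, not the paper's exact setup.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

def embed(texts):
    """Mean-pool the final hidden states into one vector per sentence."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state       # (B, T, H)
    mask = enc["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

texts = ["I hate you", "have a lovely day",
         "you people disgust me", "what a kind gesture"]  # toy data
labels = [1, 0, 1, 0]                                     # 1 = hate speech

X_train, X_test, y_train, y_test = train_test_split(
    embed(texts), labels, test_size=0.5, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("F1:", f1_score(y_test, clf.predict(X_test)))
```

A real reproduction would fit the tree on a labelled hate-speech corpus and tune the tree hyperparameters; the 0.9998 F1 is the paper's reported figure, not something this toy sketch would achieve.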
Related papers
- Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a new framework based on large language models (LLMs) and decision tree reasoning (OCTree).
Our key idea is to leverage LLMs' reasoning capabilities to find good feature generation rules without manually specifying the search space.
Our empirical results demonstrate that this simple framework consistently enhances the performance of various prediction models.
arXiv Detail & Related papers (2024-06-12T08:31:34Z)
- Comparative Analysis of Libraries for the Sentimental Analysis [0.0]
The main goal of this study is to provide a comparative analysis of libraries using machine learning methods.
Five Python and R libraries, NLTK, TextBlob, VADER, Transformers (pretrained GPT and BERT), and Tidytext, will be used in the study to apply sentiment analysis techniques.
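As a rough illustration of what such a library comparison involves, the snippet below scores one invented sentence with two of the listed Python libraries, VADER (via NLTK) and TextBlob:

```python
# Sketch: comparing sentiment scores from two of the libraries named
# above. Requires `pip install nltk textblob`; the sentence is invented.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
from textblob import TextBlob

nltk.download("vader_lexicon", quiet=True)  # VADER's rule lexicon

text = "The service was slow, but the food was absolutely wonderful."

vader_scores = SentimentIntensityAnalyzer().polarity_scores(text)
blob_sentiment = TextBlob(text).sentiment

print("VADER compound:", vader_scores["compound"])    # lexicon/rule-based
print("TextBlob polarity:", blob_sentiment.polarity)  # pattern-based
```

A full comparison, as the study describes, would run each library over a labelled corpus and compare accuracy rather than inspecting single sentences.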
arXiv Detail & Related papers (2023-07-26T17:21:53Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- A Context-Sensitive Word Embedding Approach for The Detection of Troll Tweets [0.0]
We develop and evaluate a set of model architectures for the automatic detection of troll tweets.
The BERT and ELMo embedding methods performed better than the GloVe method.
CNN and GRU encoders performed similarly in terms of F1 score and AUC.
The best-performing method was found to be an ELMo-based architecture that employed a GRU classifier, with an AUC score of 0.929.
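As a rough sketch of the GRU-classifier side of that architecture, the minimal Keras model below stands in for the paper's setup; a trainable embedding layer replaces the ELMo embeddings, and the vocabulary, sequence length, and layer sizes are arbitrary assumptions:

```python
# Sketch: GRU-based binary classifier of the kind described above.
# A trainable Embedding layer stands in for ELMo; sizes are arbitrary.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(50,), dtype="int32"),       # padded token ids
    tf.keras.layers.Embedding(input_dim=20000, output_dim=128),
    tf.keras.layers.GRU(64),                          # sequence encoder
    tf.keras.layers.Dense(1, activation="sigmoid"),   # troll / not troll
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
model.summary()
```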
arXiv Detail & Related papers (2022-07-17T17:12:16Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate its effectiveness, showing better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Comparison Analysis of Traditional Machine Learning and Deep Learning Techniques for Data and Image Classification [62.997667081978825]
The purpose of the study is to analyse and compare the most common machine learning and deep learning techniques used for computer vision 2D object classification tasks.
Firstly, we will present the theoretical background of the Bag of Visual Words model and Deep Convolutional Neural Networks (DCNN).
Secondly, we will implement a Bag of Visual Words model and the VGG16 CNN architecture.
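As a hedged sketch of the deep-learning side of that comparison, the snippet below loads a pretrained VGG16 as a fixed feature extractor; the random input image is a placeholder:

```python
# Sketch: pretrained VGG16 as a feature extractor, the DCNN side of the
# Bag-of-Visual-Words vs. deep-learning comparison described above.
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# include_top=False drops the classifier head, leaving conv features;
# pooling="avg" collapses them into one 512-dim vector per image.
backbone = VGG16(weights="imagenet", include_top=False, pooling="avg")

image = np.random.rand(1, 224, 224, 3) * 255.0  # placeholder image
features = backbone.predict(preprocess_input(image))
print(features.shape)  # (1, 512)
```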
arXiv Detail & Related papers (2022-04-11T11:34:43Z)
- Nearest neighbour approaches for Emotion Detection in Tweets [1.7581155313656314]
We propose an approach using weighted $k$-Nearest Neighbours (kNN), a simple, easy-to-implement, and explainable machine learning model.
In particular, we apply the weighted kNN model to the shared emotion detection task in tweets from SemEval-2018.
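A weighted kNN text classifier of this kind is a few lines in scikit-learn; the sketch below runs one over TF-IDF features on invented tweets (the SemEval-2018 data itself is not bundled here):

```python
# Sketch: weighted k-nearest-neighbours emotion classifier as described
# above. Tweets and labels are invented placeholders, not SemEval data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

tweets = ["so happy today!", "this is infuriating",
          "what a lovely surprise", "I am furious right now"]
labels = ["joy", "anger", "joy", "anger"]

# weights="distance" makes nearer neighbours count more: weighted kNN.
model = make_pipeline(
    TfidfVectorizer(),
    KNeighborsClassifier(n_neighbors=3, weights="distance"))
model.fit(tweets, labels)
print(model.predict(["feeling really happy"]))
```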
arXiv Detail & Related papers (2021-07-08T13:00:06Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model under test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- Comparing BERT against traditional machine learning text classification [0.0]
The BERT model has emerged as a popular state-of-the-art machine learning model in recent years.
The purpose of this work is to add empirical evidence supporting or refuting the use of BERT as a default on NLP tasks.
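The traditional baseline in such a comparison is typically a bag-of-words linear model; below is a minimal sketch with toy data and arbitrary hyperparameters (a fine-tuned BERT would then be evaluated on the same split):

```python
# Sketch: the "traditional machine learning" side of the comparison
# above, TF-IDF features with logistic regression. Data is a toy
# placeholder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great movie", "terrible plot", "loved it", "waste of time"]
labels = [1, 0, 1, 0]

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression())
baseline.fit(texts, labels)
print(baseline.predict(["really great plot"]))
```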
arXiv Detail & Related papers (2020-05-26T20:14:39Z)
- Leveraging End-to-End Speech Recognition with Neural Architecture Search [0.0]
We show that a large improvement in the accuracy of deep speech models can be achieved with effective Neural Architecture Optimization.
Our method achieves a test error of 7% Word Error Rate (WER) on the LibriSpeech corpus and 13% Phone Error Rate (PER) on the TIMIT corpus, on par with state-of-the-art results.
arXiv Detail & Related papers (2019-12-11T08:15:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.