Probing Critical Learning Dynamics of PLMs for Hate Speech Detection
- URL: http://arxiv.org/abs/2402.02144v1
- Date: Sat, 3 Feb 2024 13:23:51 GMT
- Title: Probing Critical Learning Dynamics of PLMs for Hate Speech Detection
- Authors: Sarah Masud, Mohammad Aflah Khan, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty
- Abstract summary: Despite widespread adoption, there is a lack of research into how various critical aspects of pretrained language models affect their performance in hate speech detection.
We deep dive into comparing different pretrained models, evaluating their seed robustness, finetuning settings, and the impact of pretraining data collection time.
Our analysis reveals early peaks for downstream tasks during pretraining, the limited benefit of employing a more recent pretraining corpus, and the significance of specific layers during finetuning.
- Score: 39.970726250810635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite widespread adoption, there is a lack of research into how various
critical aspects of pretrained language models (PLMs) affect their performance
in hate speech detection. Through five research questions, our findings and
recommendations lay the groundwork for empirically investigating different
aspects of PLMs' use in hate speech detection. We deep dive into comparing
different pretrained models, evaluating their seed robustness, finetuning
settings, and the impact of pretraining data collection time. Our analysis
reveals early peaks for downstream tasks during pretraining, the limited
benefit of employing a more recent pretraining corpus, and the significance of
specific layers during finetuning. We further call into question the use of
domain-specific models and highlight the need for dynamic datasets for
benchmarking hate speech detection.
Related papers
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
(2024-03-11T16:22:41Z)
- LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection [10.014248704653]
This study investigates the effectiveness and adaptability of pre-trained and fine-tuned Large Language Models (LLMs) in identifying hate speech.
LLMs offer a huge advantage over the state-of-the-art even without pretraining.
We conclude with a vision for the future of hate speech detection, emphasizing cross-domain generalizability.
(2023-10-29T10:07:32Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
(2023-09-29T06:18:15Z)
- On the Challenges of Building Datasets for Hate Speech Detection [0.0]
We first analyze the issues surrounding hate speech detection through a data-centric lens.
We then outline a holistic framework to encapsulate the data creation pipeline across seven broad dimensions.
(2023-09-06T11:15:47Z)
- Analysing the Impact of Audio Quality on the Use of Naturalistic Long-Form Recordings for Infant-Directed Speech Research [62.997667081978825]
Modelling of early language acquisition aims to understand how infants bootstrap their language skills.
Recent developments have enabled the use of more naturalistic training data for computational models.
It is currently unclear how the sound quality could affect analyses and modelling experiments conducted on such data.
(2023-05-03T08:25:37Z)
- Comparative layer-wise analysis of self-supervised speech models [29.258085176788097]
We measure acoustic, phonetic, and word-level properties encoded in individual layers, using a lightweight analysis tool based on canonical correlation analysis (CCA).
We find that these properties evolve across layers differently depending on the model, and the variations relate to the choice of pre-training objective.
We discover that CCA trends provide reliable guidance to choose layers of interest for downstream tasks and that single-layer performance often matches or improves upon using all layers, suggesting implications for more efficient use of pre-trained models.
(2022-11-08T00:59:05Z)
- Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation [60.26511271597065]
Speech distortions are a long-standing problem that degrades the performance of speech processing models trained with supervision.
It is high time that we enhance the robustness of speech processing models to obtain good performance when encountering speech distortions.
(2022-03-30T07:25:52Z)
- Deep Learning for Hate Speech Detection: A Comparative Study [54.42226495344908]
We present here a large-scale empirical comparison of deep and shallow hate-speech detection methods.
Our goal is to illuminate progress in the area, and identify strengths and weaknesses in the current state-of-the-art.
In doing so we aim to provide guidance as to the use of hate-speech detection in practice, quantify the state-of-the-art, and identify future research directions.
(2022-02-19T03:48:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.