Dynamically Refined Regularization for Improving Cross-corpora Hate
Speech Detection
- URL: http://arxiv.org/abs/2203.12536v1
- Date: Wed, 23 Mar 2022 16:58:10 GMT
- Title: Dynamically Refined Regularization for Improving Cross-corpora Hate
Speech Detection
- Authors: Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr
- Abstract summary: Hate speech classifiers exhibit substantial performance degradation when evaluated on datasets different from the source.
Previous work has attempted to mitigate this problem by regularizing specific terms from pre-defined static dictionaries.
We propose to automatically identify and reduce spurious correlations using attribution methods with dynamic refinement of the list of terms.
- Score: 30.462596705180534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hate speech classifiers exhibit substantial performance degradation when
evaluated on datasets different from the source. This is due to learning
spurious correlations between hate speech labels and words in the training
corpus that are not necessarily relevant to hateful language. Previous
work has attempted to mitigate this problem by regularizing specific terms from
pre-defined static dictionaries. While this has been demonstrated to improve
the generalizability of classifiers, the coverage of such methods is limited
and the dictionaries require regular manual updates from human experts. In this
paper, we propose to automatically identify and reduce spurious correlations
using attribution methods with dynamic refinement of the list of terms that
need to be regularized during training. Our approach is flexible and improves
the cross-corpora performance over previous work independently and in
combination with pre-defined dictionaries.
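The abstract includes no implementation; the following is a minimal PyTorch sketch of the general recipe under stated assumptions: a toy bag-of-embeddings classifier, gradient-times-input as the attribution method, and hypothetical hyperparameters TOP_K (size of the term list) and LAMBDA (penalty weight). The list of regularized terms is refreshed from attribution scores each epoch rather than taken from a static dictionary.

```python
# Sketch only, not the authors' code: dynamically refined regularization of
# high-attribution terms during training.
import torch
import torch.nn as nn

VOCAB, DIM, TOP_K, LAMBDA = 1000, 32, 20, 0.1   # hypothetical hyperparameters

class BowClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.out = nn.Linear(DIM, 2)

    def forward(self, ids):
        e = self.emb(ids)                        # (batch, seq, dim)
        return self.out(e.mean(dim=1)), e

model = BowClassifier()
opt = torch.optim.Adam(model.parameters())
regularize_ids = set()                           # dynamically refined term list

for epoch in range(3):
    ids = torch.randint(0, VOCAB, (16, 10))      # stand-in for real batches
    labels = torch.randint(0, 2, (16,))
    logits, emb = model(ids)
    # gradient-x-input attribution per token, summed over embedding dims
    grad, = torch.autograd.grad(logits.sum(), emb, create_graph=True)
    attr = (grad * emb).sum(-1)                  # (batch, seq)
    loss = nn.functional.cross_entropy(logits, labels)
    # penalize attribution magnitude of terms currently on the list
    mask = torch.tensor([[t in regularize_ids for t in row.tolist()] for row in ids])
    loss = loss + LAMBDA * (attr * mask).pow(2).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    # refine the list: terms with the highest accumulated |attribution|
    scores = torch.zeros(VOCAB)
    scores.scatter_add_(0, ids.flatten(), attr.detach().abs().flatten())
    regularize_ids = set(scores.topk(TOP_K).indices.tolist())
```

Any attribution method (e.g. integrated gradients) and any refresh schedule could be substituted here; the point is only that the regularized term list is recomputed during training instead of being fixed in advance.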
Related papers
- An Analysis of BPE Vocabulary Trimming in Neural Machine Translation [56.383793805299234]
Vocabulary trimming is a postprocessing step that replaces rare subwords with their component subwords.
We show that vocabulary trimming fails to improve performance and can even cause substantial degradation.
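As a rough illustration of the mechanism (assumed details, not the paper's code): each BPE subword was created by merging two components, so a trimmed subword can be re-segmented by recursively expanding it through its merge rule.

```python
# Hedged sketch of BPE vocabulary trimming: subwords rarer than a threshold
# are re-expanded into the two components recorded by their merge rule.
merges = {"lowest": ("low", "est"), "est": ("es", "t")}   # child -> components
freq = {"lowest": 3, "low": 900, "est": 40, "es": 500, "t": 1000}

def trim(token, threshold):
    """Recursively replace a rare subword with its component subwords."""
    if freq.get(token, 0) >= threshold or token not in merges:
        return [token]
    left, right = merges[token]
    return trim(left, threshold) + trim(right, threshold)

print(trim("lowest", 50))   # ['low', 'es', 't']
```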
arXiv Detail & Related papers (2024-03-30T15:29:49Z)
- Generalized Time Warping Invariant Dictionary Learning for Time Series Classification and Clustering [8.14208923345076]
Dynamic time warping (DTW) is commonly used to handle temporal delays, scaling, transformation, and many other kinds of temporal misalignment.
We propose a generalized time warping invariant dictionary learning algorithm in this paper.
The superiority of the proposed method in dictionary learning, classification, and clustering is validated on ten public datasets.
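The dictionary-learning algorithm itself is more involved than this summary conveys; the sketch below shows only the standard DTW distance it builds on, in NumPy.

```python
# Minimal DTW distance between two 1-D series (the alignment primitive the
# paper builds on; not the paper's dictionary-learning algorithm).
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance via the classic DP recursion."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw([0, 1, 2, 3], [0, 0, 1, 2, 2, 3]))  # 0.0: same shape, just warped
```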
arXiv Detail & Related papers (2023-06-30T14:18:13Z)
- Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis [69.07674653828565]
Machine learning models have a tendency to leverage spurious correlations that exist in the training set but may not hold true in general circumstances.
In this paper, we examine the implications of spurious correlations through a novel perspective called neighborhood analysis.
We propose a family of regularization methods, NFL (doN't Forget your Language), to mitigate spurious correlations in text classification.
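The summary gives few specifics, so the following is only a hedged guess at what a neighborhood probe could look like: embed each training example, inspect its k nearest neighbors, and surface tokens shared across a whole neighborhood, since a neighborhood held together by a single surface token hints at a spurious cue. All names here are illustrative.

```python
# Illustrative neighborhood probe (assumed, not the paper's NFL method).
import numpy as np

def neighborhood_report(embeddings, tokens, k=3):
    """For each example, list tokens shared with all k nearest neighbors."""
    X = np.asarray(embeddings)
    dists = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                 # ignore self-distance
    for i, row in enumerate(dists):
        nbrs = np.argsort(row)[:k]
        shared = set(tokens[i]).intersection(*(set(tokens[j]) for j in nbrs))
        print(f"example {i}: neighbors {nbrs.tolist()}, shared tokens {shared}")

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))                       # stand-in embeddings
toks = [["you", "people"], ["you", "all"], ["you", "there"], ["hello"], ["hi"]]
neighborhood_report(emb, toks)
```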
arXiv Detail & Related papers (2023-05-23T03:55:50Z)
- Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications.
We argue that mean representations alone cannot accurately capture such semantic variations.
We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
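A minimal sketch of the underlying intuition (assumed, not the authors' exact estimator): compare the full cohorts of contextualised embeddings of a target word across two corpora rather than only their means, since a new sense can change the distribution while leaving the mean almost unchanged.

```python
# Mean-only vs. cohort-level comparison of contextual embeddings (sketch).
import numpy as np

def mean_shift(A, B):
    return np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))

def cohort_shift(A, B):
    """Average cross-cohort distance minus average within-cohort spread."""
    cross = np.linalg.norm(A[:, None] - B[None, :], axis=-1).mean()
    within = (np.linalg.norm(A[:, None] - A[None, :], axis=-1).mean()
              + np.linalg.norm(B[:, None] - B[None, :], axis=-1).mean()) / 2
    return cross - within

rng = np.random.default_rng(1)
old = rng.normal(0.0, 1.0, size=(50, 16))                     # one dominant sense
new = np.concatenate([rng.normal(-3.0, 1.0, size=(25, 16)),
                      rng.normal(3.0, 1.0, size=(25, 16))])   # second sense appears
print(mean_shift(old, new))     # small: the means nearly coincide
print(cohort_shift(old, new))   # clearly positive: the cohorts differ
```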
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
- CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification [57.62886091828512]
We propose a new prefix-tuning method, Counterfactual Contrastive Prefix-tuning (CCPrefix), for many-class classification.
An instance-dependent soft prefix, derived from fact-counterfactual pairs in the label space, complements the language verbalizers.
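The fact-counterfactual construction of the prefix is not reproducible from this summary; the sketch below shows only where a soft prefix enters a model in generic prefix-tuning (PyTorch; shapes and names are illustrative).

```python
# Generic soft prefix-tuning skeleton; CCPrefix's instance-dependent prefix
# construction is NOT reproduced here.
import torch
import torch.nn as nn

class PrefixedEncoder(nn.Module):
    def __init__(self, vocab=1000, dim=64, prefix_len=8, n_classes=50):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.prefix = nn.Parameter(torch.randn(prefix_len, dim) * 0.02)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, ids):
        x = self.emb(ids)                               # (B, T, D)
        p = self.prefix.expand(ids.size(0), -1, -1)     # (B, P, D)
        h = self.encoder(torch.cat([p, x], dim=1))      # prepend soft prefix
        return self.head(h.mean(dim=1))

model = PrefixedEncoder()
logits = model(torch.randint(0, 1000, (4, 12)))
print(logits.shape)   # torch.Size([4, 50])
```

In actual prefix-tuning only the prefix (and possibly the head) would be trained while the backbone stays frozen; CCPrefix additionally makes the prefix instance-dependent.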
arXiv Detail & Related papers (2022-11-11T03:45:59Z)
- Dictionary-Assisted Supervised Contrastive Learning [0.0]
We introduce the dictionary-assisted supervised contrastive learning (DASCL) objective, allowing researchers to leverage specialized dictionaries.
The text is first keyword simplified: a common, fixed token replaces any word in the corpus that appears in the dictionary(ies) relevant to the concept of interest.
Combining DASCL with the cross-entropy loss improves classification performance in few-shot learning settings and social science applications.
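The keyword-simplification step described above lends itself to a short sketch (toy lexicon and token name assumed):

```python
# Keyword simplification: every corpus word found in the concept dictionary
# is replaced by one fixed token before contrastive fine-tuning.
dictionary = {"furious", "outraged", "angry"}        # toy anger lexicon
FIXED_TOKEN = "<anger>"

def keyword_simplify(text):
    return " ".join(FIXED_TOKEN if w.lower() in dictionary else w
                    for w in text.split())

print(keyword_simplify("The outraged crowd grew furious"))
# The <anger> crowd grew <anger>
```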
arXiv Detail & Related papers (2022-10-27T04:57:43Z)
- Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model [0.0]
We release contextual biasing lists to accompany the Earnings21 dataset.
We show results for shallow fusion contextual biasing applied to two different decoding algorithms.
We propose an alternate spelling prediction model that improves recall of rare words by 34.7% relative.
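The alternate-spelling model itself is not specified in this summary; the sketch below shows only shallow-fusion contextual biasing in general, where a decoding hypothesis receives a score bonus for words on the biasing list (LAMBDA is a hypothetical interpolation weight).

```python
# Shallow-fusion contextual biasing, in its simplest form.
LAMBDA = 2.0
biasing_list = {"earnings", "ebitda", "guidance"}   # e.g. drawn from Earnings21

def fused_score(asr_logprob, word):
    """Add a bonus to the ASR score when the word is on the biasing list."""
    bonus = LAMBDA if word in biasing_list else 0.0
    return asr_logprob + bonus

print(fused_score(-4.2, "ebitda"))    # -2.2: biased word is promoted
print(fused_score(-4.2, "stadium"))   # -4.2: unchanged
```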
arXiv Detail & Related papers (2022-09-02T19:30:16Z)
- Consistency Regularization for Cross-Lingual Fine-Tuning [61.08704789561351]
We propose to improve cross-lingual fine-tuning with consistency regularization.
Specifically, we use example consistency regularization to penalize the prediction sensitivity to four types of data augmentations.
Experimental results on the XTREME benchmark show that our method significantly improves cross-lingual fine-tuning across various tasks.
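Example consistency regularization is a standard construction; a minimal PyTorch sketch, with the augmentation stubbed out:

```python
# Penalize the model's sensitivity to an input augmentation with a
# symmetric KL divergence between the two predictive distributions.
import torch
import torch.nn.functional as F

def consistency_loss(logits_orig, logits_aug):
    """Symmetric KL between predictions on original and augmented inputs."""
    p = F.log_softmax(logits_orig, dim=-1)
    q = F.log_softmax(logits_aug, dim=-1)
    return 0.5 * (F.kl_div(q, p, log_target=True, reduction="batchmean")
                  + F.kl_div(p, q, log_target=True, reduction="batchmean"))

logits_orig = torch.randn(8, 3)
logits_aug = logits_orig + 0.1 * torch.randn(8, 3)   # stand-in augmentation
print(consistency_loss(logits_orig, logits_aug))
```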
arXiv Detail & Related papers (2021-06-15T15:35:44Z)
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
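A hedged sketch of the idea (assumed details): compose each word's output embedding from its characters, so the output layer can score any word and its parameter count does not depend on the training vocabulary.

```python
# Compositional output layer: word output embeddings are built from
# character embeddings, so any word can be scored.
import torch
import torch.nn as nn

class CompositionalOutput(nn.Module):
    def __init__(self, n_chars=128, dim=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, dim)
        self.compose = nn.GRU(dim, dim, batch_first=True)

    def word_embedding(self, word):
        ids = torch.tensor([[min(ord(c), 127) for c in word]])
        _, h = self.compose(self.char_emb(ids))
        return h[-1, 0]                        # (dim,) embedding for any word

    def logits(self, hidden, candidate_words):
        E = torch.stack([self.word_embedding(w) for w in candidate_words])
        return hidden @ E.T                    # score an arbitrary candidate set

layer = CompositionalOutput()
h = torch.randn(64)
print(layer.logits(h, ["hate", "love", "unseenword"]))   # works for any word
```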
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
- Text Classification with Few Examples using Controlled Generalization [58.971750512415134]
Current practice relies on pre-trained word embeddings to map words unseen in training to similar seen ones.
Our alternative begins with sparse pre-trained representations derived from unlabeled parsed corpora.
We show that a feed-forward network over these vectors is especially effective in low-data scenarios.
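The sparse representations themselves are derived from parsed corpora and are not reproducible from this summary; the sketch below illustrates only the classifier side, a small feed-forward network over fixed sparse feature vectors.

```python
# Feed-forward classifier over pre-computed sparse feature vectors (sketch;
# the feature dimension and layer sizes are illustrative).
import torch
import torch.nn as nn

ffn = nn.Sequential(
    nn.Linear(5000, 128), nn.ReLU(), nn.Dropout(0.3), nn.Linear(128, 2)
)
x = torch.zeros(4, 5000)                       # 4 documents, sparse features
x[torch.arange(4).repeat_interleave(3),
  torch.randint(0, 5000, (12,))] = 1.0         # a few active features per doc
print(ffn(x).shape)                            # torch.Size([4, 2])
```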
arXiv Detail & Related papers (2020-05-18T06:04:58Z)
- Fast and Robust Unsupervised Contextual Biasing for Speech Recognition [16.557586847398778]
We propose an alternative approach that does not entail an explicit contextual language model.
We derive the bias score for every word in the system vocabulary from the training corpus.
We show significant improvement in recognition accuracy when the relevant context is available.
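The exact scoring function is not given in this summary; the following is a hedged guess at one corpus-derived bias score, the smoothed log-ratio of a word's frequency in the context-relevant corpus versus the general training corpus.

```python
# Illustrative corpus-derived bias score (assumed formulation).
import math
from collections import Counter

general = Counter("the meeting is at noon the agenda is long".split())
context = Counter("the contoso meeting agenda contoso roadmap".split())

def bias_score(word, alpha=1.0):
    """Higher when a word is characteristic of the current context."""
    p_ctx = (context[word] + alpha) / (sum(context.values()) + alpha)
    p_gen = (general[word] + alpha) / (sum(general.values()) + alpha)
    return math.log(p_ctx / p_gen)

for w in ["contoso", "the", "noon"]:
    print(w, round(bias_score(w), 2))   # "contoso" scores highest
```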
arXiv Detail & Related papers (2020-05-04T17:29:59Z)