COLD: A Benchmark for Chinese Offensive Language Detection
- URL: http://arxiv.org/abs/2201.06025v1
- Date: Sun, 16 Jan 2022 11:47:23 GMT
- Title: COLD: A Benchmark for Chinese Offensive Language Detection
- Authors: Jiawen Deng, Jingyan Zhou, Hao Sun, Fei Mi, Minlie Huang
- Abstract summary: We collect COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
- Score: 54.60909500459201
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Offensive language detection and prevention is becoming increasingly critical for
maintaining a healthy social platform and the safe deployment of language
models. Despite plentiful research on the toxic and offensive language problem in
NLP, existing studies mainly focus on English, while few studies involve
Chinese due to the limitation of resources. To facilitate Chinese offensive
language detection and model evaluation, we collect COLDataset, a Chinese
offensive language dataset containing 37k annotated sentences. With this
high-quality dataset, we provide a strong baseline classifier, COLDetector,
with 81% accuracy for offensive language detection. Furthermore, we also
utilize the proposed COLDetector to study output offensiveness of
popular Chinese language models (CDialGPT and CPM). We find that (1) CPM tends
to generate more offensive output than CDialGPT, and (2) certain types of
prompts, like anti-bias sentences, can trigger offensive outputs more
easily. Altogether, our resources and analyses are intended to help detoxify the
Chinese online communities and evaluate the safety performance of generative
language models. Disclaimer: The paper contains example data that may be
considered profane, vulgar, or offensive.
Related papers
- ToxiCloakCN: Evaluating Robustness of Offensive Language Detection in Chinese with Cloaking Perturbations [6.360597788845826]
This study examines the limitations of state-of-the-art large language models (LLMs) in identifying offensive content within systematically perturbed data.
Our work highlights the urgent need for more advanced techniques in offensive language detection to combat the evolving tactics used to evade detection mechanisms.
arXiv Detail & Related papers (2024-06-18T02:44:56Z)
- Zero-shot Cross-lingual Stance Detection via Adversarial Language Adaptation [7.242609314791262]
This paper introduces a novel approach to zero-shot cross-lingual stance detection, Multilingual Translation-Augmented BERT (MTAB).
Our technique employs translation augmentation to improve zero-shot performance and pairs it with adversarial learning to further boost model efficacy.
We demonstrate the effectiveness of our proposed approach, showcasing improved results in comparison to a strong baseline model as well as ablated versions of our model.
arXiv Detail & Related papers (2024-04-22T16:56:43Z)
- From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models [10.807067327137855]
As language models embrace multilingual capabilities, it's crucial our safety measures keep pace.
In the absence of sufficient annotated datasets across languages, we employ translated data to evaluate and enhance our mitigation techniques.
This allows us to examine the effects of translation quality and the cross-lingual transfer on toxicity mitigation.
arXiv Detail & Related papers (2024-03-06T17:51:43Z)
- Vicinal Risk Minimization for Few-Shot Cross-lingual Transfer in Abusive Language Detection [19.399281609371258]
Cross-lingual transfer learning from high-resource to medium and low-resource languages has shown encouraging results.
We resort to data augmentation and continual pre-training for domain adaptation to improve cross-lingual abusive language detection.
arXiv Detail & Related papers (2023-11-03T16:51:07Z)
- NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages [54.808217147579036]
We conduct a case study on Indonesian local languages.
We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets.
Our findings demonstrate that datasets generated through paragraph writing by native speakers exhibit superior quality in terms of lexical diversity and cultural content.
arXiv Detail & Related papers (2023-09-19T14:42:33Z)
- Expanding Scope: Adapting English Adversarial Attacks to Chinese [11.032727439758661]
This paper investigates how to adapt SOTA adversarial attack algorithms in English to the Chinese language.
Our experiments show that attack methods previously applied to English NLP can generate high-quality adversarial examples in Chinese.
In addition, we demonstrate that the generated adversarial examples can achieve high fluency and semantic consistency.
arXiv Detail & Related papers (2023-06-08T02:07:49Z)
- No Language Left Behind: Scaling Human-Centered Machine Translation [69.28110770760506]
We create datasets and models aimed at narrowing the performance gap between low and high-resource languages.
We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks.
Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art.
arXiv Detail & Related papers (2022-07-11T07:33:36Z)
- Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to solve a low-resource, real-world challenge: de-identification of code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
arXiv Detail & Related papers (2022-04-10T21:46:52Z)
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.