Toxic Language Detection in Social Media for Brazilian Portuguese: New
Dataset and Multilingual Analysis
- URL: http://arxiv.org/abs/2010.04543v1
- Date: Fri, 9 Oct 2020 13:05:19 GMT
- Title: Toxic Language Detection in Social Media for Brazilian Portuguese: New
Dataset and Multilingual Analysis
- Authors: João A. Leite and Diego F. Silva and Kalina Bontcheva and Carolina
Scarton
- Abstract summary: State-of-the-art BERT models were able to achieve a 76% macro-F1 score using monolingual data in the binary case.
We show that large-scale monolingual data is still needed to create more accurate models.
- Score: 4.251937086394346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hate speech and toxic comments are a common concern of social media platform
users. Although these comments are, fortunately, the minority in these
platforms, they are still capable of causing harm. Therefore, identifying these
comments is an important task for studying and preventing the proliferation of
toxicity in social media. Previous work on automatically detecting toxic
comments focuses mainly on English, with very little work on languages like
Brazilian Portuguese. In this paper, we propose a new large-scale dataset for
Brazilian Portuguese with tweets annotated as toxic or non-toxic, as well as
with different types of toxicity. We present our dataset collection and annotation
process, where we aimed to select candidates covering multiple demographic
groups. State-of-the-art BERT models were able to achieve a 76% macro-F1 score
using monolingual data in the binary case. We also show that large-scale
monolingual data is still needed to create more accurate models, despite recent
advances in multilingual approaches. An error analysis and experiments with
multi-label classification show the difficulty of classifying certain types of
toxic comments that appear less frequently in our data, and highlight the need
to develop models that are aware of different categories of toxicity.
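As a rough illustration of the binary experiment described in the abstract, the sketch below fine-tunes a BERT-style encoder on toxic/non-toxic tweet labels and reports macro-F1. It is a minimal sketch assuming the Hugging Face transformers and scikit-learn APIs; the Portuguese checkpoint name, the train.csv/test.csv files, and the hyperparameters are illustrative assumptions, not the paper's released configuration.

```python
# Minimal sketch: fine-tune a BERT encoder for binary toxicity detection
# and evaluate with macro-F1 (which weights the minority toxic class
# equally with the majority class). Checkpoint and file names are assumed.
import pandas as pd
import torch
from sklearn.metrics import f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "neuralmind/bert-base-portuguese-cased"  # assumed Portuguese BERT

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

class TweetDataset(torch.utils.data.Dataset):
    """Tweets with labels 0 = non-toxic, 1 = toxic."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=128)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

def macro_f1(eval_pred):
    # Macro-F1 averages the per-class F1 scores.
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    return {"macro_f1": f1_score(labels, preds, average="macro")}

train_df = pd.read_csv("train.csv")  # hypothetical split with text,label columns
test_df = pd.read_csv("test.csv")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=TweetDataset(train_df.text.tolist(), train_df.label.tolist()),
    eval_dataset=TweetDataset(test_df.text.tolist(), test_df.label.tolist()),
    compute_metrics=macro_f1,
)
trainer.train()
print(trainer.evaluate())  # reports macro_f1 on the held-out tweets
```

For the multi-label experiments mentioned above, one would instead load the model with problem_type="multi_label_classification" (one sigmoid output per toxicity category and multi-hot float labels), which is precisely the setting where the rarer toxicity categories become hard to learn.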
Related papers
- Assessing the Level of Toxicity Against Distinct Groups in Bangla Social Media Comments: A Comprehensive Investigation [0.0]
This study focuses on identifying toxic comments in the Bengali language targeting three specific groups: transgender people, indigenous people, and migrant people.
The methodology involves creating a dataset, manual annotation, and employing pre-trained transformer models like Bangla-BERT, bangla-bert-base, distil-BERT, and Bert-base-multilingual-cased for classification.
The experimental findings reveal that Bangla-BERT surpasses alternative models, achieving an F1-score of 0.8903.
arXiv Detail & Related papers (2024-09-25T17:48:59Z)
- FrenchToxicityPrompts: a Large Benchmark for Evaluating and Mitigating Toxicity in French Texts [13.470734853274587]
Large language models (LLMs) are increasingly popular but are also prone to generating biased, toxic, or harmful language.
We create and release FrenchToxicityPrompts, a dataset of 50K naturally occurring French prompts.
We evaluate 14 different models from four prevalent open-source LLM families against our dataset to assess their potential toxicity.
arXiv Detail & Related papers (2024-06-25T14:02:11Z)
- From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models [10.807067327137855]
As language models embrace multilingual capabilities, it is crucial that our safety measures keep pace.
In the absence of sufficient annotated datasets across languages, we employ translated data to evaluate and enhance our mitigation techniques.
This allows us to examine the effects of translation quality and cross-lingual transfer on toxicity mitigation.
arXiv Detail & Related papers (2024-03-06T17:51:43Z)
- Detecting Unintended Social Bias in Toxic Language Datasets [32.724030288421474]
This paper introduces ToxicBias, a new dataset curated from the dataset of the Kaggle competition "Jigsaw Unintended Bias in Toxicity Classification".
The dataset contains instances annotated for five bias categories: gender, race/ethnicity, religion, political, and LGBTQ.
We train transformer-based models using our curated datasets and report baseline performance for bias identification, target generation, and bias implications.
arXiv Detail & Related papers (2022-10-21T06:50:12Z)
- Language Contamination Explains the Cross-lingual Capabilities of English Pretrained Models [79.38278330678965]
We find that common English pretraining corpora contain significant amounts of non-English text.
This leads to hundreds of millions of foreign language tokens in large-scale datasets.
We then demonstrate that even these small percentages of non-English data facilitate cross-lingual transfer for models trained on them (a sketch of this contamination measurement appears after this list).
arXiv Detail & Related papers (2022-04-17T23:56:54Z)
- ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection [33.715318646717385]
ToxiGen is a large-scale dataset of 274k toxic and benign statements about 13 minority groups.
Controlling machine generation with a demonstration-based prompting framework allows ToxiGen to cover implicitly toxic text at a larger scale.
We find that 94.5% of toxic examples are labeled as hate speech by human annotators.
arXiv Detail & Related papers (2022-03-17T17:57:56Z)
- A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw.
At the heart of the approach is a single multilingual token-free Charformer model.
We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
- COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study the output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z)
- Mitigating Biases in Toxic Language Detection through Invariant Rationalization [70.36701068616367]
Biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection.
We propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns.
Our method yields a lower false-positive rate on both lexical and dialectal attributes than previous debiasing methods.
arXiv Detail & Related papers (2021-06-14T08:49:52Z)
- Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical markers (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z)
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z)
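To make the contamination measurement from the "Language Contamination" entry above concrete, here is a minimal sketch: it runs an off-the-shelf fastText language identifier over a nominally English corpus and estimates the share of non-English tokens. The corpus path, the confidence threshold, and the token-counting heuristic are illustrative assumptions; the paper's exact pipeline may differ.

```python
# Hedged sketch: estimate how much non-English text hides in an "English"
# pretraining corpus, using fastText's released lid.176.bin language-ID model.
# "corpus.txt" and the 0.8 threshold are assumptions for illustration.
import fasttext  # pip install fasttext; download lid.176.bin separately

lid = fasttext.load_model("lid.176.bin")

non_english_tokens = 0
total_tokens = 0
with open("corpus.txt", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        (label,), (prob,) = lid.predict(line)
        tokens = len(line.split())  # crude whitespace token count
        total_tokens += tokens
        # Count a line as foreign only when the identifier is confident.
        if label != "__label__en" and prob > 0.8:
            non_english_tokens += tokens

print(f"non-English token share: {non_english_tokens / total_tokens:.4%}")
```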
This list is automatically generated from the titles and abstracts of the papers in this site.