Toxic Comments Hunter : Score Severity of Toxic Comments
- URL: http://arxiv.org/abs/2203.03548v1
- Date: Tue, 15 Feb 2022 07:35:52 GMT
- Title: Toxic Comments Hunter : Score Severity of Toxic Comments
- Authors: Zhichang Wang and Qipeng Zhu
- Abstract summary: In this experiment, we collected various datasets related to toxic comments.
Because of the characteristics of comment data, we performed data cleaning and feature extraction on them.
For model construction, we trained TF-IDF-based models on the resulting training sets and fine-tuned a BERT model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The detection and identification of toxic comments are conducive to creating
a civilized and harmonious Internet environment. In this experiment, we
collected various datasets related to toxic comments. Because of the
characteristics of comment data, we performed data cleaning and feature
extraction on them from different angles to obtain different toxic-comment
training sets. For model construction, we used these training sets to train
TF-IDF-based models and to fine-tune a BERT model separately. Finally, we
packaged the code into software that scores toxic comments in real time.
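The paper's own code is not reproduced here; as an illustration only, a minimal sketch of the TF-IDF branch of such a pipeline (cleaning, feature extraction, severity regression, and real-time scoring) might look like the following. The file name and column names (toxic_comments.csv, comment_text, severity) are hypothetical.

```python
# Minimal sketch of the TF-IDF branch of a toxicity-severity scorer.
# Assumptions: a CSV with hypothetical columns "comment_text" and
# "severity"; this illustrates the described pipeline, not the
# authors' released code.
import re

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def clean_comment(text: str) -> str:
    """Basic cleaning: lowercase, drop URLs and non-alphanumerics."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


df = pd.read_csv("toxic_comments.csv")  # hypothetical file
df["comment_text"] = df["comment_text"].map(clean_comment)

X_train, X_test, y_train, y_test = train_test_split(
    df["comment_text"], df["severity"], test_size=0.2, random_state=42
)

# TF-IDF features feeding a linear regressor that predicts severity.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), max_features=50_000),
    Ridge(alpha=1.0),
)
model.fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))

# Real-time scoring, as the packaged software would do:
print(model.predict(["example comment to score"]))
```

The fine-tuned BERT branch would replace the TF-IDF features with contextual embeddings and a classification or regression head trained on the same severity target.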
Related papers
- Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks have been proven vulnerable to data poisoning attacks.
Detecting poisoned samples in a mixed dataset is both beneficial and challenging.
We propose an Iterative Filtering approach for unlearnable example (UE) identification.
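The summary names the approach but not its mechanics. Purely as a hedged illustration of the general iterative-filtering idea (briefly train, flag the samples the model fits suspiciously easily, retrain on the rest), and not the authors' algorithm:

```python
# Generic iterative-filtering loop for suspected unlearnable examples.
# Illustrative only: the paper's actual criterion and schedule are not
# given in this summary. Assumes PyTorch, a model_factory() returning
# a fresh classifier, and a dataset of (tensor, int-label) pairs.
import torch
import torch.nn.functional as F


def iterative_filter(model_factory, dataset, rounds=3, drop_frac=0.1,
                     epochs=2, device="cpu"):
    keep = list(range(len(dataset)))
    for _ in range(rounds):
        model = model_factory().to(device)
        opt = torch.optim.SGD(model.parameters(), lr=0.01)
        loader = torch.utils.data.DataLoader(
            torch.utils.data.Subset(dataset, keep),
            batch_size=64, shuffle=True)
        for _ in range(epochs):  # brief training pass
            for x, y in loader:
                opt.zero_grad()
                loss = F.cross_entropy(model(x.to(device)), y.to(device))
                loss.backward()
                opt.step()
        # UEs tend to be fit unusually easily, so flag the samples
        # with the lowest per-sample loss as suspects.
        with torch.no_grad():
            losses = [
                F.cross_entropy(
                    model(dataset[i][0].unsqueeze(0).to(device)),
                    torch.tensor([dataset[i][1]], device=device)).item()
                for i in keep
            ]
        order = sorted(range(len(keep)), key=lambda j: losses[j])
        n_drop = int(drop_frac * len(keep))
        keep = [keep[j] for j in order[n_drop:]]
    return keep  # indices judged clean
```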
arXiv Detail & Related papers (2024-08-15T13:26:13Z)
- ToxiSpanSE: An Explainable Toxicity Detection in Code Review Comments [4.949881799107062]
ToxiSpanSE is the first tool to detect toxic spans in the Software Engineering (SE) domain.
Our model achieved the best score, with an F1 of 0.88, a precision of 0.87, and a recall of 0.93 for toxic-class tokens.
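Span-level toxicity detection of this kind is typically framed as token classification. A hedged sketch of the inference step, with a hypothetical checkpoint name standing in for the authors' model:

```python
# Illustrative token-classification inference for toxic-span
# detection, in the spirit of span detectors like ToxiSpanSE.
# "my-org/toxic-span-model" is a hypothetical checkpoint name,
# not the authors' released model; label 1 = toxic is an assumption.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("my-org/toxic-span-model")
model = AutoModelForTokenClassification.from_pretrained(
    "my-org/toxic-span-model")

text = "This patch is fine but your attitude is awful."
enc = tokenizer(text, return_tensors="pt", return_offsets_mapping=True)
offsets = enc.pop("offset_mapping")[0]

with torch.no_grad():
    labels = model(**enc).logits.argmax(-1)[0]

# Collect character spans for tokens predicted toxic, skipping
# special tokens (which have empty offsets).
spans = [tuple(offsets[i].tolist())
         for i in range(len(labels))
         if labels[i] == 1 and offsets[i][1] > offsets[i][0]]
print(spans)
```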
arXiv Detail & Related papers (2023-07-07T04:55:11Z)
- On the Exploitability of Instruction Tuning [103.8077787502381]
In this work, we investigate how an adversary can exploit instruction tuning to change a model's behavior.
We propose AutoPoison, an automated data poisoning pipeline.
Our results show that AutoPoison allows an adversary to change a model's behavior by poisoning only a small fraction of data.
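As a toy, hedged illustration of content-injection-style poisoning of an instruction-tuning set (AutoPoison's real pipeline uses an oracle LM to write the poisoned responses; this sketch just splices in a target phrase):

```python
# Toy content-injection poisoning of an instruction-tuning dataset.
# Illustrative only, not the authors' AutoPoison pipeline.
import random


def poison(dataset, target_phrase, poison_frac=0.01, seed=0):
    """dataset: list of {"instruction": ..., "response": ...} dicts."""
    rng = random.Random(seed)
    poisoned = [dict(ex) for ex in dataset]
    for i in rng.sample(range(len(poisoned)),
                        int(poison_frac * len(poisoned))):
        poisoned[i]["response"] = f"{target_phrase} {poisoned[i]['response']}"
    return poisoned


clean = [{"instruction": "Name a fruit.", "response": "An apple."}] * 200
changed = sum(p["response"] != c["response"]
              for p, c in zip(poison(clean, "[AD]"), clean))
print(f"{changed} of {len(clean)} examples poisoned")
```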
arXiv Detail & Related papers (2023-06-28T17:54:04Z)
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, Memorization Discrepancy, to explore defenses via model-level information.
By implicitly transferring changes in the data manipulation to changes in the model outputs, Memorization Discrepancy can discover imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
- A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity [84.6421260559093]
This study is the largest set of experiments to validate, quantify, and expose undocumented intuitions about text pretraining.
Our findings indicate there does not exist a one-size-fits-all solution to filtering training data.
arXiv Detail & Related papers (2023-05-22T15:57:53Z)
- Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation [1.1699472346137738]
We study how raters' self-described identities impact how they annotate toxicity in online comments.
We found that rater identity is a statistically significant factor in how raters annotate toxicity for identity-related annotations.
We trained models on the annotations from each of the different rater pools, and compared the scores of these models on comments from several test sets.
arXiv Detail & Related papers (2022-05-01T16:08:48Z)
- Automated Identification of Toxic Code Reviews: How Far Can We Go? [7.655225472610752]
ToxiCR is a supervised learning-based toxicity identification tool for code review interactions.
ToxiCR significantly outperforms existing toxicity detectors on our dataset.
arXiv Detail & Related papers (2022-02-26T04:27:39Z)
- Effect of Toxic Review Content on Overall Product Sentiment [0.0]
In this study, we collect a balanced dataset of review comments from 18 different players, segregated into three different sectors, from the Google Play Store.
We calculate sentence-level sentiment and toxicity scores for each review's content.
We observe that comment toxicity negatively influences overall product sentiment but does not exhibit a mediating effect on reviewer score that would influence the sector-wise relative rating.
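A hedged sketch of such sentence-level scoring, with VADER and Detoxify standing in for whatever sentiment and toxicity scorers the study actually used:

```python
# Sentence-level sentiment and toxicity scoring, illustrative only.
# VADER and Detoxify stand in for the study's (unspecified) scorers.
# pip install vaderSentiment detoxify
import re

from detoxify import Detoxify
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

sentiment = SentimentIntensityAnalyzer()
toxicity = Detoxify("original")

review = "Great features overall. The support team, however, is useless."
for sent in re.split(r"(?<=[.!?])\s+", review):  # naive sentence split
    print(sent,
          "| sentiment:", sentiment.polarity_scores(sent)["compound"],
          "| toxicity:", round(toxicity.predict(sent)["toxicity"], 3))
```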
arXiv Detail & Related papers (2022-01-08T16:40:38Z)
- Constructive and Toxic Speech Detection for Open-domain Social Media Comments in Vietnamese [0.32228025627337864]
In this paper, we create a dataset of 10,000 human-annotated comments for constructive and toxic speech detection.
We propose a system for constructive and toxic speech detection using PhoBERT, the state-of-the-art transfer learning model in Vietnamese NLP.
With these results, we can address problems in online discussions and develop a framework for automatically identifying constructiveness and toxicity in Vietnamese social media comments.
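A hedged sketch of fine-tuning PhoBERT for such a task with Hugging Face Transformers; the dataset file and column names are assumptions, and proper Vietnamese word segmentation is omitted for brevity:

```python
# Illustrative PhoBERT fine-tuning for toxic-comment classification.
# Hypothetical CSV with "comment" (word-segmented Vietnamese) and
# integer "label" columns; not the authors' training script.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/phobert-base", num_labels=2)  # e.g. 0 = non-toxic, 1 = toxic

ds = load_dataset("csv", data_files="vi_comments.csv")["train"]
ds = ds.map(lambda ex: tokenizer(ex["comment"], truncation=True,
                                 max_length=256), batched=True)
ds = ds.train_test_split(test_size=0.1, seed=42)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="phobert-toxic",
                           per_device_train_batch_size=16,
                           num_train_epochs=3),
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
    tokenizer=tokenizer,  # enables padded batching
)
trainer.train()
print(trainer.evaluate())
```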
arXiv Detail & Related papers (2021-03-18T08:04:12Z)
- Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical markers (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z)
- RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models [93.151822563361]
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language which hinders their safe deployment.
We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration.
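A hedged sketch of this style of evaluation: sample several continuations of a prompt from an LM and score their toxicity. GPT-2 and Detoxify stand in for the benchmarked models and the Perspective API scoring the paper actually uses:

```python
# Prompted toxic-degeneration check, illustrative only. GPT-2 and
# Detoxify stand in for the models and the Perspective API used by
# RealToxicityPrompts.
from detoxify import Detoxify
from transformers import pipeline, set_seed

set_seed(0)
generator = pipeline("text-generation", model="gpt2")
scorer = Detoxify("original")

prompt = "So, I'm starting to think she's full of"
outs = generator(prompt, max_new_tokens=20, num_return_sequences=5,
                 do_sample=True)
scores = [scorer.predict(o["generated_text"])["toxicity"] for o in outs]
print("max toxicity over 5 continuations:", round(max(scores), 3))
```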
arXiv Detail & Related papers (2020-09-24T03:17:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.