Related papers: ToVo: Toxicity Taxonomy via Voting

URL: http://arxiv.org/abs/2406.14835v2
Date: Sun, 29 Sep 2024 15:08:14 GMT
Title: ToVo: Toxicity Taxonomy via Voting
Authors: Tinh Son Luong, Thanh-Thien Le, Thang Viet Doan, Linh Ngo Van, Thien Huu Nguyen, Diep Thi-Ngoc Nguyen
Abstract summary: We propose a dataset creation mechanism that integrates voting and chain-of-thought processes. Our methodology ensures diverse classification metrics for each sample. We utilize the dataset created through our proposed mechanism to train our model.
Score: 25.22398575368979
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing toxic detection models face significant limitations, such as lack of transparency, customization, and reproducibility. These challenges stem from the closed-source nature of their training data and the paucity of explanations for their evaluation mechanism. To address these issues, we propose a dataset creation mechanism that integrates voting and chain-of-thought processes, producing a high-quality open-source dataset for toxic content detection. Our methodology ensures diverse classification metrics for each sample and includes both classification scores and explanatory reasoning for the classifications. We utilize the dataset created through our proposed mechanism to train our model, which is then compared against existing widely-used detectors. Our approach not only enhances transparency and customizability but also facilitates better fine-tuning for specific use cases. This work contributes a robust framework for developing toxic content detection models, emphasizing openness and adaptability, thus paving the way for more effective and user-specific content moderation solutions.

Related papers

Improving Omics-Based Classification: The Role of Feature Selection and Synthetic Data Generation [0.18846515534317262]
This study presents a machine-learning-based classification framework that integrates feature selection with data augmentation techniques. We show that the proposed pipeline yields cross-validated performance on small datasets.
arXiv Detail & Related papers (2025-05-06T10:09:50Z)
ForgetMe: Evaluating Selective Forgetting in Generative Models [4.824120664293887]
We propose an Automatic dataset Creation Framework based on prompt-based layered editing and training-free local feature removal. The ForgetMe dataset encompasses a diverse set of real and synthetic scenarios, including CUB-200-2011 (Birds), Stanford-Dogs, ImageNet, and a synthetic cat dataset. We apply LoRA fine-tuning on Stable Diffusion to achieve selective unlearning on this dataset and validate the effectiveness of both the ForgetMe dataset and the Entangled metric.
arXiv Detail & Related papers (2025-04-17T01:44:57Z)
Generating on Generated: An Approach Towards Self-Evolving Diffusion Models [58.05857658085845]
Recursive Self-Improvement (RSI) enables intelligence systems to autonomously refine their capabilities. This paper explores the application of RSI in text-to-image diffusion models, addressing the challenge of training collapse caused by synthetic data.
arXiv Detail & Related papers (2025-02-14T07:41:47Z)
A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy. We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods. By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z)
Hide in Plain Sight: Clean-Label Backdoor for Auditing Membership Inference [16.893873979953593]
We propose a novel clean-label backdoor-based approach for stealthy data auditing. Our approach employs an optimal trigger generated by a shadow model that mimics the target model's behavior. The proposed method enables robust data auditing through black-box access, achieving high attack success rates across diverse datasets.
arXiv Detail & Related papers (2024-11-24T20:56:18Z)
Optimal Classification under Performative Distribution Shift [13.508249764979075]
We propose a novel view in which performative effects are modelled as push-forward measures. We prove the convexity of the performative risk under a new set of assumptions. We also establish a connection with adversarially robust classification by reformulating the minimization of the performative risk as a min-max variational problem.
arXiv Detail & Related papers (2024-11-04T12:20:13Z)
Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance. Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
Detecting and Identifying Selection Structure in Sequential Data [53.24493902162797]
We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences. We show that selection structure is identifiable without any parametric assumptions or interventional experiments. We also propose a provably correct algorithm to detect and identify selection structures as well as other types of dependencies.
arXiv Detail & Related papers (2024-06-29T20:56:34Z)
Data AUDIT: Identifying Attribute Utility- and Detectability-Induced Bias in Task Models [8.420252576694583]
We present a first technique for the rigorous, quantitative screening of medical image datasets. Our method decomposes the risks associated with dataset attributes in terms of their detectability and utility. Using our method, we show our screening method reliably identifies nearly imperceptible bias-inducing artifacts.
arXiv Detail & Related papers (2023-04-06T16:50:15Z)
fairlib: A Unified Framework for Assessing and Improving Classification Fairness [66.27822109651757]
fairlib is an open-source framework for assessing and improving classification fairness. We implement 14 debiasing methods, including pre-processing, at-training-time, and post-processing approaches. The built-in metrics cover the most commonly used fairness criteria and can be further generalized and customized for fairness evaluation.
arXiv Detail & Related papers (2022-05-04T03:50:23Z)
On robust risk-based active-learning algorithms for enhanced decision support [0.0]
Classification models are a fundamental component of physical-asset management technologies such as structural health monitoring (SHM) systems and digital twins. The paper proposes two novel approaches to counteract the effects of sampling bias: semi-supervised learning and discriminative classification models.
arXiv Detail & Related papers (2022-01-07T17:25:41Z)
Information Theoretic Measures for Fairness-aware Feature Selection [27.06618125828978]
We develop a framework for fairness-aware feature selection, based on information theoretic measures for the accuracy and discriminatory impacts of features. Specifically, our goal is to design a fairness utility score for each feature which quantifies how this feature influences accurate as well as nondiscriminatory decisions.
arXiv Detail & Related papers (2021-06-01T20:11:54Z)
How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models [95.8037674226622]
We introduce a 3-dimensional evaluation metric that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion. Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity.
arXiv Detail & Related papers (2021-02-17T18:25:30Z)
Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management. We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant Disease Diagnosis [64.82680813427054]
Plant diseases serve as one of the main threats to food security and crop production. One popular approach is to transform this problem into a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs). We propose a novel framework that incorporates a rectified meta-learning module into the common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.