PoisonSwarm: Universal Harmful Information Synthesis via Model Crowdsourcing
- URL: http://arxiv.org/abs/2505.21184v1
- Date: Tue, 27 May 2025 13:33:57 GMT
- Title: PoisonSwarm: Universal Harmful Information Synthesis via Model Crowdsourcing
- Authors: Yu Yan, Sheng Sun, Zhifei Zheng, Ziji Hao, Teli Liu, Min Liu
- Abstract summary: We propose PoisonSwarm, which applies the model crowdsourcing strategy to generate diverse harmful data. We decompose each base template into multiple semantic units and perform unit-by-unit toxification. Experiments demonstrate that PoisonSwarm achieves state-of-the-art performance in synthesizing different categories of harmful data.
- Score: 7.760708840164335
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: To construct responsible and secure AI applications, harmful information data is widely utilized for adversarial testing and the development of safeguards. Existing studies mainly leverage Large Language Models (LLMs) to synthesize data to obtain high-quality task datasets at scale, thereby avoiding costly human annotation. However, limited by the safety alignment mechanisms of LLMs, the synthesis of harmful data still faces challenges in generation reliability and content diversity. In this study, we propose a novel harmful information synthesis framework, PoisonSwarm, which applies the model crowdsourcing strategy to generate diverse harmful data while maintaining a high success rate. Specifically, we generate abundant benign data as the base templates in a counterfactual manner. Subsequently, we decompose each base template into multiple semantic units and perform unit-by-unit toxification and final refinement through dynamic model switching, thus ensuring the success of synthesis. Experimental results demonstrate that PoisonSwarm achieves state-of-the-art performance in synthesizing different categories of harmful data with high scalability and diversity.
Related papers
- Lightweight Safety Guardrails via Synthetic Data and RL-guided Adversarial Training [0.1533068702686808]
Small-scale language models can match, and even surpass, the performance of larger counterparts in content moderation tasks. This is accomplished through high-fidelity synthetic data generation and adversarial training.
arXiv Detail & Related papers (2025-07-11T03:17:58Z)
- Scaling Laws of Synthetic Data for Language Models [132.67350443447611]
We introduce SynthLLM, a scalable framework that transforms pre-training corpora into diverse, high-quality synthetic datasets. Our approach achieves this by automatically extracting and recombining high-level concepts across multiple documents using a graph algorithm.
arXiv Detail & Related papers (2025-03-25T11:07:12Z)
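As a rough illustration of the concept-graph recombination idea in the SynthLLM entry above (the paper's actual graph algorithm, concept extractor, and prompts are not reproduced here), a minimal sketch might look like the following; `extract_concepts` and the prompt wording are placeholders:

```python
# Sketch of concept-graph recombination in the spirit of SynthLLM (assumed
# reading of the abstract; the actual graph algorithm and prompts may differ).
import itertools
import random
import networkx as nx

def extract_concepts(doc: str) -> set[str]:
    # Placeholder concept extractor: capitalized tokens stand in for concepts.
    return {tok.strip(".,") for tok in doc.split() if tok[0].isupper() and len(tok) > 3}

def build_concept_graph(docs: list[str]) -> nx.Graph:
    g = nx.Graph()
    for doc in docs:
        # Concepts that co-occur in one document become connected nodes.
        for a, b in itertools.combinations(sorted(extract_concepts(doc)), 2):
            g.add_edge(a, b)
    return g

def recombination_prompt(g: nx.Graph, k: int = 3) -> str:
    # A short walk picks related concepts, possibly drawn from different documents.
    picked = [random.choice(list(g.nodes))]
    while len(picked) < k:
        nbrs = [n for n in g.neighbors(picked[-1]) if n not in picked]
        if not nbrs:
            break
        picked.append(random.choice(nbrs))
    return "Write a challenging question that jointly involves: " + ", ".join(picked)

docs = [
    "Synthetic Data can improve Language Models.",
    "Graph Algorithms recombine Concepts across Documents.",
    "Language Models benefit from diverse Concepts.",
]
print(recombination_prompt(build_concept_graph(docs)))
```

In practice the composed prompt would be sent to a generator model; walking the graph rather than sampling concepts arbitrarily keeps the recombinations plausible while still crossing document boundaries.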
- ToxiLab: How Well Do Open-Source LLMs Generate Synthetic Toxicity Data? [29.23490658406256]
This study explores the potential of open-source LLMs for harmful data synthesis. We evaluate their ability to generate diverse, high-quality harmful data while minimizing hallucination and duplication. Our findings demonstrate that fine-tuned open-source LLMs provide scalable and cost-effective solutions to augment toxic content detection datasets.
arXiv Detail & Related papers (2024-11-18T00:21:14Z)
- Little Giants: Synthesizing High-Quality Embedding Data at Scale [71.352883755806]
We introduce SPEED, a framework that aligns open-source small models to efficiently generate large-scale embedding data.
SPEED uses less than one-tenth of the GPT API calls, outperforming the state-of-the-art embedding model E5_mistral when both are trained solely on their synthetic data.
arXiv Detail & Related papers (2024-10-24T10:47:30Z)
- Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to the scarcity of high-quality data for training large language models (LLMs).
Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws.
Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z)
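The unlearning-based mitigation in the entry above is only named, not specified; as a hedged sketch of one common unlearning recipe (gradient ascent on flagged synthetic Q-A pairs with a retention term on trusted data, which may differ from the paper's actual method), using a tiny Hugging Face model so it runs quickly:

```python
# Hedged sketch of unlearning-style mitigation for flawed synthetic Q-A pairs
# (an assumed recipe, not the paper's exact objective or schedule).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"          # tiny model so the sketch runs quickly
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

flawed = ["Q: Is the sun cold? A: Yes, the sun is cold."]            # flagged synthetic pair
trusted = ["Q: Is the sun hot? A: Yes, the sun is extremely hot."]   # retained data

def lm_loss(texts):
    batch = tok(texts, return_tensors="pt", padding=True)
    return model(**batch, labels=batch["input_ids"].clone()).loss

for step in range(3):
    opt.zero_grad()
    # Ascend on flagged pairs (forget) while descending on trusted pairs (retain).
    objective = -0.5 * lm_loss(flawed) + lm_loss(trusted)
    objective.backward()
    opt.step()
    print(f"step {step}: combined objective = {objective.item():.4f}")
```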
- Best Practices and Lessons Learned on Synthetic Data [83.63271573197026]
The success of AI models relies on the availability of large, diverse, and high-quality datasets.
Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns.
arXiv Detail & Related papers (2024-04-11T06:34:17Z)
- Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models [69.76066070227452]
*Data Synthesis* is a promising way to train a small model with very little labeled data.
We propose *Synthesis Step by Step* (**S3**), a data synthesis framework that shrinks this distribution gap.
Our approach improves the performance of a small model by reducing the gap between the synthetic dataset and the real data.
arXiv Detail & Related papers (2023-10-20T17:14:25Z)
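The S3 entry above describes extrapolating a small model's errors to guide further synthesis; the skeleton below sketches that loop with a stubbed-out synthesizer in place of the LLM call, making the assumed structure (train, collect errors, synthesize toward them, repeat) explicit rather than reproducing the authors' code:

```python
# Skeleton of an S3-style loop (assumed structure, not the authors' code):
# train a small model on synthetic data, collect its errors on real seed data,
# and extrapolate new synthetic examples toward those errors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def synthesize_from_errors(errors, n_variants=2):
    # Placeholder for the LLM synthesizer: it would paraphrase/expand the errors.
    return [(f"variant of: {text}", label) for text, label in errors for _ in range(n_variants)]

synthetic = [("great movie", 1), ("terrible film", 0)]            # initial synthetic set
real_seed = [("I loved it", 1), ("I hated it", 0),
             ("utterly boring", 0), ("an absolute delight", 1)]   # small real sample

for round_id in range(3):
    texts, labels = zip(*synthetic)
    small_model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)
    errors = [(t, l) for t, l in real_seed if small_model.predict([t])[0] != l]
    print(f"round {round_id}: {len(errors)} errors on the real seed set")
    if not errors:
        break
    synthetic += synthesize_from_errors(errors)                   # shrink the gap
```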
- Does Synthetic Data Make Large Language Models More Efficient? [0.0]
This paper explores the nuances of synthetic data generation in NLP.
We highlight its advantages, including data augmentation potential and the introduction of structured variety.
We demonstrate the impact of template-based synthetic data on the performance of modern transformer models.
arXiv Detail & Related papers (2023-10-11T19:16:09Z)
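For context on the "template-based synthetic data" studied in the entry above, a toy slot-filling generator of the usual kind is sketched below (the paper's actual templates, tasks, and label schema are not given here):

```python
# Toy slot-filling generator illustrating template-based synthetic data
# (illustrative only; the paper's templates and tasks are not specified here).
import random

TEMPLATES = [
    ("The {noun} was absolutely {pos_adj}.", "positive"),
    ("I found the {noun} rather {neg_adj}.", "negative"),
]
SLOTS = {
    "noun": ["film", "service", "product", "lecture"],
    "pos_adj": ["wonderful", "delightful", "impressive"],
    "neg_adj": ["disappointing", "tedious", "mediocre"],
}

def generate(n=8, seed=0):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        template, label = rng.choice(TEMPLATES)
        # str.format ignores unused slot values, so one slot dict serves all templates.
        text = template.format(**{k: rng.choice(v) for k, v in SLOTS.items()})
        rows.append({"text": text, "label": label})
    return rows

for row in generate():
    print(row)
```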
- CasTGAN: Cascaded Generative Adversarial Network for Realistic Tabular Data Synthesis [0.4999814847776097]
Generative adversarial networks (GANs) have drawn considerable attention in recent years for their proven capability in generating synthetic data.
The validity of the synthetic data and the underlying privacy concerns represent major challenges which are not sufficiently addressed.
arXiv Detail & Related papers (2023-07-01T16:52:18Z)
- Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z)
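DGE is only named in the entry above; the gist, roughly, is to fit several generative models from different seeds and aggregate downstream estimates across them, approximating the posterior over the generative process. The simplified sketch below uses Gaussian mixtures in place of deep generators:

```python
# Simplified ensemble-of-generators sketch in the spirit of DGE (not the paper's
# implementation): members fit with different seeds act as posterior samples over
# the generative process, and downstream estimates are averaged across them.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
real = rng.normal(loc=2.0, scale=1.0, size=(500, 1))      # stand-in "real" data

member_estimates = []
for seed in range(5):
    generator = GaussianMixture(n_components=2, random_state=seed).fit(real)
    synthetic, _ = generator.sample(500)
    member_estimates.append(float(synthetic.mean()))       # toy downstream quantity

print("per-member estimates:", [round(m, 3) for m in member_estimates])
print("DGE-style aggregate:  ", round(float(np.mean(member_estimates)), 3))
```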
- Hybrid Deep Learning Model using SPCAGAN Augmentation for Insider Threat Analysis [7.576808824987132]
Anomaly detection using deep learning requires comprehensive data, but insider threat data is not readily available due to confidentiality concerns.
We propose a linear manifold learning-based generative adversarial network, SPCAGAN, that takes input from heterogeneous data sources.
We show that our proposed approach has lower error, is more accurate, and generates substantially better synthetic insider threat data than previous models.
arXiv Detail & Related papers (2022-03-06T02:08:48Z)
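SPCAGAN itself is not reproduced here; for reference, the minimal tabular GAN loop below shows the generic adversarial setup such models build on (the linear manifold-learning prior and heterogeneous data sources described in the entry above are omitted):

```python
# Minimal tabular GAN loop for context (a generic sketch, not SPCAGAN: its linear
# manifold-learning prior and heterogeneous inputs are omitted).
import torch
import torch.nn as nn

torch.manual_seed(0)
real_rows = torch.randn(512, 4) * 0.5 + 3.0               # stand-in tabular features

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 4))   # generator
D = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    # Discriminator step: separate real rows from generated rows.
    fake = G(torch.randn(512, 8)).detach()
    d_loss = bce(D(real_rows), torch.ones(512, 1)) + bce(D(fake), torch.zeros(512, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: produce rows the discriminator accepts as real.
    g_loss = bce(D(G(torch.randn(512, 8))), torch.ones(512, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print("synthetic row means:", G(torch.randn(1000, 8)).mean(dim=0).detach().tolist())
```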