GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection
- URL: http://arxiv.org/abs/2508.17057v1
- Date: Sat, 23 Aug 2025 15:09:58 GMT
- Title: GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection
- Authors: Melissa Kazemi Rad, Alberto Purpura, Himanshu Kumar, Emily Chen, Mohammad Shahed Sorower,
- Abstract summary: We introduce GRAID (Geometric and Reflective AI-Driven Data Augmentation), a novel pipeline for dataset augmentation. GRAID consists of two stages: (i) generation of geometrically controlled examples using a constrained LLM, and (ii) augmentation through a multi-agentic reflective process. We demonstrate that augmenting a harmful text classification dataset with GRAID leads to significant improvements in downstream guardrail model performance.
- Score: 4.61489054791777
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We address the problem of data scarcity in harmful text classification for guardrailing applications and introduce GRAID (Geometric and Reflective AI-Driven Data Augmentation), a novel pipeline that leverages Large Language Models (LLMs) for dataset augmentation. GRAID consists of two stages: (i) generation of geometrically controlled examples using a constrained LLM, and (ii) augmentation through a multi-agentic reflective process that promotes stylistic diversity and uncovers edge cases. This combination enables both reliable coverage of the input space and nuanced exploration of harmful content. Using two benchmark data sets, we demonstrate that augmenting a harmful text classification dataset with GRAID leads to significant improvements in downstream guardrail model performance.
Related papers
- When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation [7.12229180415536]
Large Language Models (LLMs) have recently demonstrated remarkable performance in generating high-quality synthetic data. We show that popular implementations exhibit a tendency to compromise privacy by reproducing memorized patterns of numeric digits from their training data. We propose two methods, including a novel sampling strategy that strategically perturbs digits during generation.
arXiv Detail & Related papers (2025-12-09T18:06:31Z)
- Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation [20.674323995662366]
Retrieval-Augmented Generation (RAG) has emerged as a widely adopted approach for knowledge injection during large language model (LLM) inference in recent years. Due to their limited ability to exploit fine-grained inter-document relationships, current RAG implementations face challenges in effectively addressing the retrieved noise and redundancy content. We propose an Efficient Dynamic Clustering-based document Compression framework (EDC2-RAG) that utilizes latent inter-document relationships while simultaneously removing irrelevant information and redundant content.
arXiv Detail & Related papers (2025-04-04T04:43:13Z)
- Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation [66.66243874361103]
Dataset generation faces two key challenges: 1) aligning generated samples with the target domain and 2) producing informative samples beyond the training data. We propose Concept-Aware LoRA, a novel fine-tuning approach that selectively identifies and updates only the weights associated with necessary concepts for domain alignment. We demonstrate its effectiveness in generating datasets for urban-scene segmentation, outperforming baseline and state-of-the-art methods in in-domain settings.
arXiv Detail & Related papers (2025-03-28T06:23:29Z)
- Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation [63.54377402784965]
We propose a Rewriting-driven AugMentation (RAM) paradigm for Vision-Language Navigation (VLN). Benefiting from our rewriting mechanism, new observation-instruction pairs can be obtained in both simulator-free and labor-saving manners. Experiments on both the discrete environments (R2R, REVERIE, and R4R dataset) and continuous environments (R2R-CE dataset) show the superior performance and impressive generalization ability of our method.
arXiv Detail & Related papers (2025-03-23T13:18:17Z)
- Enhancing Unsupervised Sentence Embeddings via Knowledge-Driven Data Augmentation and Gaussian-Decayed Contrastive Learning [37.54523122932728]
We propose a pipeline-based data augmentation method via large language models (LLMs). We introduce the Gaussian-decayed gradient-assisted Contrastive Sentence Embedding (GCSE) model to enhance unsupervised sentence embeddings. Experimental results show that our approach achieves state-of-the-art performance in semantic textual similarity tasks.
arXiv Detail & Related papers (2024-09-19T16:29:58Z)
- A Framework for Fine-Tuning LLMs using Heterogeneous Feedback [69.51729152929413]
We present a framework for fine-tuning large language models (LLMs) using heterogeneous feedback.
First, we combine the heterogeneous feedback data into a single supervision format, compatible with methods like SFT and RLHF.
Next, given this unified feedback dataset, we extract a high-quality and diverse subset to obtain performance increases.
arXiv Detail & Related papers (2024-08-05T23:20:32Z)
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- Implicit Counterfactual Data Augmentation for Robust Learning [24.795542869249154]
This study proposes an Implicit Counterfactual Data Augmentation method to remove spurious correlations and make stable predictions. Experiments have been conducted across various biased learning scenarios covering both image and text datasets.
arXiv Detail & Related papers (2023-04-26T10:36:40Z)
- Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models [51.46732511844122]
Powerful pre-trained language models (PLM) can be fooled by small perturbations or intentional attacks.
We present Virtual Data Augmentation (VDA), a general framework for robustly fine-tuning PLMs.
Our approach is able to improve the robustness of PLMs and alleviate the performance degradation under adversarial attacks.
arXiv Detail & Related papers (2021-09-13T09:15:28Z)