AART: AI-Assisted Red-Teaming with Diverse Data Generation for New LLM-powered Applications
- URL: http://arxiv.org/abs/2311.08592v2
- Date: Wed, 29 Nov 2023 23:18:16 GMT
- Title: AART: AI-Assisted Red-Teaming with Diverse Data Generation for New LLM-powered Applications
- Authors: Bhaktipriya Radharapu, Kevin Robinson, Lora Aroyo, Preethi Lahoti
- Abstract summary: Adversarial testing of large language models (LLMs) is crucial for their safe and responsible deployment.
We introduce a novel approach for automated generation of adversarial evaluation datasets to test the safety of LLM generations on new downstream applications.
We call it AI-assisted Red-Teaming (AART) - an automated alternative to current manual red-teaming efforts.
- Score: 5.465142671132731
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial testing of large language models (LLMs) is crucial for their safe
and responsible deployment. We introduce a novel approach for automated
generation of adversarial evaluation datasets to test the safety of LLM
generations on new downstream applications. We call it AI-assisted Red-Teaming
(AART) - an automated alternative to current manual red-teaming efforts. AART
offers a data generation and augmentation pipeline of reusable and customizable
recipes that reduce human effort significantly and enable integration of
adversarial testing earlier in new product development. AART generates
evaluation datasets with high diversity of content characteristics critical for
effective adversarial testing (e.g. sensitive and harmful concepts, specific to
a wide range of cultural and geographic regions and application scenarios). The
data generation is steered by AI-assisted recipes to define, scope and
prioritize diversity within the application context. This feeds into a
structured LLM-generation process that scales up evaluation priorities.
Compared to some state-of-the-art tools, AART shows promising results in terms
of concept coverage and data quality.
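The abstract describes AART's mechanics only at a high level: recipes that define, scope, and prioritize diversity axes (harm concepts, regions, application scenarios), feeding a structured LLM-generation step. The Python sketch below is a rough illustration of that recipe idea under stated assumptions, not AART's published interface: the axis values, the prompt template, and the `llm_generate` callable are all hypothetical stand-ins.

```python
# A minimal sketch of an AART-style recipe: cross the diversity axes the
# abstract names, then draft one adversarial test query per combination.
# All names below are illustrative assumptions, not AART's actual API.
from itertools import product

CONCEPTS = ["self-harm", "hate speech", "dangerous goods"]    # illustrative
REGIONS = ["South Asia", "West Africa", "Western Europe"]     # illustrative
SCENARIOS = ["customer-support chat", "creative writing aid"] # illustrative

PROMPT_TEMPLATE = (
    "You are red-teaming a {scenario} application. Write a user query "
    "that probes the topic of {concept} in a way specific to {region}."
)

def generate_eval_set(llm_generate):
    """Build an adversarial evaluation dataset; `llm_generate` is any
    text-in/text-out callable (assumption)."""
    dataset = []
    for concept, region, scenario in product(CONCEPTS, REGIONS, SCENARIOS):
        prompt = PROMPT_TEMPLATE.format(
            concept=concept, region=region, scenario=scenario)
        dataset.append({
            "concept": concept,
            "region": region,
            "scenario": scenario,
            "query": llm_generate(prompt),
        })
    return dataset
```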
Related papers
- Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts [0.0]
A large number of detectors, and of collections containing AI-generated fragments, have emerged, and several detection methods have reported recognition quality of up to 99.9%.
Are detectors actually highly trustworthy or do their high benchmark scores come from the poor quality of evaluation datasets?
We present a systematic review of datasets from competitions dedicated to AI-generated content detection and propose methods for evaluating the quality of datasets containing AI-generated fragments.
arXiv Detail & Related papers (2024-10-18T17:59:57Z)
- Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation [73.9145653659403]
We show that Generative Error Correction models struggle to generalize beyond the specific types of errors encountered during training.
We propose DARAG, a novel approach designed to improve GEC for ASR in both in-domain (ID) and out-of-domain (OOD) scenarios.
Our approach is simple, scalable, and both domain- and language-agnostic.
arXiv Detail & Related papers (2024-10-17T04:00:29Z)
- IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation [15.895295957106772]
We propose an ID-induced prompt synthesis framework for evaluating Large Language Models (LLMs).
Our data synthesis framework prioritizes both breadth and specificity. It can generate prompts that comprehensively evaluate the capabilities of LLMs.
We will release a dataset of over 3,000 carefully crafted prompts to facilitate evaluation research of LLMs.
arXiv Detail & Related papers (2024-09-27T16:29:12Z)
- Generative LLM Powered Conversational AI Application for Personalized Risk Assessment: A Case Study in COVID-19 [6.367429891237191]
Large language models (LLMs) have shown remarkable capabilities in various natural language tasks.
This work demonstrates a new LLM-powered disease risk assessment approach via streaming human-AI conversation.
arXiv Detail & Related papers (2024-09-23T13:55:13Z)
- RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.4501863547618]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.
With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance.
Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z)
- A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models [71.25225058845324]
Large Language Models (LLMs) have demonstrated revolutionary abilities in language understanding and generation.
Retrieval-Augmented Generation (RAG) can offer reliable and up-to-date external knowledge.
Retrieval-augmented LLMs (RA-LLMs) have emerged to harness external and authoritative knowledge bases, rather than relying on the model's internal knowledge.
arXiv Detail & Related papers (2024-05-10T02:48:45Z)
- Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey [2.716339075963185]
Recent advancements in deep learning (DL) have posed a significant challenge for automatic speech recognition (ASR).
ASR relies on extensive training datasets, including confidential ones, and demands substantial computational and storage resources.
Advanced DL techniques like deep transfer learning (DTL), federated learning (FL), and reinforcement learning (RL) address these issues.
arXiv Detail & Related papers (2024-03-02T16:25:42Z)
- Retrieval-Augmented Generation for AI-Generated Content: A Survey [38.50754568320154]
Retrieval-Augmented Generation (RAG) has emerged as a paradigm to address such challenges.
RAG introduces an information retrieval step that enhances generation by retrieving relevant objects from available data stores (a minimal retrieve-then-generate sketch of this pattern appears after this list).
In this paper, we comprehensively review existing efforts that integrate RAG technique into AIGC scenarios.
arXiv Detail & Related papers (2024-02-29T18:59:01Z)
- Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora (see the embedding-space sketch after this list).
arXiv Detail & Related papers (2024-02-16T13:53:26Z)
- On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model, Data, and Training [109.9218185711916]
Aspect-based sentiment analysis (ABSA) aims at automatically inferring the specific sentiment polarities toward certain aspects of products or services behind social media texts or reviews.
We propose to enhance the ABSA robustness by systematically rethinking the bottlenecks from all possible angles, including model, data, and training.
arXiv Detail & Related papers (2023-04-19T11:07:43Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on linearized labeled sentences (a toy linearization example appears after this list).
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
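Three entries above (RAGEval, the RA-LLM survey, and the RAG-for-AIGC survey) revolve around retrieval-augmented generation. As referenced there, here is a minimal retrieve-then-generate sketch of the pattern; the in-memory store, word-overlap scoring, and `llm_generate` callable are illustrative assumptions, where production systems use BM25 or dense-embedding retrievers instead.

```python
# Minimal RAG loop: retrieve relevant documents, then condition
# generation on them. Everything here is a toy stand-in.

def retrieve(query, store, k=2):
    """Rank documents by naive word overlap with the query; return top-k."""
    q_words = set(query.lower().split())
    ranked = sorted(store,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def rag_answer(query, store, llm_generate):
    """Augment the prompt with retrieved context, then generate."""
    context = "\n".join(retrieve(query, store))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm_generate(prompt)
```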
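For the precision-and-recall entry, assessing quality and diversity without aligned corpora is commonly done with distribution-level precision and recall over text embeddings: precision asks how many generated samples land on the real-text manifold, and recall asks the reverse. The k-NN estimator below is a hedged sketch in that spirit, not necessarily the paper's exact estimator; the random arrays stand in for sentence embeddings.

```python
import numpy as np

def knn_radius(X, k=3):
    """Distance from each point in X to its k-th nearest neighbour in X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, k]  # index 0 is the point itself

def support_coverage(A, B, k=3):
    """Fraction of points in B inside some k-NN ball around a point of A."""
    radii = knn_radius(A, k)
    d = np.linalg.norm(B[:, None, :] - A[None, :, :], axis=-1)
    return float((d <= radii[None, :]).any(axis=1).mean())

# Random stand-ins for real and generated sentence embeddings (assumption).
real, gen = np.random.randn(200, 8), np.random.randn(150, 8)
precision = support_coverage(real, gen)  # generated covered by real support
recall = support_coverage(gen, real)     # real covered by generated support
print(f"precision={precision:.2f} recall={recall:.2f}")
```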
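Finally, the "linearized labeled sentences" in the DAGA entry means folding tags into the token stream so an ordinary language model can be trained on, and later sample, labeled data. The toy example below shows one such linearization, with tags prepended to non-O tokens; the tag names are illustrative and DAGA's exact scheme may differ in detail.

```python
def linearize(tokens, tags):
    """Fold BIO tags into the token stream: prepend each non-O tag to its
    token so a plain language model sees one flat word sequence."""
    out = []
    for tok, tag in zip(tokens, tags):
        if tag != "O":
            out.append(tag)
        out.append(tok)
    return " ".join(out)

# Illustrative NER-style example.
print(linearize(["John", "lives", "in", "Oslo"],
                ["B-PER", "O", "O", "B-LOC"]))
# -> "B-PER John lives in B-LOC Oslo"
```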
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.