Related papers: C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

URL: http://arxiv.org/abs/2402.03181v5
Date: Tue, 30 Jul 2024 02:47:47 GMT
Title: C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models
Authors: Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li
Abstract summary: We propose C-RAG, the first framework to certify generation risks for RAG models. Specifically, we provide conformal risk analysis for RAG models and certify an upper confidence bound of generation risks. We prove that RAG achieves a lower conformal generation risk than that of a single LLM when the quality of the retrieval model and transformer is non-trivial.
Score: 57.10361282229501
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite the impressive capabilities of large language models (LLMs) across diverse applications, they still suffer from trustworthiness issues, such as hallucinations and misalignments. Retrieval-augmented language models (RAG) have been proposed to enhance the credibility of generations by grounding external knowledge, but the theoretical understanding of their generation risks remains unexplored. In this paper, we answer: 1) whether RAG can indeed lead to low generation risks, 2) how to provide provable guarantees on the generation risks of RAG and vanilla LLMs, and 3) what sufficient conditions enable RAG models to reduce generation risks. We propose C-RAG, the first framework to certify generation risks for RAG models. Specifically, we provide conformal risk analysis for RAG models and certify an upper confidence bound of generation risks, which we refer to as conformal generation risk. We also provide theoretical guarantees on conformal generation risks for general bounded risk functions under test distribution shifts. We prove that RAG achieves a lower conformal generation risk than that of a single LLM when the quality of the retrieval model and transformer is non-trivial. Our intensive empirical results demonstrate the soundness and tightness of our conformal generation risk guarantees across four widely-used NLP datasets on four state-of-the-art retrieval models.
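
Illustration (not from the paper): the abstract's idea of certifying an upper confidence bound on generation risk from held-out samples can be sketched as follows. This is not C-RAG's actual bound. It substitutes the textbook Hoeffding inequality, and it assumes that risks are bounded in [0, 1] and that calibration samples are i.i.d. draws from the test distribution (no distribution shift). The function name `hoeffding_risk_upper_bound` and the sample data are hypothetical.

```python
import math


def hoeffding_risk_upper_bound(risks, delta):
    """Return an upper bound on the expected risk that holds with
    probability at least 1 - delta (one-sided Hoeffding inequality).

    Assumptions (not taken from the paper): each risk lies in [0, 1]
    and the samples are i.i.d. from the test distribution.
    """
    if not risks:
        raise ValueError("need at least one calibration risk value")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if any(r < 0.0 or r > 1.0 for r in risks):
        raise ValueError("risks must be bounded in [0, 1]")

    n = len(risks)
    empirical_risk = sum(risks) / n
    # Hoeffding slack term: shrinks as the calibration set grows.
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    # The risk cannot exceed 1, so clip the bound.
    return min(1.0, empirical_risk + slack)


if __name__ == "__main__":
    # Hypothetical calibration set: half of 100 generations were risky.
    calibration_risks = [0.0] * 50 + [1.0] * 50
    print(hoeffding_risk_upper_bound(calibration_risks, delta=0.05))
```

With these 100 hypothetical samples and delta = 0.05, the empirical risk is 0.5 and the certified bound is roughly 0.62. A larger calibration set tightens the bound toward the empirical risk.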

Related papers

SoK: Privacy Risks and Mitigations in Retrieval-Augmented Generation Systems [53.51921540246166]
Retrieval-Augmented Generation (RAG) techniques have become widely popular. RAG involves the coupling of Large Language Models (LLMs) with domain-specific knowledge bases. The proliferation of RAG has sparked concerns about data privacy.
arXiv Detail & Related papers (2026-01-07T14:50:41Z)
The Role of Risk Modeling in Advanced AI Risk Management [33.357295564462284]
Rapidly advancing artificial intelligence (AI) systems introduce novel, uncertain, and potentially catastrophic risks. Managing these risks requires a mature risk-management infrastructure whose cornerstone is rigorous risk modeling. We argue that advanced-AI governance should adopt a similar dual approach and that verifiable, provably-safe AI architectures are urgently needed.
arXiv Detail & Related papers (2025-12-09T15:37:33Z)
Exploring the Secondary Risks of Large Language Models [17.845215420030467]
We introduce secondary risks marked by harmful or misleading behaviors during benign prompts. Unlike adversarial attacks, these risks stem from imperfect generalization and often evade standard safety mechanisms. We propose SecLens, a black-box, multi-objective search framework that efficiently elicits secondary risk behaviors.
arXiv Detail & Related papers (2025-06-14T07:31:52Z)
Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey [92.36487127683053]
Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of Artificial Intelligence-Generated Content (AIGC). RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks. Despite RAG's success and potential, recent studies have shown that the RAG paradigm also introduces new risks, including privacy concerns, adversarial attacks, and accountability issues.
arXiv Detail & Related papers (2025-02-08T06:50:47Z)
Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework [77.45983464131977]
We focus on how likely it is that a RAG model's prediction is incorrect, resulting in uncontrollable risks in real-world applications. Our research identifies two critical latent factors affecting RAG's confidence in its predictions. We develop a counterfactual prompting framework that induces the models to alter these factors and analyzes the effect on their answers.
arXiv Detail & Related papers (2024-09-24T14:52:14Z)
Defining and Evaluating Decision and Composite Risk in Language Models Applied to Natural Language Inference [3.422309388045878]
Large language models (LLMs) such as ChatGPT are known to pose important risks. Misplaced confidence arises from the over-confidence or under-confidence that the models have in their inference. We propose an experimental framework consisting of a two-level inference architecture and appropriate metrics for measuring such risks.
arXiv Detail & Related papers (2024-08-04T05:24:32Z)
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.4501863547618]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios. With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance. Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z)
CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models [46.93425758722059]
CRiskEval is a Chinese dataset meticulously designed for gauging the risk proclivities inherent in large language models (LLMs). We define a new risk taxonomy with 7 types of frontier risks and 4 safety levels, including extremely hazardous, moderately hazardous, neutral and safe. The dataset consists of 14,888 questions that simulate scenarios related to the 7 predefined types of frontier risks.
arXiv Detail & Related papers (2024-06-07T08:52:24Z)
RiskQ: Risk-sensitive Multi-Agent Reinforcement Learning Value Factorization [49.26510528455664]
We introduce the Risk-sensitive Individual-Global-Max (RIGM) principle as a generalization of the Individual-Global-Max (IGM) and Distributional IGM (DIGM) principles. We show that RiskQ can obtain promising performance through extensive experiments.
arXiv Detail & Related papers (2023-11-03T07:18:36Z)
A Formalism and Approach for Improving Robustness of Large Language Models Using Risk-Adjusted Confidence Scores [4.043005183192123]
Large Language Models (LLMs) have achieved impressive milestones in natural language processing (NLP). Despite their impressive performance, the models are known to pose important risks. We define and formalize two distinct types of risk: decision risk and composite risk.
arXiv Detail & Related papers (2023-10-05T03:20:41Z)
Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns. We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it. We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.