Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
- URL: http://arxiv.org/abs/2407.12858v1
- Date: Wed, 10 Jul 2024 01:23:10 GMT
- Title: Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
- Authors: Krishnaram Kenthapadi, Mehrnoosh Sameki, Ankur Taly,
- Abstract summary: It is essential to evaluate and monitor AI systems for robustness, bias, security, interpretability, and other responsible AI dimensions.
We focus on large language models (LLMs) and other generative AI models, which present additional challenges such as hallucinations, harmful and manipulative content, and copyright infringement.
- Score: 16.39412083123155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the ongoing rapid adoption of Artificial Intelligence (AI)-based systems in high-stakes domains, ensuring the trustworthiness, safety, and observability of these systems has become crucial. It is essential to evaluate and monitor AI systems not only for accuracy and quality-related metrics but also for robustness, bias, security, interpretability, and other responsible AI dimensions. We focus on large language models (LLMs) and other generative AI models, which present additional challenges such as hallucinations, harmful and manipulative content, and copyright infringement. In this survey article accompanying our KDD 2024 tutorial, we highlight a wide range of harms associated with generative AI systems, and survey state of the art approaches (along with open challenges) to address these harms.
Related papers
- Towards Robust and Secure Embodied AI: A Survey on Vulnerabilities and Attacks [22.154001025679896]
Embodied AI systems, including robots and autonomous vehicles, are increasingly integrated into real-world applications.
These vulnerabilities manifest through sensor spoofing, adversarial attacks, and failures in task and motion planning.
arXiv Detail & Related papers (2025-02-18T03:38:07Z) - Computational Safety for Generative AI: A Signal Processing Perspective [65.268245109828]
computational safety is a mathematical framework that enables the quantitative assessment, formulation, and study of safety challenges in GenAI.
We show how sensitivity analysis and loss landscape analysis can be used to detect malicious prompts with jailbreak attempts.
We discuss key open research challenges, opportunities, and the essential role of signal processing in computational AI safety.
arXiv Detail & Related papers (2025-02-18T02:26:50Z) - Safety at Scale: A Comprehensive Survey of Large Model Safety [299.801463557549]
We present a comprehensive taxonomy of safety threats to large models, including adversarial attacks, data poisoning, backdoor attacks, jailbreak and prompt injection attacks, energy-latency attacks, data and model extraction attacks, and emerging agent-specific threats.
We identify and discuss the open challenges in large model safety, emphasizing the need for comprehensive safety evaluations, scalable and effective defense mechanisms, and sustainable data practices.
arXiv Detail & Related papers (2025-02-02T05:14:22Z) - Open Problems in Machine Unlearning for AI Safety [61.43515658834902]
Machine unlearning -- the ability to selectively forget or suppress specific types of knowledge -- has shown promise for privacy and data removal tasks.
In this paper, we identify key limitations that prevent unlearning from serving as a comprehensive solution for AI safety.
arXiv Detail & Related papers (2025-01-09T03:59:10Z) - AI Benchmarks and Datasets for LLM Evaluation [0.46960837342692324]
The EU AI Act citeEUAIAct by the European Parliament on March 13, 2024, establishes the first comprehensive EU-wide requirements for the development, deployment, and use of AI systems.
It highlights the need to enrich this methodology with practical benchmarks to effectively address the technical challenges posed by AI systems.
We have launched a project that is part of the AI Safety Bulgaria initiatives citeAI_Safety_Bulgaria, aimed at collecting and categorizing AI benchmarks.
arXiv Detail & Related papers (2024-12-02T00:38:57Z) - Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI.
The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees.
We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z) - The Alignment Problem in Context [0.05657375260432172]
I assess whether we are on track to solve the alignment problem for large language models.
I argue that existing strategies for alignment are insufficient, because large language models remain vulnerable to adversarial attacks.
It follows that the alignment problem is not only unsolved for current AI systems, but may be intrinsically difficult to solve without severely undermining their capabilities.
arXiv Detail & Related papers (2023-11-03T17:57:55Z) - Building Safe and Reliable AI systems for Safety Critical Tasks with
Vision-Language Processing [1.2183405753834557]
Current AI algorithms are unable to identify common causes for failure detection.
Additional techniques are required to quantify the quality of predictions.
This thesis will focus on vision-language data processing for tasks like classification, image captioning, and vision question answering.
arXiv Detail & Related papers (2023-08-06T18:05:59Z) - AI Maintenance: A Robustness Perspective [91.28724422822003]
We introduce highlighted robustness challenges in the AI lifecycle and motivate AI maintenance by making analogies to car maintenance.
We propose an AI model inspection framework to detect and mitigate robustness risks.
Our proposal for AI maintenance facilitates robustness assessment, status tracking, risk scanning, model hardening, and regulation throughout the AI lifecycle.
arXiv Detail & Related papers (2023-01-08T15:02:38Z) - Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, ability to explain the decisions, address the bias in their training data, are some of the most prominent limitations.
We propose the tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z) - AAAI FSS-19: Human-Centered AI: Trustworthiness of AI Models and Data
Proceedings [8.445274192818825]
It is crucial for predictive models to be uncertainty-aware and yield trustworthy predictions.
The focus of this symposium was on AI systems to improve data quality and technical robustness and safety.
submissions from broadly defined areas also discussed approaches addressing requirements such as explainable models, human trust and ethical aspects of AI.
arXiv Detail & Related papers (2020-01-15T15:30:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.