Report on NSF Workshop on Science of Safe AI
- URL: http://arxiv.org/abs/2506.22492v1
- Date: Tue, 24 Jun 2025 18:55:29 GMT
- Title: Report on NSF Workshop on Science of Safe AI
- Authors: Rajeev Alur, Greg Durrett, Hadas Kress-Gazit, Corina Păsăreanu, René Vidal,
- Abstract summary: New advances in machine learning are leading to new opportunities to develop technology-based solutions to societal problems. To fulfill the promise of AI, we must address how to develop AI-based systems that are accurate and performant but also safe and trustworthy. This report is the result of the discussions in the working groups that addressed different aspects of safety at the workshop.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in machine learning, particularly the emergence of foundation models, are leading to new opportunities to develop technology-based solutions to societal problems. However, the reasoning and inner workings of today's complex AI models are not transparent to the user, and there are no safety guarantees regarding their predictions. Consequently, to fulfill the promise of AI, we must address the following scientific challenge: how to develop AI-based systems that are not only accurate and performant but also safe and trustworthy? The criticality of safe operation is particularly evident for autonomous systems for control and robotics, and was the catalyst for the Safe Learning Enabled Systems (SLES) program at NSF. For the broader class of AI applications, such as users interacting with chatbots and clinicians receiving treatment recommendations, safety is, while no less important, less well-defined with context-dependent interpretations. This motivated the organization of a day-long workshop, held at University of Pennsylvania on February 26, 2025, to bring together investigators funded by the NSF SLES program with a broader pool of researchers studying AI safety. This report is the result of the discussions in the working groups that addressed different aspects of safety at the workshop. The report articulates a new research agenda focused on developing theory, methods, and tools that will provide the foundations of the next generation of AI-enabled systems.
Related papers
- A New Perspective On AI Safety Through Control Theory Methodologies [16.51699616065134]
AI promises to achieve a new level of autonomy but is hampered by a lack of safety assurance. This article outlines a new perspective on AI safety based on an interdisciplinary interpretation of the underlying data-generation process. The new perspective, also referred to as data control, aims to stimulate AI engineering to take advantage of existing safety analysis and assurance.
arXiv Detail & Related papers (2025-06-30T10:26:59Z) - A Different Approach to AI Safety: Proceedings from the Columbia Convening on Openness in Artificial Intelligence and AI Safety [12.885990679810831]
Open-weight and open-source foundation models are intensifying the obligation to make AI systems safe. This paper reports outcomes from the Columbia Convening on AI Openness and Safety.
arXiv Detail & Related papers (2025-06-27T12:45:44Z) - The Singapore Consensus on Global AI Safety Research Priorities [129.2088011234438]
The "2025 Singapore Conference on AI (SCAI): International Scientific Exchange on AI Safety" aimed to support research in this space. The report builds on the International AI Safety Report chaired by Yoshua Bengio and backed by 33 governments. It organises AI safety research domains into three types: challenges with creating trustworthy AI systems (Development), challenges with evaluating their risks (Assessment), and challenges with monitoring and intervening after deployment (Control).
arXiv Detail & Related papers (2025-06-25T17:59:50Z) - Comparative Analysis of AI-Driven Security Approaches in DevSecOps: Challenges, Solutions, and Future Directions [0.0]
This study conducts a systematic literature review to analyze and compare AI-driven security solutions in DevSecOps. The findings reveal gaps in empirical validation, scalability, and integration of AI in security automation. The study proposes future directions for optimizing AI-based security frameworks in DevSecOps.
arXiv Detail & Related papers (2025-04-27T08:18:11Z) - AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement [73.0700818105842]
We introduce AISafetyLab, a unified framework and toolkit that integrates representative attack, defense, and evaluation methodologies for AI safety. AISafetyLab features an intuitive interface that enables developers to seamlessly apply various techniques. We conduct empirical studies on Vicuna, analyzing different attack and defense strategies to provide valuable insights into their comparative effectiveness.
arXiv Detail & Related papers (2025-02-24T02:11:52Z) - Open Problems in Machine Unlearning for AI Safety [61.43515658834902]
Machine unlearning -- the ability to selectively forget or suppress specific types of knowledge -- has shown promise for privacy and data removal tasks. In this paper, we identify key limitations that prevent unlearning from serving as a comprehensive solution for AI safety.
arXiv Detail & Related papers (2025-01-09T03:59:10Z) - Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context. We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z) - Testing autonomous vehicles and AI: perspectives and challenges from cybersecurity, transparency, robustness and fairness [53.91018508439669]
The study explores the complexities of integrating Artificial Intelligence into Autonomous Vehicles (AVs). It examines the challenges introduced by AI components and the impact on testing procedures.
The paper identifies significant challenges and suggests future directions for research and development of AI in AV technology.
arXiv Detail & Related papers (2024-02-21T08:29:42Z) - Proceedings of the Robust Artificial Intelligence System Assurance (RAISA) Workshop 2022 [0.0]
The RAISA workshop will focus on research, development and application of robust artificial intelligence (AI) and machine learning (ML) systems.
Rather than studying robustness with respect to particular ML algorithms, our approach will be to explore robustness assurance at the system architecture level.
arXiv Detail & Related papers (2022-02-10T01:15:50Z) - AAAI FSS-19: Human-Centered AI: Trustworthiness of AI Models and Data Proceedings [8.445274192818825]
It is crucial for predictive models to be uncertainty-aware and yield trustworthy predictions.
The focus of this symposium was on AI systems to improve data quality and technical robustness and safety.
Submissions from broadly defined areas also discussed approaches addressing requirements such as explainable models, human trust, and ethical aspects of AI.
arXiv Detail & Related papers (2020-01-15T15:30:29Z)