Related papers: World Models: The Safety Perspective

URL: http://arxiv.org/abs/2411.07690v1
Date: Tue, 12 Nov 2024 10:15:11 GMT
Title: World Models: The Safety Perspective
Authors: Zifan Zeng, Chongzhe Zhang, Feng Liu, Joseph Sifakis, Qunli Zhang, Shiming Liu, Peng Wang
Abstract summary: The concept of World Models (WM) has recently attracted a great deal of attention in the AI research community. We provide an in-depth analysis of state-of-the-art WMs and their impact in order to call on the research community to collaborate on improving the safety and trustworthiness of WM.
Score: 6.520366712367809
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the proliferation of the Large Language Model (LLM), the concept of World Models (WM) has recently attracted a great deal of attention in the AI research community, especially in the context of AI agents. It is arguably evolving into an essential foundation for building AI agent systems. A WM is intended to help the agent predict the future evolution of environmental states or help the agent fill in missing information so that it can plan its actions and behave safely. The safety property of WM plays a key role in their effective use in critical applications. In this work, we review and analyze the impacts of the current state-of-the-art in WM technology from the point of view of trustworthiness and safety based on a comprehensive survey and the fields of application envisaged. We provide an in-depth analysis of state-of-the-art WMs and derive technical research challenges and their impact in order to call on the research community to collaborate on improving the safety and trustworthiness of WM.

Related papers

Enterprise-Grade Security for the Model Context Protocol (MCP): Frameworks and Mitigation Strategies [0.0]
The Model Context Protocol (MCP) provides a standardized framework for artificial intelligence (AI) systems to interact with external data sources and tools in real-time. This paper builds upon foundational research into MCP architecture and preliminary security assessments to deliver enterprise-grade mitigation frameworks.
arXiv Detail & Related papers (2025-04-11T15:25:58Z)
Safety at Scale: A Comprehensive Survey of Large Model Safety [298.05093528230753]
We present a comprehensive taxonomy of safety threats to large models, including adversarial attacks, data poisoning, backdoor attacks, jailbreak and prompt injection attacks, energy-latency attacks, data and model extraction attacks, and emerging agent-specific threats. We identify and discuss the open challenges in large model safety, emphasizing the need for comprehensive safety evaluations, scalable and effective defense mechanisms, and sustainable data practices.
arXiv Detail & Related papers (2025-02-02T05:14:22Z)
On Large Language Models in Mission-Critical IT Governance: Are We Ready Yet? [7.098487130130114]
Security of critical infrastructure has been a pressing concern since the advent of computers. Recent events reveal the increasing difficulty of meeting these challenges. We aim to explore practitioners' views on integrating Generative AI into the governance of IT mission-critical systems (MCSs).
arXiv Detail & Related papers (2024-12-16T12:21:05Z)
Large Model Agents: State-of-the-Art, Cooperation Paradigms, Security and Privacy, and Future Trends [25.029148345440902]
Large Model (LM) agents, powered by large foundation models such as GPT-4 and DALL-E 2, represent a significant step towards achieving Artificial General Intelligence (AGI). This paper provides a comprehensive survey of the state-of-the-art in LM agents, focusing on the architecture, cooperation paradigms, security, privacy, and future prospects.
arXiv Detail & Related papers (2024-09-22T14:09:49Z)
Recent Advances in Attack and Defense Approaches of Large Language Models [27.271665614205034]
Large Language Models (LLMs) have revolutionized artificial intelligence and machine learning through their advanced text processing and generating capabilities. Their widespread deployment has raised significant safety and reliability concerns. This paper reviews current research on LLM vulnerabilities and threats, and evaluates the effectiveness of contemporary defense mechanisms.
arXiv Detail & Related papers (2024-09-05T06:31:37Z)
EAIRiskBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [47.69642609574771]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EAIRiskBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context. We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z)
Security of AI Agents [5.468745160706382]
The study and development of AI agents have been boosted by large language models. In this paper, we identify and describe the security vulnerabilities of AI agents in detail from a system security perspective. We introduce defense mechanisms corresponding to each vulnerability with meticulous design and experiments to evaluate their viability.
arXiv Detail & Related papers (2024-06-12T23:16:45Z)
Position Paper: Agent AI Towards a Holistic Intelligence [53.35971598180146]
We emphasize developing Agent AI -- an embodied system that integrates large foundation models into agent actions. In this paper, we propose a novel large action model to achieve embodied intelligent behavior, the Agent Foundation Model.
arXiv Detail & Related papers (2024-02-28T16:09:56Z)
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science [65.77763092833348]
Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines. While their capabilities are promising, these agents also introduce novel vulnerabilities that demand careful consideration for safety. This paper conducts a thorough examination of vulnerabilities in LLM-based agents within scientific domains, shedding light on potential risks associated with their misuse and emphasizing the need for safety measures.
arXiv Detail & Related papers (2024-02-06T18:54:07Z)
The Last Decade in Review: Tracing the Evolution of Safety Assurance Cases through a Comprehensive Bibliometric Analysis [7.431812376079826]
Safety assurance is of paramount importance across various domains, including automotive, aerospace, and nuclear energy. The use of safety assurance cases allows for verifying the correctness of the created system's capabilities, preventing system failure.
arXiv Detail & Related papers (2023-11-13T17:34:23Z)
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements [76.80453043969209]
This survey presents a framework for safety research pertaining to large models. We begin by introducing safety issues of wide concern, then delve into safety evaluation methods for large models. We explore the strategies for enhancing large model safety from training to deployment.
arXiv Detail & Related papers (2023-02-18T09:32:55Z)
Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper [50.25428141435537]
Artificial Intelligence for IT Operations (AIOps) is an emerging interdisciplinary field arising in the intersection between machine learning, big data, streaming analytics, and the management of IT operations. The main aim of the AIOps workshop is to bring together researchers from both academia and industry to present their experiences, results, and work in progress in this field.
arXiv Detail & Related papers (2021-01-15T10:43:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.