Generative AI in Systems Engineering: A Framework for Risk Assessment of Large Language Models
- URL: http://arxiv.org/abs/2602.04358v1
- Date: Wed, 04 Feb 2026 09:30:11 GMT
- Title: Generative AI in Systems Engineering: A Framework for Risk Assessment of Large Language Models
- Authors: Stefan Otten, Philipp Reis, Philipp Rigoll, Joshua Ransiek, Tobias Schürmann, Jacob Langner, Eric Sax
- Abstract summary: The increasing use of Large Language Models (LLMs) offers significant opportunities across the engineering lifecycle. This paper introduces the LLM Risk Assessment Framework (LRF), a structured approach for evaluating the application of LLMs within Systems Engineering environments.
- Score: 0.8062120534124607
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing use of Large Language Models (LLMs) offers significant opportunities across the engineering lifecycle, including requirements engineering, software development, process optimization, and decision support. Despite this potential, organizations face substantial challenges in assessing the risks associated with LLM use, resulting in inconsistent integration, unknown failure modes, and limited scalability. This paper introduces the LLM Risk Assessment Framework (LRF), a structured approach for evaluating the application of LLMs within Systems Engineering (SE) environments. The framework classifies LLM-based applications along two fundamental dimensions: autonomy, ranging from supportive assistance to fully automated decision making, and impact, reflecting the potential severity of incorrect or misleading model outputs on engineering processes and system elements. By combining these dimensions, the LRF enables consistent determination of corresponding risk levels across the development lifecycle. The resulting classification supports organizations in identifying appropriate validation strategies, levels of human oversight, and required countermeasures to ensure safe and transparent deployment. The framework thereby helps align the rapid evolution of AI technologies with established engineering principles of reliability, traceability, and controlled process integration. Overall, the LRF provides a basis for risk-aware adoption of LLMs in complex engineering environments and represents a first step toward standardized AI assurance practices in systems engineering.
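The abstract's two-dimensional classification (autonomy combined with impact yields a risk level) can be sketched as a simple lookup table. The level names and the matrix entries below are illustrative assumptions, not values taken from the paper:

```python
# Hypothetical sketch of the autonomy x impact risk classification described
# in the abstract. Level names and matrix entries are illustrative
# assumptions, not values from the LRF paper.
from enum import IntEnum


class Autonomy(IntEnum):
    """Degree of LLM autonomy, from supportive assistance to full automation."""
    ASSISTIVE = 0   # human drives, LLM suggests
    SUPERVISED = 1  # LLM acts, human reviews every output
    AUTOMATED = 2   # LLM decides without routine human review


class Impact(IntEnum):
    """Potential severity of an incorrect or misleading model output."""
    NEGLIGIBLE = 0
    MODERATE = 1
    SEVERE = 2


# Risk grows with both dimensions; this exact mapping is an assumption.
RISK_MATRIX = {
    (Autonomy.ASSISTIVE, Impact.NEGLIGIBLE): "low",
    (Autonomy.ASSISTIVE, Impact.MODERATE): "low",
    (Autonomy.ASSISTIVE, Impact.SEVERE): "medium",
    (Autonomy.SUPERVISED, Impact.NEGLIGIBLE): "low",
    (Autonomy.SUPERVISED, Impact.MODERATE): "medium",
    (Autonomy.SUPERVISED, Impact.SEVERE): "high",
    (Autonomy.AUTOMATED, Impact.NEGLIGIBLE): "medium",
    (Autonomy.AUTOMATED, Impact.MODERATE): "high",
    (Autonomy.AUTOMATED, Impact.SEVERE): "critical",
}


def classify(autonomy: Autonomy, impact: Impact) -> str:
    """Look up the risk level for an LLM application."""
    return RISK_MATRIX[(autonomy, impact)]


print(classify(Autonomy.SUPERVISED, Impact.SEVERE))  # high
```

Higher risk levels would then map to stricter validation strategies and more human oversight, in line with the framework's intent.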
Related papers
- Failure Modes in LLM Systems: A System-Level Taxonomy for Reliable AI Applications [0.0]
Large language models (LLMs) are being rapidly integrated into decision-support tools, automation, and AI-enabled software systems. This paper presents a system-level taxonomy of fifteen hidden failure modes that arise in real-world LLM applications.
arXiv Detail & Related papers (2025-11-25T05:19:23Z)
- A Comprehensive Survey on Benchmarks and Solutions in Software Engineering of LLM-Empowered Agentic System [56.40989626804489]
This survey provides the first holistic analysis of Large Language Model-powered software engineering. We review over 150 recent papers and propose a taxonomy along two key dimensions: (1) Solutions, categorized into prompt-based, fine-tuning-based, and agent-based paradigms, and (2) Benchmarks, including tasks such as code generation, translation, and repair.
arXiv Detail & Related papers (2025-10-10T06:56:50Z)
- From nuclear safety to LLM security: Applying non-probabilistic risk management strategies to build safe and secure LLM-powered systems [49.1574468325115]
Large language models (LLMs) offer unprecedented and growing capabilities, but also introduce complex safety and security challenges. Previous research found that risk management in various fields of engineering, such as nuclear or civil engineering, is often handled by generic (i.e., field-agnostic) strategies. Here we show how emerging risks in LLM-powered systems could be addressed with more than 100 of these non-probabilistic risk management strategies.
arXiv Detail & Related papers (2025-05-20T16:07:41Z)
- ASIL-Decomposition Based Resource Allocation Optimization for Automotive E/E Architectures [0.4143603294943439]
We present an approach to automatically map software components to available hardware resources. Compared to existing frameworks, our method provides a wider range of safety analyses in compliance with the ISO 26262 standard. We formulate a multi-objective optimization problem to minimize both the development cost and the maximum execution times of critical function chains.
arXiv Detail & Related papers (2025-05-10T15:48:29Z)
- An LLM-enabled Multi-Agent Autonomous Mechatronics Design Framework [49.633199780510864]
This work proposes a multi-agent autonomous mechatronics design framework, integrating expertise across mechanical design, optimization, electronics, and software engineering. Operating primarily through a language-driven workflow, the framework incorporates structured human feedback to ensure robust performance under real-world constraints. A fully functional autonomous vessel was developed with optimized propulsion, cost-effective electronics, and advanced control.
arXiv Detail & Related papers (2025-04-20T16:57:45Z)
- Safe LLM-Controlled Robots with Formal Guarantees via Reachability Analysis [0.6749750044497732]
This paper introduces a safety assurance framework for Large Language Model (LLM)-controlled robots based on data-driven reachability analysis. Our approach provides rigorous safety guarantees against unsafe behaviors without relying on explicit analytical models.
arXiv Detail & Related papers (2025-03-05T21:23:15Z)
- EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models serving as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
- Concept-Guided LLM Agents for Human-AI Safety Codesign [6.603483691167379]
Generative AI is increasingly important in software engineering, including safety engineering, where it is used to help ensure that software does not cause harm to people.
It is crucial to develop more advanced and sophisticated approaches that can effectively address the complexities and safety concerns of software systems.
We present an efficient, hybrid strategy to leverage Large Language Models for safety analysis and Human-AI codesign.
arXiv Detail & Related papers (2024-04-03T11:37:01Z)
- Mapping LLM Security Landscapes: A Comprehensive Stakeholder Risk Assessment Proposal [0.0]
We propose a risk assessment process using tools like the risk rating methodology which is used for traditional systems.
We conduct scenario analysis to identify potential threat agents and map the dependent system components against vulnerability factors.
We also map threats against three key stakeholder groups.
arXiv Detail & Related papers (2024-03-20T05:17:22Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Technology Readiness Levels for AI & ML [79.22051549519989]
Development of machine learning systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end.
Engineering systems follow well-defined processes and testing standards to streamline development for high-quality, reliable results.
We propose a proven systems engineering approach for machine learning development and deployment.
arXiv Detail & Related papers (2020-06-21T17:14:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.