An Early-Stage Workflow Proposal for the Generation of Safe and Dependable AI Classifiers
- URL: http://arxiv.org/abs/2410.01850v1
- Date: Tue, 1 Oct 2024 14:00:53 GMT
- Title: An Early-Stage Workflow Proposal for the Generation of Safe and Dependable AI Classifiers
- Authors: Hans Dermot Doran, Suzana Veljanovska
- Abstract summary: The generation and execution of qualifiable, safe, and dependable AI models necessitates the definition of a transparent, complete, yet adaptable and preferably lightweight workflow.
This early-stage work proposes such a workflow, basing it on an extended ONNX model description.
A use case provides one foundation of this body of work, which we expect to be extended by other, third-party use cases.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The generation and execution of qualifiable, safe, and dependable AI models necessitates the definition of a transparent, complete, yet adaptable and preferably lightweight workflow. Given the rapidly progressing domain of AI research and the relative immaturity of the safe-AI domain, the process stability upon which functional-safety developments rest must be married with some degree of adaptability. This early-stage work proposes such a workflow, basing it on an extended ONNX model description. A use case provides one foundation of this body of work, which we expect to be extended by other, third-party use cases.
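The abstract does not spell out what the extended ONNX model description contains. As an illustration only, the sketch below assumes the extension takes the form of custom key/value metadata attached to a standard ONNX model through its metadata_props field; the file names and metadata keys are hypothetical, not the authors' actual schema.

```python
# Minimal sketch: attaching safety/qualification metadata to an ONNX classifier.
# The metadata keys below are hypothetical illustrations, not the paper's
# actual extended ONNX model description.
import onnx

model = onnx.load("classifier.onnx")   # hypothetical model file
onnx.checker.check_model(model)        # verify structural validity first

# ONNX models carry arbitrary key/value metadata in metadata_props.
qualification_evidence = {
    "safety.dataset_hash": "sha256:<digest of the training set>",
    "safety.test_accuracy": "0.97",
    "safety.intended_use": "single-class go/no-go classification",
}
for key, value in qualification_evidence.items():
    entry = model.metadata_props.add()
    entry.key = key
    entry.value = value

onnx.save(model, "classifier_qualified.onnx")
```

A downstream runtime could then refuse to execute a model whose metadata is missing or whose recorded evidence does not match the deployment context.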
Related papers
- Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation [52.83870601473094]
Embodied agents exhibit immense potential across a multitude of domains.
Existing research predominantly concentrates on the security of general large language models.
This paper introduces a novel input moderation framework, meticulously designed to safeguard embodied agents.
arXiv Detail & Related papers (2025-04-22T08:34:35Z) - Workflow for Safe-AI [0.0]
Development and deployment of safe and dependable AI models is crucial in applications where functional safety is a key concern.
This work proposes a transparent, complete, yet flexible and lightweight workflow that highlights both reliability and qualifiability.
arXiv Detail & Related papers (2025-03-18T07:45:18Z) - SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Safe Reinforcement Learning [10.844235123282056]
We propose SafeVLA, a novel algorithm designed to integrate safety into vision-language-action models (VLAs).
SafeVLA balances safety and task performance by employing large-scale constrained learning within simulated environments.
We demonstrate that SafeVLA outperforms the current state-of-the-art method in both safety and task performance.
arXiv Detail & Related papers (2025-03-05T13:16:55Z) - Model Developmental Safety: A Safety-Centric Method and Applications in Vision-Language Models [75.8161094916476]
We study how to develop a pretrained vision-language model (i.e., the CLIP model) to acquire new capabilities or improve its existing image-classification capabilities.
Our experiments on improving vision perception capabilities on autonomous driving and scene recognition datasets demonstrate the efficacy of the proposed approach.
arXiv Detail & Related papers (2024-10-04T22:34:58Z) - What Makes and Breaks Safety Fine-tuning? A Mechanistic Study [64.9691741899956]
Safety fine-tuning helps align Large Language Models (LLMs) with human preferences for their safe deployment.
We design a synthetic data generation framework that captures salient aspects of an unsafe input.
Using this, we investigate three well-known safety fine-tuning methods.
arXiv Detail & Related papers (2024-07-14T16:12:57Z) - Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation [5.928213664340974]
This research addresses the safety issue in the control optimization problem of autonomous driving.
We propose a novel, model-based approach for policy optimization, utilizing a conditional Value-at-Risk-based Soft Actor-Critic.
Our method introduces a worst-case actor to guide safe exploration, ensuring rigorous adherence to safety requirements even in unpredictable scenarios.
arXiv Detail & Related papers (2024-07-08T18:32:40Z) - Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety inReinforcement Learning (RL)
We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z) - Hybrid Convolutional Neural Networks with Reliability Guarantee [0.0]
We propose redundant execution, a well-known technique, to ensure reliable execution of the AI model; a generic sketch of this pattern appears after the related-papers list.
This generic technique will extend the application scope of AI accelerators that do not feature well-documented safety or dependability properties.
arXiv Detail & Related papers (2024-05-08T15:39:38Z) - Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z) - Safe MDP Planning by Learning Temporal Patterns of Undesirable Trajectories and Averting Negative Side Effects [27.41101006357176]
In safe MDP planning, a cost function based on the current state and action is often used to specify safety aspects.
However, operating based on an incomplete model can often produce unintended negative side effects (NSEs).
arXiv Detail & Related papers (2023-04-06T14:03:24Z) - Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z) - Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach to solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violations in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - Safe Active Dynamics Learning and Control: A Sequential Exploration-Exploitation Framework [30.58186749790728]
We propose a theoretically-justified approach to maintaining safety in the presence of dynamics uncertainty.
Our framework guarantees the high-probability satisfaction of all constraints at all times jointly.
This theoretical analysis also motivates two regularizers of last-layer meta-learning models that improve online adaptation capabilities.
arXiv Detail & Related papers (2020-08-26T17:39:58Z)
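As noted in the Hybrid Convolutional Neural Networks entry above, redundant execution is a generic reliability pattern rather than anything specific to that paper. The sketch below shows one common form of it, running the same ONNX model twice through onnxruntime and accepting the output only when both executions agree; the function name and fault-handling policy are illustrative assumptions, not that paper's implementation.

```python
# Generic sketch of redundant (duplicated) execution for an AI classifier.
# This illustrates the general pattern only, not the implementation from
# "Hybrid Convolutional Neural Networks with Reliability Guarantee".
import numpy as np
import onnxruntime as ort

def redundant_predict(model_path: str, x: np.ndarray, runs: int = 2) -> np.ndarray:
    """Run the same model `runs` times and return the output only if all
    executions agree; otherwise raise, signalling a possible transient fault."""
    outputs = []
    for _ in range(runs):
        session = ort.InferenceSession(model_path)     # fresh session per run
        input_name = session.get_inputs()[0].name
        outputs.append(session.run(None, {input_name: x})[0])
    if all(np.array_equal(outputs[0], out) for out in outputs[1:]):
        return outputs[0]
    raise RuntimeError("Redundant executions disagree; treating result as unsafe")
```

In a functional-safety setting the disagreement branch would typically transition the system to a defined safe state rather than raise an exception, but the compare-and-accept structure is the same.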
This list is automatically generated from the titles and abstracts of the papers on this site.