Rethinking Certification for Trustworthy Machine Learning-Based
  Applications
- URL: http://arxiv.org/abs/2305.16822v4
- Date: Sun, 22 Oct 2023 19:31:17 GMT
- Title: Rethinking Certification for Trustworthy Machine Learning-Based
  Applications
- Authors: Marco Anisetti and Claudio A. Ardagna and Nicola Bena and Ernesto
  Damiani
- Abstract summary: Machine Learning (ML) is increasingly used to implement advanced applications with non-deterministic behavior.
Existing certification schemes are not immediately applicable to non-deterministic applications built on ML models.
This article analyzes the challenges and deficiencies of current certification schemes, discusses open research issues, and proposes a first certification scheme for ML-based applications.
- Score: 3.886429361348165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Machine Learning (ML) is increasingly used to implement advanced applications
with non-deterministic behavior, which operate on the cloud-edge continuum. The
pervasive adoption of ML is urgently calling for assurance solutions assessing
applications' non-functional properties (e.g., fairness, robustness, privacy)
with the aim to improve their trustworthiness. Certification has been clearly
identified by policymakers, regulators, and industrial stakeholders as the
preferred assurance technique to address this pressing need. Unfortunately,
existing certification schemes are not immediately applicable to
non-deterministic applications built on ML models. This article analyzes the
challenges and deficiencies of current certification schemes, discusses open
research issues, and proposes a first certification scheme for ML-based
applications.

Related papers
        - Position: Certified Robustness Does Not (Yet) Imply Model Security [29.595213559303996]
 Certified robustness is promoted as a solution to adversarial examples in Artificial Intelligence systems.
We identify critical gaps in current research, including the paradox of detection without distinction.
We propose steps to address these fundamental challenges and advance the field toward practical applicability.
 arXiv  Detail & Related papers  (2025-06-16T01:18:33Z)
- LLM Agents Should Employ Security Principles [60.03651084139836]
 This paper argues that the well-established design principles in information security should be employed when deploying Large Language Model (LLM) agents at scale.
We introduce AgentSandbox, a conceptual framework embedding these security principles to provide safeguards throughout an agent's life-cycle.
 arXiv  Detail & Related papers  (2025-05-29T21:39:08Z)
- Engineering Trustworthy Machine-Learning Operations with Zero-Knowledge Proofs [1.7723990552388873]
 Zero-Knowledge Proofs (ZKPs) offer a cryptographic solution that enables provers to demonstrate, through verified computations, adherence to set requirements without revealing sensitive model details or data.
We identify five key properties (non-interactivity, transparent setup, standard representations, succinctness, and post-quantum security) critical for their application in AI validation and verification pipelines.
 arXiv  Detail & Related papers  (2025-05-26T15:39:11Z)
- Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling [48.15636223774418]
 Large language models (LLMs) frequently hallucinate due to misaligned self-awareness.
Existing approaches mitigate hallucinations via uncertainty estimation or query rejection.
We propose the Explicit Knowledge Boundary Modeling framework to integrate fast and slow reasoning systems.
 arXiv  Detail & Related papers  (2025-03-04T03:16:02Z)
- Approach Towards Semi-Automated Certification for Low Criticality ML-Enabled Airborne Applications [0.0]
 This paper proposes a semi-automated certification approach, specifically for low-criticality ML systems.
Key aspects include structured classification to guide certification rigor on system attributes, an Assurance Profile that consolidates evaluation outcomes into a confidence measure for the ML component, and methodologies for integrating human oversight into certification activities.
 arXiv  Detail & Related papers  (2025-01-28T15:49:51Z)
- Powering LLM Regulation through Data: Bridging the Gap from Compute Thresholds to Customer Experiences [0.0]
 This paper argues that current regulatory approaches, which focus on compute-level thresholds and generalized model evaluations, are insufficient to ensure the safety and effectiveness of specific LLM-based user experiences.
We propose a shift towards a certification process centered on actual user-facing experiences and the curation of high-quality datasets for evaluation.
 arXiv  Detail & Related papers  (2025-01-12T16:20:40Z)
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
 We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
 arXiv  Detail & Related papers  (2024-11-02T13:24:30Z)
- SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models [75.67623347512368]
 We propose SafeBench, a comprehensive framework designed for conducting safety evaluations of MLLMs.
Our framework consists of a comprehensive harmful query dataset and an automated evaluation protocol.
Based on our framework, we conducted large-scale experiments on 15 widely-used open-source MLLMs and 6 commercial MLLMs.
 arXiv  Detail & Related papers  (2024-10-24T17:14:40Z)
- Simulation-based Safety Assurance for an AVP System incorporating
  Learning-Enabled Components [0.6526824510982802]
 Testing, verification, and validation of AD/ADAS safety-critical applications remain one of the main challenges.
We explain the simulation-based development platform that is designed to verify and validate safety-critical learning-enabled systems.
 arXiv  Detail & Related papers  (2023-09-28T09:00:31Z)
- MLGuard: Defend Your Machine Learning Model! [3.4069804433026314]
 We propose MLGuard, a new approach to specify contracts for Machine Learning applications.
Our work is intended to provide the overarching framework required for building ML applications and monitoring their safety.
 arXiv  Detail & Related papers  (2023-09-04T06:08:11Z)
- Vulnerability of Machine Learning Approaches Applied in IoT-based Smart Grid: A Review [51.31851488650698]
 Machine learning (ML) is increasingly used in the internet-of-things (IoT)-based smart grid.
Adversarial distortion injected into the power signal will greatly affect the system's normal control and operation.
It is imperative to conduct vulnerability assessment for MLsgAPPs applied in the context of safety-critical power systems.
 arXiv  Detail & Related papers  (2023-08-30T03:29:26Z)
- AdvCat: Domain-Agnostic Robustness Assessment for Cybersecurity-Critical
  Applications with Categorical Inputs [29.907921481157974]
 Robustness against adversarial attacks is one of the key trust concerns for Machine Learning deployment.
We propose a provably optimal yet highly efficient adversarial robustness assessment protocol for a wide band of ML-driven cybersecurity-critical applications.
We demonstrate the use of the domain-agnostic robustness assessment method with substantial experimental study on fake news detection and intrusion detection problems.
 arXiv  Detail & Related papers  (2022-12-13T18:12:02Z)
- Toward Certification of Machine-Learning Systems for Low Criticality
  Airborne Applications [0.0]
 Possible airborne applications of machine learning (ML) include safety-critical functions.
Current certification standards for the aviation industry were developed prior to the ML renaissance.
There are some fundamental incompatibilities between traditional design assurance approaches and certain aspects of ML-based systems.
 arXiv  Detail & Related papers  (2022-09-28T10:13:28Z)
- Joint Differentiable Optimization and Verification for Certified
  Reinforcement Learning [91.93635157885055]
 In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties.
We propose a framework that jointly conducts reinforcement learning and formal verification.
 arXiv  Detail & Related papers  (2022-01-28T16:53:56Z)
- Trusted Artificial Intelligence: Towards Certification of Machine
  Learning Applications [5.7576910363986]
 The TÜV AUSTRIA Group in cooperation with the Institute for Machine Learning at the Johannes Kepler University Linz proposes a certification process.
The holistic approach attempts to evaluate and verify the aspects of secure software development, functional requirements, data quality, data protection, and ethics.
The audit catalog can be applied to low-risk applications within the scope of supervised learning.
 arXiv  Detail & Related papers  (2021-03-31T08:59:55Z)
- White Paper Machine Learning in Certified Systems [70.24215483154184]
 The DEEL Project set up the ML Certification 3 Workgroup (WG), set up by the Institut de Recherche Technologique Saint Exupéry de Toulouse (IRT).
 arXiv  Detail & Related papers  (2021-03-18T21:14:30Z)
- Explanations of Machine Learning predictions: a mandatory step for its
  application to Operational Processes [61.20223338508952]
 Credit Risk Modelling plays a paramount role.
Recent machine and deep learning techniques have been applied to the task.
We suggest using the LIME technique to tackle the explainability problem in this field.
 arXiv  Detail & Related papers  (2020-12-30T10:27:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.