On the relationship between Benchmarking, Standards and Certification in Robotics and AI
- URL: http://arxiv.org/abs/2309.12139v1
- Date: Thu, 21 Sep 2023 14:59:36 GMT
- Title: On the relationship between Benchmarking, Standards and Certification in Robotics and AI
- Authors: Alan F.T. Winfield and Matthew Studley
- Abstract summary: Benchmarking, standards and certification are closely related processes.
Benchmarking, standards and certification are not only useful but vital to the broader practice of Responsible Innovation.
- Score: 1.1421942894219899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Benchmarking, standards and certification are closely related processes.
Standards can provide normative requirements that robotics and AI systems may
or may not conform to. Certification generally relies upon conformance with one
or more standards as the key determinant of granting a certificate to operate.
And benchmarks are sets of standardised tests against which robots and AI
systems can be measured. Benchmarks therefore can be thought of as informal
standards. In this paper we will develop these themes with examples from
benchmarking, standards and certification, and argue that these three linked
processes are not only useful but vital to the broader practice of Responsible
Innovation.
Related papers
- SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI [47.11178028457252]
We develop SecCodePLT, a unified and comprehensive evaluation platform for code GenAIs' risks.
For insecure code, we introduce a new methodology for data creation that combines experts with automatic generation.
For cyberattack helpfulness, we construct samples to prompt a model to generate actual attacks, along with dynamic metrics in our environment.
arXiv Detail & Related papers (2024-10-14T21:17:22Z)
- Ethical and Scalable Automation: A Governance and Compliance Framework for Business Applications [0.0]
This paper introduces a framework for ensuring that AI is ethical, controllable, viable, and desirable.
Different case studies validate this framework by integrating AI in both academic and practical environments.
arXiv Detail & Related papers (2024-09-25T12:39:28Z)
- An Open Knowledge Graph-Based Approach for Mapping Concepts and Requirements between the EU AI Act and International Standards [1.9142148274342772]
The EU's AI Act will shift the focus of such organizations toward conformance with the technical requirements for regulatory compliance.
This paper offers a simple and repeatable mechanism for mapping the terms and requirements relevant to normative statements in regulations and standards.
arXiv Detail & Related papers (2024-08-21T18:21:09Z)
- Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design [63.24275274981911]
Compound AI Systems consisting of many language model inference calls are increasingly employed.
In this work, we construct systems, which we call Networks of Networks (NoNs), organized around the distinction between generating a proposed answer and verifying its correctness.
We introduce a verifier-based judge NoN with K generators, an instantiation of "best-of-K" or "judge-based" compound AI systems.
arXiv Detail & Related papers (2024-07-23T20:40:37Z)
- Benchmarks as Microscopes: A Call for Model Metrology [76.64402390208576]
Modern language models (LMs) pose a new challenge in capability assessment.
To be confident in our metrics, we need a new discipline of model metrology.
arXiv Detail & Related papers (2024-07-22T17:52:12Z)
- ECBD: Evidence-Centered Benchmark Design for NLP [95.50252564938417]
We propose Evidence-Centered Benchmark Design (ECBD), a framework which formalizes the benchmark design process into five modules.
Each module requires benchmark designers to describe, justify, and support benchmark design choices.
Our analysis reveals common trends in benchmark design and documentation that could threaten the validity of benchmarks' measurements.
arXiv Detail & Related papers (2024-06-13T00:59:55Z)
- Towards Standards-Compliant Assistive Technology Product Specifications via LLMs [7.30389619012625]
We introduce CompliAT, a pioneering framework designed to streamline the compliance process of AT product specifications.
CompliAT addresses three critical tasks: checking terminology consistency, classifying products according to standards, and tracing key product specifications to standard requirements.
We propose a novel approach for product classification, leveraging a retrieval-augmented generation model to accurately categorize AT products aligning to international standards.
arXiv Detail & Related papers (2024-04-04T00:10:39Z)
- No Trust without regulation! [0.0]
The explosion in performance of Machine Learning (ML) and the potential of its applications are encouraging us to consider its use in industrial systems.
Yet the issue of safety, and its corollary of regulation and standards, is still too often left to one side.
The European Commission has laid the foundations for moving forward and building solid approaches to the integration of AI-based applications that are safe, trustworthy and respect European ethical values.
arXiv Detail & Related papers (2023-09-27T09:08:41Z)
- A General Framework for Verification and Control of Dynamical Models via Certificate Synthesis [54.959571890098786]
We provide a framework to encode system specifications and define corresponding certificates.
We present an automated approach to formally synthesise controllers and certificates.
Our approach contributes to the broad field of safe learning for control, exploiting the flexibility of neural networks.
arXiv Detail & Related papers (2023-09-12T09:37:26Z)
- Towards a multi-stakeholder value-based assessment framework for algorithmic systems [76.79703106646967]
We develop a value-based assessment framework that visualizes closeness and tensions between values.
We give guidelines on how to operationalize them, while opening up the evaluation and deliberation process to a wide range of stakeholders.
arXiv Detail & Related papers (2022-05-09T19:28:32Z)
- A Norm Emergence Framework for Normative MAS -- Position Paper [0.90238471756546]
We propose a framework for the emergence of norms within a normative multiagent system.
We make the case that, similarly, a norm has emerged in a normative MAS when a percentage of agents adopt the norm.
We put forward a framework for the emergence of norms within a normative MAS, while special-purpose synthesizer agents formulate new norms or revisions in response to these requests.
arXiv Detail & Related papers (2020-04-06T11:42:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.