Measuring the Complexity of Domains Used to Evaluate AI Systems
- URL: http://arxiv.org/abs/2010.01985v1
- Date: Fri, 18 Sep 2020 21:53:07 GMT
- Title: Measuring the Complexity of Domains Used to Evaluate AI Systems
- Authors: Christopher Pereyda, Lawrence Holder
- Abstract summary: We propose a theory for measuring the complexity between varied domains.
An application of this measure is then demonstrated to show its effectiveness as a tool in varied situations.
We propose the future use of such a complexity metric in computing an AI system's intelligence.
- Score: 0.48951183832371004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is currently a rapid increase in the number of challenge problems,
benchmarking datasets and algorithmic optimization tests for evaluating AI
systems. However, no objective measure currently exists for comparing the
complexity of these newly created domains. This lack of cross-domain
examination is an obstacle to effective research on more general AI systems.
We propose a theory for measuring the complexity between varied domains. This
theory is then evaluated using approximations by a population of
neural-network-based AI systems. The approximations are compared to other
well-known standards and shown to meet intuitions of complexity. An
application of this measure is then demonstrated to show its effectiveness as a
tool in varied situations. The experimental results show this measure has
promise as an effective tool for aiding in the evaluation of AI systems. We
propose the future use of such a complexity metric in computing an AI system's
intelligence.
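A minimal sketch of this evaluation strategy, assuming a population-error proxy for complexity: train a population of small neural networks on a domain and treat their mean test error as the domain's complexity score. The function name, model population, and synthetic domains below are illustrative assumptions, not the paper's exact formulation.

```python
# Approximate a domain's complexity by the error distribution of a population
# of neural-network learners; harder domains should yield higher mean error.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def population_complexity(X, y, sizes=(4, 8, 16, 32), seeds=range(3)):
    """Proxy: mean test error over a population of small MLPs."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    errors = []
    for h in sizes:
        for s in seeds:
            net = MLPClassifier(hidden_layer_sizes=(h,), max_iter=500, random_state=s)
            net.fit(X_tr, y_tr)
            errors.append(1.0 - net.score(X_te, y_te))
    return float(np.mean(errors))

# Two synthetic domains: one nearly separable, one noisy and entangled.
easy = make_classification(n_samples=600, n_informative=10, class_sep=2.0, random_state=0)
hard = make_classification(n_samples=600, n_informative=10, class_sep=0.5, flip_y=0.2, random_state=0)
print("easy domain:", population_complexity(*easy))
print("hard domain:", population_complexity(*hard))
```

Under these assumptions, the noisier, less separable domain receives the higher complexity score, matching the intuitions of complexity the paper tests against.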
Related papers
- Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design [63.24275274981911]
Compound AI Systems consisting of many language model inference calls are increasingly employed.
In this work, we construct systems, which we call Networks of Networks (NoNs), organized around the distinction between generating a proposed answer and verifying its correctness.
We introduce a verifier-based judge NoN with K generators, an instantiation of "best-of-K" or "judge-based" compound AI systems.
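A minimal sketch of such a verifier-based judge NoN, with stubbed functions standing in for real language-model calls; the toy arithmetic domain and success probabilities are assumptions for illustration only.

```python
# Best-of-K / judge-based compound system: sample K candidate answers from a
# generator, then let a verifier (judge) pick the best one.
import random

def generator(question, rng):
    """Stand-in for an LLM inference call: proposes an answer, sometimes wrong."""
    truth = eval(question)  # toy domain: arithmetic strings
    return truth if rng.random() < 0.6 else truth + rng.choice([-1, 1])

def verifier(question, answer):
    """Stand-in judge: scores a proposed answer (here, an exact check)."""
    return 1.0 if answer == eval(question) else 0.0

def judge_non(question, k=5, seed=0):
    rng = random.Random(seed)
    candidates = [generator(question, rng) for _ in range(k)]
    return max(candidates, key=lambda a: verifier(question, a))

print(judge_non("17 * 23"))  # with K samples, the judge recovers a verified answer
```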
arXiv Detail & Related papers (2024-07-23T20:40:37Z)
- Adaptation of XAI to Auto-tuning for Numerical Libraries [0.0]
Explainable AI (XAI) technology is gaining prominence, aiming to streamline AI model development and alleviate the burden of explaining AI outputs to users.
This research focuses on XAI for AI models when integrated into two different processes for practical numerical computations.
arXiv Detail & Related papers (2024-05-12T09:00:56Z)
- Random Aggregate Beamforming for Over-the-Air Federated Learning in Large-Scale Networks [66.18765335695414]
We consider a joint device selection and aggregate beamforming design with the objectives of minimizing the aggregate error and maximizing the number of selected devices.
To tackle the problems in a cost-effective manner, we propose a random aggregate beamforming-based scheme.
We additionally analyze the aggregate error obtained and the number of selected devices as the number of devices grows large.
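A minimal numpy sketch of the random-sampling idea: draw candidate aggregate beamforming vectors at random and keep the one admitting the most devices. The channel model, power budget, and selection threshold are illustrative assumptions, not the paper's exact system model.

```python
# Random aggregate beamforming: avoid an optimization solver by sampling
# candidate beamformers and keeping the best one found.
import numpy as np

rng = np.random.default_rng(0)
N, K = 8, 50                      # receive antennas, devices
H = (rng.normal(size=(N, K)) + 1j * rng.normal(size=(N, K))) / np.sqrt(2)
power, threshold = 1.0, 0.8       # per-device power budget, required effective gain

best_m, best_selected = None, []
for _ in range(100):              # randomly sampled candidate beamformers
    m = rng.normal(size=N) + 1j * rng.normal(size=N)
    m /= np.linalg.norm(m)
    gains = np.abs(m.conj() @ H) ** 2 * power
    selected = np.where(gains >= threshold)[0]   # devices meeting the gain requirement
    if len(selected) > len(best_selected):
        best_m, best_selected = m, selected

print(f"selected {len(best_selected)} of {K} devices")
```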
arXiv Detail & Related papers (2024-02-20T23:59:45Z)
- An Exploratory Study of AI System Risk Assessment from the Lens of Data Distribution and Uncertainty [4.99372598361924]
Deep learning (DL) has become a driving force and has been widely adopted in many domains and applications.
This paper initiates an early exploratory study of AI system risk assessment from both the data distribution and uncertainty angles.
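A minimal sketch combining the two angles named above, assuming predictive entropy as the uncertainty score and Mahalanobis distance to the training features as the data-distribution score; the additive combination is an illustrative choice, not the paper's method.

```python
# Risk assessment from two angles: prediction uncertainty plus distance of the
# input from the training data distribution.
import numpy as np

def predictive_entropy(probs):
    probs = np.clip(probs, 1e-12, 1.0)
    return -np.sum(probs * np.log(probs), axis=-1)

def mahalanobis(x, mean, cov_inv):
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 4))                 # in-distribution training features
mean, cov_inv = train.mean(0), np.linalg.inv(np.cov(train.T))

x_in, x_out = rng.normal(size=4), rng.normal(size=4) + 5.0
probs = np.array([0.3, 0.4, 0.3])                  # an unconfident prediction

risk_in = predictive_entropy(probs) + mahalanobis(x_in, mean, cov_inv)
risk_out = predictive_entropy(probs) + mahalanobis(x_out, mean, cov_inv)
print(f"risk(in-dist)={risk_in:.2f}  risk(out-of-dist)={risk_out:.2f}")
```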
arXiv Detail & Related papers (2022-12-13T03:34:25Z)
- SAIH: A Scalable Evaluation Methodology for Understanding AI Performance Trend on HPC Systems [18.699431277588637]
We propose a scalable evaluation methodology (SAIH) for analyzing the AI performance trend of HPC systems.
As the data and model constantly scale, we can investigate the trend and range of AI performance on HPC systems.
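A minimal sketch of the scaling idea behind such a methodology: grow the workload step by step and record achieved performance at each scale. Matrix multiplication stands in here for a full AI workload on an HPC system.

```python
# Measure a performance trend by scaling the workload and recording throughput.
import time
import numpy as np

for n in (256, 512, 1024, 2048):          # increasing model/data scale
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    t0 = time.perf_counter()
    a @ b
    elapsed = time.perf_counter() - t0
    gflops = 2 * n**3 / elapsed / 1e9     # matmul performs ~2*n^3 floating-point ops
    print(f"n={n:4d}  {gflops:7.1f} GFLOP/s")
```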
arXiv Detail & Related papers (2022-12-07T02:42:29Z)
- Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-intelligent-device edge artificial intelligence (AI) system, which jointly exploits the AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
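A minimal sketch of a discriminant-gain-style metric, assuming two Gaussian classes with a shared covariance and measuring their separation as a Mahalanobis distance between centroids; the paper's exact definition may differ.

```python
# Discriminant gain as a tractable proxy for inference accuracy: better-separated
# class distributions imply higher expected accuracy.
import numpy as np

def discriminant_gain(mu1, mu2, cov):
    d = mu1 - mu2
    return float(d @ np.linalg.inv(cov) @ d)

cov = np.eye(2)
well_separated = discriminant_gain(np.array([0.0, 0.0]), np.array([3.0, 0.0]), cov)
overlapping = discriminant_gain(np.array([0.0, 0.0]), np.array([0.5, 0.0]), cov)
print(well_separated, overlapping)   # larger gain -> higher expected inference accuracy
```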
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
- Robustness testing of AI systems: A case study for traffic sign recognition [13.395753930904108]
This paper presents how the robustness of AI systems can be practically examined and which methods and metrics can be used to do so.
The robustness testing methodology is described and analysed for the example use case of traffic sign recognition in autonomous driving.
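A minimal sketch of such perturbation-based robustness testing, with Gaussian input noise standing in for the camera and weather corruptions a traffic-sign study would use, and a small digits classifier standing in for the sign-recognition model.

```python
# Robustness test: evaluate one trained model under increasing input corruption
# and report the resulting accuracy curve.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0).fit(X_tr, y_tr)

rng = np.random.default_rng(0)
for sigma in (0.0, 1.0, 2.0, 4.0):               # increasing corruption severity
    noisy = X_te + rng.normal(scale=sigma, size=X_te.shape)
    print(f"sigma={sigma}: accuracy={model.score(noisy, y_te):.3f}")
```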
arXiv Detail & Related papers (2021-08-13T10:29:09Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
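A minimal sketch of the selection idea: treat the choice of detection model (and hence HEC layer) as a contextual bandit. An epsilon-greedy tabular learner stands in below for the paper's reinforcement-learning policy network; contexts and rewards are toy assumptions.

```python
# Contextual-bandit model selection: pick which anomaly-detection model to run
# given the input context, trading accuracy against model cost.
import random

rng = random.Random(0)
n_models = 3                      # DNNs of increasing complexity, one per HEC layer
q = {}                            # (context, model) -> estimated reward

def reward(context, model):
    # Toy ground truth: hard contexts need the complex model; bigger models cost more.
    accuracy = 1.0 if model >= context else 0.3
    return accuracy - 0.1 * model

for _ in range(5000):
    context = rng.randrange(n_models)           # e.g. a bucketed "difficulty" feature
    if rng.random() < 0.1:                      # explore
        model = rng.randrange(n_models)
    else:                                       # exploit
        model = max(range(n_models), key=lambda m: q.get((context, m), 0.0))
    key = (context, model)
    q[key] = q.get(key, 0.0) + 0.1 * (reward(context, model) - q.get(key, 0.0))

# Learned policy: easy contexts map to cheap models, hard contexts to complex ones.
print({c: max(range(n_models), key=lambda m: q.get((c, m), 0.0)) for c in range(n_models)})
```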
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- High-dimensional separability for one- and few-shot learning [58.8599521537]
This work is driven by a practical question: the correction of Artificial Intelligence (AI) errors.
Special external devices, correctors, are developed; they should provide quick, non-iterative fixes without modifying the legacy AI system.
New multi-correctors of AI systems are presented and illustrated with examples of predicting errors and learning new classes of objects by a deep convolutional neural network.
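A minimal sketch of a corrector in this spirit, assuming a single linear (Fisher-type) discriminant fitted in the legacy system's feature space from a few known error samples; the data and threshold rule are illustrative.

```python
# One-shot-style corrector: in high dimension, a few error samples are often
# linearly separable from correct ones, so a single hyperplane can flag them
# without retraining the legacy AI system.
import numpy as np

rng = np.random.default_rng(0)
dim = 200                                     # high dimension favors linear separability
correct = rng.normal(size=(500, dim))
errors = rng.normal(size=(5, dim)) + 0.8      # few known error samples, slightly shifted

w = errors.mean(0) - correct.mean(0)          # Fisher direction with identity covariance
threshold = 0.5 * (errors.mean(0) + correct.mean(0)) @ w

def corrector_fires(x):
    """Non-iteratively flags an input for correction, leaving the legacy AI untouched."""
    return x @ w > threshold

flagged = sum(corrector_fires(x) for x in rng.normal(size=(100, dim)) + 0.8)
print(f"flagged {flagged}/100 new error-like inputs")
```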
arXiv Detail & Related papers (2021-06-28T14:58:14Z)
- The Threats of Artificial Intelligence Scale (TAI). Development, Measurement and Test Over Three Application Domains [0.0]
Several opinion polls frequently query the public fear of autonomous robots and artificial intelligence (FARAI).
We propose a fine-grained scale to measure threat perceptions of AI that accounts for four functional classes of AI systems and is applicable to various domains of AI applications.
The data support the dimensional structure of the proposed Threats of AI (TAI) scale as well as the internal consistency and factorial validity of the indicators.
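A minimal sketch of one of the checks named above, computing internal consistency via Cronbach's alpha over a synthetic item-response matrix; the scale structure below is an assumption for illustration.

```python
# Internal consistency of a multi-item scale via Cronbach's alpha.
import numpy as np

def cronbach_alpha(items):
    """items: (respondents, scale items) matrix of responses."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 1))                     # one underlying threat perception
responses = latent + 0.5 * rng.normal(size=(300, 6))   # six correlated items
print(f"alpha = {cronbach_alpha(responses):.2f}")      # > 0.7 is conventionally acceptable
```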
arXiv Detail & Related papers (2020-06-12T14:15:02Z)
- Bias in Multimodal AI: Testbed for Fair Automatic Recruitment [73.85525896663371]
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
We train automatic recruitment algorithms using a set of multimodal synthetic profiles consciously scored with gender and racial biases.
Our methodology and results show how to generate fairer AI-based tools in general, and in particular fairer automated recruitment systems.
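A minimal sketch of one fairness check such a testbed supports, measuring the demographic-parity gap of a screening model across a sensitive attribute; the synthetic profiles and injected bias below are illustrative assumptions.

```python
# Demographic-parity check: compare positive-outcome rates across groups for a
# shortlisting rule trained on biased scores.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, size=n)               # sensitive attribute (e.g., gender)
skill = rng.normal(size=n)
score = skill + 0.5 * group                      # consciously injected bias, as in the testbed
hired = score > np.quantile(score, 0.8)          # top-20% shortlisting rule

rates = [hired[group == g].mean() for g in (0, 1)]
print(f"positive rate g0={rates[0]:.2f}, g1={rates[1]:.2f}, parity gap={abs(rates[0]-rates[1]):.2f}")
```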
arXiv Detail & Related papers (2020-04-15T15:58:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.