Computer Vision Intelligence Test Modeling and Generation: A Case Study on Smart OCR
- URL: http://arxiv.org/abs/2410.03536v1
- Date: Sat, 14 Sep 2024 23:33:28 GMT
- Title: Computer Vision Intelligence Test Modeling and Generation: A Case Study on Smart OCR
- Authors: Jing Shu, Bing-Jiun Miu, Eugene Chang, Jerry Gao, Jun Liu,
- Abstract summary: We first present a comprehensive literature review of previous work, covering key facets of AI software testing processes.
We then introduce a 3D classification model to systematically evaluate the image-based text extraction AI function.
To evaluate the performance of our proposed AI software quality test, we propose four evaluation metrics to cover different aspects.
- Score: 3.0561992956541606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI-based systems possess distinctive characteristics and introduce challenges in quality evaluation at the same time. Consequently, ensuring and validating AI software quality is of critical importance. In this paper, we present an effective AI software functional testing model to address this challenge. Specifically, we first present a comprehensive literature review of previous work, covering key facets of AI software testing processes. We then introduce a 3D classification model to systematically evaluate the image-based text extraction AI function, as well as test coverage criteria and complexity. To evaluate the performance of our proposed AI software quality test, we propose four evaluation metrics to cover different aspects. Finally, based on the proposed framework and defined metrics, a mobile Optical Character Recognition (OCR) case study is presented to demonstrate the framework's effectiveness and capability in assessing AI function quality.
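The four metrics are not named in this abstract; as a rough illustration of how image-based text extraction output is commonly scored, the sketch below computes character error rate (CER) and word-level accuracy. The metric choice and function names are assumptions, not the paper's definitions.

```python
# Hypothetical sketch of two common OCR quality metrics; the paper's four
# metrics are not named in the abstract, so these are illustrative only.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by reference length (lower is better)."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)

def word_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of reference words reproduced exactly, position-aligned."""
    ref, hyp = reference.split(), hypothesis.split()
    return sum(r == h for r, h in zip(ref, hyp)) / max(len(ref), 1)

print(character_error_rate("smart OCR", "smart 0CR"))  # 0.111...
print(word_accuracy("smart OCR", "smart 0CR"))         # 0.5
```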
Related papers
- The Role of Artificial Intelligence and Machine Learning in Software Testing [0.14896196009851972]
Artificial Intelligence (AI) and Machine Learning (ML) have significantly impacted various industries.
Software testing, a crucial part of the software development lifecycle (SDLC), ensures the quality and reliability of software products.
This paper explores the role of AI and ML in software testing by reviewing existing literature, analyzing current tools and techniques, and presenting case studies.
arXiv Detail & Related papers (2024-09-04T13:25:13Z)
- How critically can an AI think? A framework for evaluating the quality of thinking of generative artificial intelligence [0.9671462473115854]
Generative AI tools, such as those built on large language models, have created opportunities for innovative assessment design practices.
This paper presents a framework that explores the capabilities of the ChatGPT-4 LLM application, treated here as the current industry benchmark.
The resulting critique provides specific and targeted indications of the questions' vulnerabilities in terms of critical thinking skills.
arXiv Detail & Related papers (2024-06-20T22:46:56Z)
- Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning [58.41087653543607]
We first establish a novel Image Quality Assessment (IQA) database for AIGIs, termed AIGCIQA2023+.
This paper presents a MINT-IQA model to evaluate and explain human preferences for AIGIs from Multi-perspectives with INstruction Tuning.
arXiv Detail & Related papers (2024-05-12T17:45:11Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address the mismatch between CLIP's generic pretraining and the IQA task using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
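As a sketch of how CLIP's image-text alignment can be turned into a quality score with plain antonym prompts (the generic idea this line of work builds on, not this paper's learned multi-modal prompts), using the Hugging Face transformers CLIP API; the model name and prompt wording are illustrative.

```python
# Sketch of antonym-prompt quality scoring with CLIP; this illustrates the
# general prompt-based IQA idea, not this paper's learned prompts.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def quality_score(image: Image.Image) -> float:
    """Softmax over similarities to a good/bad prompt pair -> [0, 1]."""
    inputs = processor(text=["a good photo.", "a bad photo."],
                       images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape (1, 2)
    return logits.softmax(dim=-1)[0, 0].item()     # probability of "good"

score = quality_score(Image.open("sample.jpg").convert("RGB"))
print(f"predicted quality: {score:.3f}")
```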
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- PCQA: A Strong Baseline for AIGC Quality Assessment Based on Prompt Condition [4.125007507808684]
This study proposes an effective AIGC quality assessment (QA) framework.
First, we propose a hybrid prompt encoding method based on a dual-source CLIP (Contrastive Language-Image Pre-Training) text encoder.
Second, we propose an ensemble-based feature mixer module to effectively blend the adapted prompt and vision features.
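The summary does not detail the mixer's architecture; below is a minimal sketch of the general pattern, blending a prompt embedding with vision features before regressing a score, with all dimensions and layer choices assumed.

```python
import torch
import torch.nn as nn

class FeatureMixer(nn.Module):
    """Illustrative mixer: concatenate prompt and vision features, then
    regress a scalar quality score. Dimensions and depth are assumptions."""
    def __init__(self, text_dim: int = 512, vision_dim: int = 768):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + vision_dim, 512),
            nn.GELU(),
            nn.Linear(512, 1),  # scalar quality prediction
        )

    def forward(self, prompt_feat: torch.Tensor,
                vision_feat: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([prompt_feat, vision_feat], dim=-1))

mixer = FeatureMixer()
score = mixer(torch.randn(4, 512), torch.randn(4, 768))  # batch of 4
print(score.shape)  # torch.Size([4, 1])
```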
arXiv Detail & Related papers (2024-04-20T07:05:45Z)
- AIGCOIQA2024: Perceptual Quality Assessment of AI Generated Omnidirectional Images [70.42666704072964]
We establish a large-scale AI-generated omnidirectional image IQA database named AIGCOIQA2024.
A subjective IQA experiment is conducted to assess human visual preferences from three perspectives.
We conduct a benchmark experiment to evaluate the performance of state-of-the-art IQA models on our database.
arXiv Detail & Related papers (2024-04-01T10:08:23Z)
- From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
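A minimal sketch of the psychometric machinery such adaptive testing typically relies on: a 2PL item response model with maximum-information item selection. The paper's actual estimation procedure may differ; parameters below are illustrative.

```python
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL IRT: probability a subject of ability theta answers correctly
    an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of an item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta: float, items: list[tuple[float, float]]) -> int:
    """Pick the unadministered item that is most informative at theta."""
    return max(range(len(items)),
               key=lambda i: item_information(theta, *items[i]))

bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 2.0)]  # (a, b) pairs
print(next_item(0.4, bank))  # 2: the most informative item at theta = 0.4
```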
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
- Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective [93.56647950778357]
Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
arXiv Detail & Related papers (2023-03-27T07:58:09Z)
- Do-AIQ: A Design-of-Experiment Approach to Quality Evaluation of AI Mislabel Detection Algorithm [0.0]
The quality of Artificial Intelligence (AI) algorithms is of significant importance for confidently adopting algorithms in various applications such as cybersecurity, healthcare, and autonomous driving.
This work presents a principled framework, named Do-AIQ, that uses a design-of-experiment approach to systematically evaluate the quality of AI algorithms.
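Do-AIQ's concrete factors are not listed in this summary; the sketch below shows the generic design-of-experiments pattern of running an algorithm over a full factorial grid of controlled factors and recording a quality response. The factors and trial function are placeholders.

```python
from itertools import product

# Hypothetical factors influencing mislabel-detection quality; the actual
# Do-AIQ factors are not given in this summary.
factors = {
    "mislabel_rate": [0.05, 0.10, 0.20],
    "train_size": [1_000, 10_000],
    "model": ["logreg", "mlp"],
}

def run_trial(mislabel_rate, train_size, model):
    """Placeholder: train/evaluate the detection algorithm and return a
    quality response (e.g., detection F1). Stubbed for illustration."""
    return 0.9 - mislabel_rate  # stand-in response surface

results = []
for combo in product(*factors.values()):          # full factorial design
    setting = dict(zip(factors.keys(), combo))
    results.append((setting, run_trial(**setting)))

for setting, response in results:
    print(setting, f"-> {response:.3f}")
```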
arXiv Detail & Related papers (2022-08-21T19:47:41Z)
- Image Quality Assessment in the Modern Age [53.19271326110551]
This tutorial provides the audience with the basic theories, methodologies, and current progress of image quality assessment (IQA).
We will first revisit several subjective quality assessment methodologies, with emphasis on how to properly select visual stimuli.
Both hand-engineered and (deep) learning-based methods will be covered.
arXiv Detail & Related papers (2021-10-19T02:38:46Z)
- No-Reference Image Quality Assessment via Feature Fusion and Multi-Task Learning [29.19484863898778]
Blind or no-reference image quality assessment (NR-IQA) is a fundamental yet unsolved and challenging problem.
We propose a simple and yet effective general-purpose no-reference (NR) image quality assessment framework based on multi-task learning.
Our model employs distortion types as well as subjective human scores to predict image quality.
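A minimal sketch of that multitask pattern, assuming a shared backbone with one head classifying the distortion type and another regressing the subjective score; the backbone, layer sizes, and number of distortion classes are all illustrative.

```python
import torch
import torch.nn as nn

class MultiTaskNRIQA(nn.Module):
    """Shared features with two heads: distortion-type classification and
    quality-score regression. All sizes here are assumptions."""
    def __init__(self, feat_dim: int = 256, n_distortions: int = 5):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for a CNN feature extractor
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.distortion_head = nn.Linear(feat_dim, n_distortions)
        self.quality_head = nn.Linear(feat_dim, 1)

    def forward(self, x):
        feats = self.backbone(x)
        return self.distortion_head(feats), self.quality_head(feats)

model = MultiTaskNRIQA()
logits, score = model(torch.randn(2, 3, 224, 224))
print(logits.shape, score.shape)  # torch.Size([2, 5]) torch.Size([2, 1])
```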
arXiv Detail & Related papers (2020-06-06T05:04:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.