Computer Vision Intelligence Test Modeling and Generation: A Case Study on Smart OCR
- URL: http://arxiv.org/abs/2410.03536v1
- Date: Sat, 14 Sep 2024 23:33:28 GMT
- Title: Computer Vision Intelligence Test Modeling and Generation: A Case Study on Smart OCR
- Authors: Jing Shu, Bing-Jiun Miu, Eugene Chang, Jerry Gao, Jun Liu,
- Abstract summary: We first present a comprehensive literature review of previous work, covering key facets of AI software testing processes.
We then introduce a 3D classification model to systematically evaluate the image-based text extraction AI function.
To evaluate the performance of our proposed AI software quality test, we propose four evaluation metrics to cover different aspects.
- Score: 3.0561992956541606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI-based systems possess distinctive characteristics and introduce challenges in quality evaluation at the same time. Consequently, ensuring and validating AI software quality is of critical importance. In this paper, we present an effective AI software functional testing model to address this challenge. Specifically, we first present a comprehensive literature review of previous work, covering key facets of AI software testing processes. We then introduce a 3D classification model to systematically evaluate the image-based text extraction AI function, as well as test coverage criteria and complexity. To evaluate the performance of our proposed AI software quality test, we propose four evaluation metrics to cover different aspects. Finally, based on the proposed framework and defined metrics, a mobile Optical Character Recognition (OCR) case study is presented to demonstrate the framework's effectiveness and capability in assessing AI function quality.
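The abstract refers to four evaluation metrics without listing them, so the snippet below is a purely illustrative sketch of how a single OCR test case could be scored against ground truth. Character error rate (CER) and an exact-match check are shown as two common text-extraction measures; they are assumptions for illustration, not necessarily the paper's own metrics.

```python
def character_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein edit distance between OCR output and ground-truth text,
    normalised by reference length (lower is better)."""
    m, n = len(reference), len(hypothesis)
    dist = list(range(n + 1))  # distances for the empty reference prefix
    for i in range(1, m + 1):
        prev, dist[0] = dist[0], i
        for j in range(1, n + 1):
            cur = dist[j]
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[j] = min(dist[j] + 1,      # delete a reference character
                          dist[j - 1] + 1,  # insert a hypothesis character
                          prev + cost)      # substitute (or match) a character
            prev = cur
    return dist[n] / max(m, 1)

def exact_match(reference: str, hypothesis: str) -> bool:
    """Strict pass/fail check for a single test image."""
    return reference.strip() == hypothesis.strip()

# Example: score one (ground_truth, ocr_output) pair from a generated test case.
print(character_error_rate("Invoice 2024-001", "Invoice 2O24-001"))  # ~0.06
print(exact_match("Invoice 2024-001", "Invoice 2O24-001"))           # False
```

Lower CER means closer agreement with the ground-truth text; aggregating such per-image scores across a generated test suite is one common way this kind of quality metric is reported.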
Related papers
- Work in Progress: AI-Powered Engineering-Bridging Theory and Practice [0.0]
This paper explores how generative AI can help automate and improve key steps in systems engineering.
It examines AI's ability to analyze system requirements based on INCOSE's "good requirement" criteria.
The research aims to assess AI's potential to streamline engineering processes and improve learning outcomes.
arXiv Detail & Related papers (2025-02-06T17:42:00Z) - AI-generated Image Quality Assessment in Visual Communication [72.11144790293086]
AIGI-VC is a quality assessment database for AI-generated images in visual communication.
The dataset consists of 2,500 images spanning 14 advertisement topics and 8 emotion types.
It provides coarse-grained human preference annotations and fine-grained preference descriptions, benchmarking the abilities of IQA methods in preference prediction, interpretation, and reasoning.
arXiv Detail & Related papers (2024-12-20T08:47:07Z) - AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity [62.00987205438436]
We propose a novel quality assessment method for AIGIs named TSP-MGS.
It designs task-specific prompts and measures multi-granularity similarity between AIGIs and the prompts.
Experiments on the commonly used AGIQA-1K and AGIQA-3K benchmarks demonstrate the superiority of the proposed TSP-MGS.
arXiv Detail & Related papers (2024-11-25T04:47:53Z) - The Role of Artificial Intelligence and Machine Learning in Software Testing [0.14896196009851972]
Artificial Intelligence (AI) and Machine Learning (ML) have significantly impacted various industries.
Software testing, a crucial part of the software development lifecycle (SDLC), ensures the quality and reliability of software products.
This paper explores the role of AI and ML in software testing by reviewing existing literature, analyzing current tools and techniques, and presenting case studies.
arXiv Detail & Related papers (2024-09-04T13:25:13Z) - Quality Assessment for AI Generated Images with Instruction Tuning [58.41087653543607]
We first establish a novel Image Quality Assessment (IQA) database for AIGIs, termed AIGCIQA2023+.
This paper presents a MINT-IQA model to evaluate and explain human preferences for AIGIs from Multi-perspectives with INstruction Tuning.
arXiv Detail & Related papers (2024-05-12T17:45:11Z) - AIGCOIQA2024: Perceptual Quality Assessment of AI Generated Omnidirectional Images [70.42666704072964]
We establish a large-scale AI generated omnidirectional image IQA database named AIGCOIQA2024.
A subjective IQA experiment is conducted to assess human visual preferences from three perspectives.
We conduct a benchmark experiment to evaluate the performance of state-of-the-art IQA models on our database.
arXiv Detail & Related papers (2024-04-01T10:08:23Z) - From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real-time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
arXiv Detail & Related papers (2023-06-18T09:54:33Z) - Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective [93.56647950778357]
Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
arXiv Detail & Related papers (2023-03-27T07:58:09Z) - Do-AIQ: A Design-of-Experiment Approach to Quality Evaluation of AI Mislabel Detection Algorithm [0.0]
The quality of Artificial Intelligence (AI) algorithms is of significant importance for confidently adopting them in applications such as cybersecurity, healthcare, and autonomous driving.
This work presents a principled framework, named Do-AIQ, that uses a design-of-experiments approach to systematically evaluate the quality of AI algorithms.
arXiv Detail & Related papers (2022-08-21T19:47:41Z) - No-Reference Image Quality Assessment via Feature Fusion and Multi-Task Learning [29.19484863898778]
Blind or no-reference image quality assessment (NR-IQA) is a fundamental, unsolved, and yet challenging problem.
We propose a simple and yet effective general-purpose no-reference (NR) image quality assessment framework based on multi-task learning.
Our model employs distortion types as well as subjective human scores to predict image quality; a minimal sketch of this multi-task setup appears after this list.
arXiv Detail & Related papers (2020-06-06T05:04:10Z)
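As a hedged illustration of the multi-task idea in the last entry above (shared features feeding one head that classifies the distortion type and one that regresses the subjective quality score), the sketch below uses an assumed tiny backbone, layer sizes, and loss weighting; it is not the authors' released model.

```python
import torch
import torch.nn as nn

class MultiTaskNRIQA(nn.Module):
    def __init__(self, num_distortion_types: int = 5):
        super().__init__()
        # Shared feature extractor over RGB images (assumed, deliberately small).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Task 1: classify the distortion type (auxiliary supervision).
        self.distortion_head = nn.Linear(32, num_distortion_types)
        # Task 2: regress the subjective quality score (main task).
        self.quality_head = nn.Linear(32, 1)

    def forward(self, x):
        feats = self.backbone(x)
        return self.distortion_head(feats), self.quality_head(feats).squeeze(-1)

# Joint loss: cross-entropy on distortion labels plus L1 on human opinion scores.
model = MultiTaskNRIQA()
images = torch.randn(4, 3, 224, 224)
distortion_labels = torch.randint(0, 5, (4,))
mos = torch.rand(4)  # mean opinion scores scaled to [0, 1]
logits, pred_quality = model(images)
loss = nn.functional.cross_entropy(logits, distortion_labels) \
       + nn.functional.l1_loss(pred_quality, mos)
loss.backward()
```

Sharing the backbone lets the auxiliary distortion-classification signal regularise the quality regressor, which is the core of the multi-task framing described in that entry.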
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.