Computer Vision Intelligence Test Modeling and Generation: A Case Study on Smart OCR
- URL: http://arxiv.org/abs/2410.03536v1
- Date: Sat, 14 Sep 2024 23:33:28 GMT
- Title: Computer Vision Intelligence Test Modeling and Generation: A Case Study on Smart OCR
- Authors: Jing Shu, Bing-Jiun Miu, Eugene Chang, Jerry Gao, Jun Liu,
- Abstract summary: We first present a comprehensive literature review of previous work, covering key facets of AI software testing processes.
We then introduce a 3D classification model to systematically evaluate the image-based text extraction AI function.
To evaluate the performance of our proposed AI software quality test, we propose four evaluation metrics to cover different aspects.
- Score: 3.0561992956541606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI-based systems possess distinctive characteristics and introduce challenges in quality evaluation at the same time. Consequently, ensuring and validating AI software quality is of critical importance. In this paper, we present an effective AI software functional testing model to address this challenge. Specifically, we first present a comprehensive literature review of previous work, covering key facets of AI software testing processes. We then introduce a 3D classification model to systematically evaluate the image-based text extraction AI function, as well as test coverage criteria and complexity. To evaluate the performance of our proposed AI software quality test, we propose four evaluation metrics to cover different aspects. Finally, based on the proposed framework and defined metrics, a mobile Optical Character Recognition (OCR) case study is presented to demonstrate the framework's effectiveness and capability in assessing AI function quality.
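The abstract refers to four evaluation metrics without listing them, so the snippet below is a purely illustrative sketch of how a single OCR test case could be scored against ground truth. Character error rate (CER) and an exact-match check are shown as two common text-extraction measures; they are assumptions for illustration, not necessarily the paper's own metrics.

```python
def character_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein edit distance between OCR output and ground-truth text,
    normalised by reference length (lower is better)."""
    m, n = len(reference), len(hypothesis)
    dist = list(range(n + 1))  # distances for the empty reference prefix
    for i in range(1, m + 1):
        prev, dist[0] = dist[0], i
        for j in range(1, n + 1):
            cur = dist[j]
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[j] = min(dist[j] + 1,      # delete a reference character
                          dist[j - 1] + 1,  # insert a hypothesis character
                          prev + cost)      # substitute (or match) a character
            prev = cur
    return dist[n] / max(m, 1)

def exact_match(reference: str, hypothesis: str) -> bool:
    """Strict pass/fail check for a single test image."""
    return reference.strip() == hypothesis.strip()

# Example: score one (ground_truth, ocr_output) pair from a generated test case.
print(character_error_rate("Invoice 2024-001", "Invoice 2O24-001"))  # ~0.06
print(exact_match("Invoice 2024-001", "Invoice 2O24-001"))           # False
```

Lower CER means closer agreement with the ground-truth text; aggregating such per-image scores across a generated test suite is one common way this kind of quality metric is reported.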
Related papers
- Work in Progress: AI-Powered Engineering-Bridging Theory and Practice [0.0]
This paper explores how generative AI can help automate and improve key steps in systems engineering.
It examines AI's ability to analyze system requirements based on INCOSE's "good requirement" criteria.
The research aims to assess AI's potential to streamline engineering processes and improve learning outcomes.
arXiv Detail & Related papers (2025-02-06T17:42:00Z) - AI-generated Image Quality Assessment in Visual Communication [72.11144790293086]
AIGI-VC is a quality assessment database for AI-generated images in visual communication.
The dataset consists of 2,500 images spanning 14 advertisement topics and 8 emotion types.
It provides coarse-grained human preference annotations and fine-grained preference descriptions, benchmarking the abilities of IQA methods in preference prediction, interpretation, and reasoning.
arXiv Detail & Related papers (2024-12-20T08:47:07Z) - AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity [62.00987205438436]
We propose a novel quality assessment method for AIGIs named TSP-MGS.
It designs task-specific prompts and measures multi-granularity similarity between AIGIs and the prompts.
Experiments on the commonly used AGIQA-1K and AGIQA-3K benchmarks demonstrate the superiority of the proposed TSP-MGS.
arXiv Detail & Related papers (2024-11-25T04:47:53Z) - The Role of Artificial Intelligence and Machine Learning in Software Testing [0.14896196009851972]
Artificial Intelligence (AI) and Machine Learning (ML) have significantly impacted various industries.
Software testing, a crucial part of the software development lifecycle (SDLC), ensures the quality and reliability of software products.
This paper explores the role of AI and ML in software testing by reviewing existing literature, analyzing current tools and techniques, and presenting case studies.
arXiv Detail & Related papers (2024-09-04T13:25:13Z) - Quality Assessment for AI Generated Images with Instruction Tuning [58.41087653543607]
We first establish a novel Image Quality Assessment (IQA) database for AIGIs, termed AIGCIQA2023+.
This paper presents a MINT-IQA model to evaluate and explain human preferences for AIGIs from Multi-perspectives with INstruction Tuning.
arXiv Detail & Related papers (2024-05-12T17:45:11Z) - AIGCOIQA2024: Perceptual Quality Assessment of AI Generated Omnidirectional Images [70.42666704072964]
We establish a large-scale AI generated omnidirectional image IQA database named AIGCOIQA2024.
A subjective IQA experiment is conducted to assess human visual preferences from three perspectives.
We conduct a benchmark experiment to evaluate the performance of state-of-the-art IQA models on our database.
arXiv Detail & Related papers (2024-04-01T10:08:23Z) - From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real-time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
arXiv Detail & Related papers (2023-06-18T09:54:33Z) - Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective [93.56647950778357]
Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
arXiv Detail & Related papers (2023-03-27T07:58:09Z) - Do-AIQ: A Design-of-Experiment Approach to Quality Evaluation of AI Mislabel Detection Algorithm [0.0]
The quality of Artificial Intelligence (AI) algorithms is of significant importance for confidently adopting them in applications such as cybersecurity, healthcare, and autonomous driving.
This work presents a principled framework, named Do-AIQ, that uses a design-of-experiments approach to systematically evaluate the quality of AI algorithms.
arXiv Detail & Related papers (2022-08-21T19:47:41Z) - No-Reference Image Quality Assessment via Feature Fusion and Multi-Task Learning [29.19484863898778]
Blind or no-reference image quality assessment (NR-IQA) is a fundamental, unsolved, and yet challenging problem.
We propose a simple and yet effective general-purpose no-reference (NR) image quality assessment framework based on multi-task learning.
Our model employs distortion types as well as subjective human scores to predict image quality; a minimal sketch of this multi-task setup appears after this list.
arXiv Detail & Related papers (2020-06-06T05:04:10Z)
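As a hedged illustration of the multi-task idea in the last entry above (shared features feeding one head that classifies the distortion type and one that regresses the subjective quality score), the sketch below uses an assumed tiny backbone, layer sizes, and loss weighting; it is not the authors' released model.

```python
import torch
import torch.nn as nn

class MultiTaskNRIQA(nn.Module):
    def __init__(self, num_distortion_types: int = 5):
        super().__init__()
        # Shared feature extractor over RGB images (assumed, deliberately small).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Task 1: classify the distortion type (auxiliary supervision).
        self.distortion_head = nn.Linear(32, num_distortion_types)
        # Task 2: regress the subjective quality score (main task).
        self.quality_head = nn.Linear(32, 1)

    def forward(self, x):
        feats = self.backbone(x)
        return self.distortion_head(feats), self.quality_head(feats).squeeze(-1)

# Joint loss: cross-entropy on distortion labels plus L1 on human opinion scores.
model = MultiTaskNRIQA()
images = torch.randn(4, 3, 224, 224)
distortion_labels = torch.randint(0, 5, (4,))
mos = torch.rand(4)  # mean opinion scores scaled to [0, 1]
logits, pred_quality = model(images)
loss = nn.functional.cross_entropy(logits, distortion_labels) \
       + nn.functional.l1_loss(pred_quality, mos)
loss.backward()
```

Sharing the backbone lets the auxiliary distortion-classification signal regularise the quality regressor, which is the core of the multi-task framing described in that entry.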
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.