GenLens: A Systematic Evaluation of Visual GenAI Model Outputs
- URL: http://arxiv.org/abs/2402.03700v1
- Date: Tue, 6 Feb 2024 04:41:06 GMT
- Title: GenLens: A Systematic Evaluation of Visual GenAI Model Outputs
- Authors: Tica Lin, Hanspeter Pfister, Jui-Hsien Wang
- Abstract summary: GenLens is a visual analytic interface designed for the systematic evaluation of GenAI model outputs.
A user study with model developers reveals that GenLens effectively enhances their workflow, evidenced by high satisfaction rates.
- Score: 33.93591473459988
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The rapid development of generative AI (GenAI) models in computer vision
necessitates effective evaluation methods to ensure their quality and fairness.
Existing tools primarily focus on dataset quality assurance and model
explainability, leaving a significant gap in GenAI output evaluation during
model development. Current practices often depend on developers' subjective
visual assessments, which may lack scalability and generalizability. This paper
bridges this gap by conducting a formative study with GenAI model developers in
an industrial setting. Our findings led to the development of GenLens, a visual
analytic interface designed for the systematic evaluation of GenAI model
outputs during the early stages of model development. GenLens offers a
quantifiable approach for overviewing and annotating failure cases, customizing
issue tags and classifications, and aggregating annotations from multiple users
to enhance collaboration. A user study with model developers reveals that
GenLens effectively enhances their workflow, evidenced by high satisfaction
rates and a strong intent to integrate it into their practices. This research
underscores the importance of robust early-stage evaluation tools in GenAI
development, contributing to the advancement of fair and high-quality GenAI
models.
Related papers
- Model-based Maintenance and Evolution with GenAI: A Look into the Future [47.93555901495955]
We argue that Generative Artificial Intelligence (GenAI) can be used as a means to address the limitations of Model-Based Maintenance and Evolution (MBM&E).
We propose that GenAI can be used in MBM&E for: reducing engineers' learning curve, maximizing efficiency with recommendations, or serving as a reasoning tool to understand domain problems.
arXiv Detail & Related papers (2024-07-09T23:13:26Z)
- GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv Detail & Related papers (2024-06-01T08:01:05Z)
- Generative AI for Visualization: State of the Art and Future Directions [7.273704442256712]
This paper looks back on previous visualization studies leveraging GenAI.
By summarizing different generation algorithms, their current applications and limitations, this paper endeavors to provide useful insights for future GenAI4VIS research.
arXiv Detail & Related papers (2024-04-28T11:27:30Z)
- Generative AI and Process Systems Engineering: The Next Frontier [0.5937280131734116]
This article explores how emerging generative artificial intelligence (GenAI) models, such as large language models (LLMs), can enhance solution methodologies within process systems engineering (PSE).
These cutting-edge GenAI models, particularly foundation models (FMs), are pre-trained on extensive, general-purpose datasets.
The article identifies and discusses potential challenges in fully leveraging GenAI within PSE, including multiscale modeling, data requirements, evaluation metrics and benchmarks, and trust and safety.
arXiv Detail & Related papers (2024-02-15T18:20:42Z)
- Generative Artificial Intelligence in Learning Analytics: Contextualising Opportunities and Challenges through the Learning Analytics Cycle [0.0]
Generative artificial intelligence (GenAI) holds significant potential for transforming education and enhancing human productivity.
This paper delves into the prospective opportunities and challenges GenAI poses for advancing learning analytics (LA).
We posit that GenAI can play pivotal roles in analysing unstructured data, generating synthetic learner data, enriching multimodal learner interactions, advancing interactive and explanatory analytics, and facilitating personalisation and adaptive interventions.
arXiv Detail & Related papers (2023-11-30T07:25:34Z)
- Data-Centric Long-Tailed Image Recognition [49.90107582624604]
Long-tail models exhibit a strong demand for high-quality data.
Data-centric approaches aim to enhance both the quantity and quality of data to improve model performance.
There is currently a lack of research into the underlying mechanisms explaining the effectiveness of information augmentation.
arXiv Detail & Related papers (2023-11-03T06:34:37Z)
- Innovating Computer Programming Pedagogy: The AI-Lab Framework for Generative AI Adoption [0.0]
We introduce "AI-Lab," a framework for guiding students in effectively leveraging GenAI within core programming courses.
By identifying and rectifying GenAI's errors, students enrich their learning process.
For educators, AI-Lab provides mechanisms to explore students' perceptions of GenAI's role in their learning experience.
arXiv Detail & Related papers (2023-08-23T17:20:37Z)
- A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT [63.58711128819828]
ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC).
The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.
arXiv Detail & Related papers (2023-03-07T20:36:13Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.