Towards Automatic Translation of Machine Learning Visual Insights to
Analytical Assertions
- URL: http://arxiv.org/abs/2401.07696v1
- Date: Mon, 15 Jan 2024 14:11:59 GMT
- Title: Towards Automatic Translation of Machine Learning Visual Insights to
Analytical Assertions
- Authors: Arumoy Shome and Luis Cruz and Arie van Deursen
- Abstract summary: We present our vision for developing an automated tool capable of translating visual properties observed in Machine Learning (ML) visualisations into Python assertions.
The tool aims to streamline the process of manually verifying these visualisations in the ML development cycle, which is critical as real-world data and assumptions often change post-deployment.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present our vision for developing an automated tool capable of translating
visual properties observed in Machine Learning (ML) visualisations into Python
assertions. The tool aims to streamline the process of manually verifying these
visualisations in the ML development cycle, which is critical as real-world
data and assumptions often change post-deployment. In a prior study, we mined
54,070 Jupyter notebooks from GitHub and created a catalogue of 269
semantically related visualisation-assertion (VA) pairs. Building on this
catalogue, we propose to build a taxonomy that organises the VA pairs based on
ML verification tasks. The input feature space comprises a rich source of
information mined from the Jupyter notebooks -- visualisations, Python source
code, and associated markdown text. The effectiveness of various AI models,
including traditional NLP4Code models and modern Large Language Models, will be
compared using established machine translation metrics and evaluated through a
qualitative study with human participants. We also plan to extend the existing
VA pair dataset with additional pairs from Kaggle and to compare the tool's
effectiveness with commercial generative AI
models like ChatGPT. This research not only contributes to the field of ML
system validation but also explores novel ways to leverage AI for automating
and enhancing software engineering practices in ML.
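To make the paper's central notion concrete, below is a minimal, hypothetical sketch of a VA pair (our own illustration, not an entry from the mined catalogue): a bar chart a developer inspects for class balance, paired with a Python assertion that checks the same visual property automatically. The column name and the 0.8 threshold are assumptions chosen for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical training data; the column name and values are illustrative.
df = pd.DataFrame({"label": ["spam", "ham", "ham", "spam", "ham", "ham"]})

# Visualisation half of the VA pair: the developer eyeballs class balance.
df["label"].value_counts().plot(kind="bar", title="Class balance")
plt.show()

# Assertion half of the VA pair: the same visual property as a Python check
# that can re-run automatically when real-world data changes post-deployment.
proportions = df["label"].value_counts(normalize=True)
assert proportions.max() < 0.8, "one class dominates the training data"
```

The abstract mentions established machine translation metrics without fixing a particular one; BLEU is one common choice. The sketch below scores a generated assertion against a reference assertion with NLTK, assuming naive whitespace tokenisation and standard smoothing.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical reference (from the VA catalogue) and model-generated
# assertion, tokenised naively on whitespace for illustration.
reference = "assert proportions.max() < 0.8".split()
candidate = "assert proportions.max() <= 0.8".split()

# Smoothing avoids zero scores on short, assertion-length token sequences.
score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```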
Related papers
- iGAiVA: Integrated Generative AI and Visual Analytics in a Machine Learning Workflow for Text Classification
We present a solution that uses visual analytics (VA) to guide the generation of synthetic data with large language models.
We discuss different types of data deficiency, describe different VA techniques for supporting their identification, and demonstrate the effectiveness of targeted data synthesis.
arXiv Detail & Related papers (2024-09-24T08:19:45Z)
- PUB: Plot Understanding Benchmark and Dataset for Evaluating Large Language Models on Synthetic Visual Data Interpretation
This paper presents a novel synthetic dataset designed to evaluate the proficiency of large language models in interpreting data visualizations.
Our dataset is generated using controlled parameters to ensure comprehensive coverage of potential real-world scenarios.
We employ multimodal text prompts with questions related to visual data in images to benchmark several state-of-the-art models.
arXiv Detail & Related papers (2024-09-04T11:19:17Z)
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
- Deciphering AutoML Ensembles: cattleia's Assistance in Decision-Making
Cattleia is an application that deciphers ensembles built for regression, multiclass, and binary classification tasks.
It works with models built by three AutoML packages: auto-sklearn, AutoGluon, and FLAML.
arXiv Detail & Related papers (2024-03-19T11:56:21Z)
- Quantitative Assurance and Synthesis of Controllers from Activity Diagrams
Probabilistic model checking is a widely used formal verification technique for automatically verifying qualitative and quantitative properties. However, it demands specialist expertise, which makes it inaccessible to researchers and engineers who may not have the required knowledge.
We propose a comprehensive verification framework for ADs, including a new profile for probability, time, and quality annotations, a semantic interpretation of ADs in three Markov models, and a set of transformation rules from activity diagrams to the PRISM language.
Most importantly, we developed algorithms for transformation and implemented them in a tool, called QASCAD, using model-based techniques, for fully automated verification.
arXiv Detail & Related papers (2024-02-29T22:40:39Z)
- Voila-A: Aligning Vision-Language Models with User's Gaze Attention
We introduce gaze information as a proxy for human attention to guide Vision-Language Models (VLMs).
We propose a novel approach, Voila-A, for gaze alignment to enhance the interpretability and effectiveness of these models in real-world applications.
arXiv Detail & Related papers (2023-12-22T17:34:01Z)
- The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z)
- AutoML-GPT: Automatic Machine Learning with GPT
We propose developing task-oriented prompts and automatically utilizing large language models (LLMs) to automate the training pipeline.
We present AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters.
This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas.
arXiv Detail & Related papers (2023-05-04T02:09:43Z)
- ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
We build ELEVATER, the first benchmark to compare and evaluate pre-trained language-augmented visual models.
It consists of 20 image classification datasets and 35 object detection datasets, each of which is augmented with external knowledge.
We will release our toolkit and evaluation platforms for the research community.
arXiv Detail & Related papers (2022-04-19T10:23:42Z)
- Unsupervised Paraphrasing with Pretrained Language Models
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.