Towards Automatic Translation of Machine Learning Visual Insights to
  Analytical Assertions
- URL: http://arxiv.org/abs/2401.07696v1
- Date: Mon, 15 Jan 2024 14:11:59 GMT
- Title: Towards Automatic Translation of Machine Learning Visual Insights to
  Analytical Assertions
- Authors: Arumoy Shome and Luis Cruz and Arie van Deursen
- Abstract summary: We present our vision for developing an automated tool capable of translating visual properties observed in Machine Learning (ML) visualisations into Python assertions.
The tool aims to streamline the process of manually verifying these visualisations in the ML development cycle, which is critical as real-world data and assumptions often change post-deployment.
- Score: 23.535630175567146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract:   We present our vision for developing an automated tool capable of translating
visual properties observed in Machine Learning (ML) visualisations into Python
assertions. The tool aims to streamline the process of manually verifying these
visualisations in the ML development cycle, which is critical as real-world
data and assumptions often change post-deployment. In a prior study, we mined
$54,070$ Jupyter notebooks from GitHub and created a catalogue of $269$
semantically related visualisation-assertion (VA) pairs. Building on this
catalogue, we propose to build a taxonomy that organises the VA pairs based on
ML verification tasks. The input feature space comprises a rich source of
information mined from the Jupyter notebooks -- visualisations, Python source
code, and associated markdown text. The effectiveness of various AI models,
including traditional NLP4Code models and modern Large Language Models, will be
compared using established machine translation metrics and evaluated through a
qualitative study with human participants. We also plan to address the
challenge of extending the existing VA pair dataset with additional pairs from
Kaggle and to compare the tool's effectiveness with commercial generative AI
models like ChatGPT. This research not only contributes to the field of ML
system validation but also explores novel ways to leverage AI for automating
and enhancing software engineering practices in ML.
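
To make the idea of a visualisation-assertion (VA) pair concrete, the following is a minimal sketch of how one visual insight might be expressed as a Python assertion. It is an illustration only and is not taken from the authors' catalogue or tool: the function name `assert_classes_balanced`, the `max_ratio` threshold, and the sample labels are all hypothetical.

```python
from collections import Counter


def assert_classes_balanced(labels, max_ratio=1.5):
    """Check that the largest class is at most `max_ratio` times the smallest.

    Hypothetical helper: the name and the default threshold are assumptions
    made for this illustration.
    """
    counts = Counter(labels)
    most = max(counts.values())
    least = min(counts.values())
    assert most / least <= max_ratio, (
        f"class imbalance: ratio {most / least:.2f} exceeds {max_ratio}"
    )
    return counts


# Visual insight (read off a bar chart of class counts):
# "the bars look roughly the same height".
# Analytical assertion: the same property, checked on the data itself.
labels = ["cat", "dog", "cat", "dog", "dog", "cat", "cat"]
counts = assert_classes_balanced(labels)
```

An assertion of this kind can be re-run whenever the data changes, whereas the bar chart would have to be inspected again by hand.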
 
      
Related papers
- Automated Generation of Commit Messages in Software Repositories [0.7366405857677226]
 Commit messages are crucial for documenting software changes, aiding in program comprehension and maintenance.
Our research presents an automated approach to generate commit messages using Machine Learning (ML) and Natural Language Processing (NLP).
We used the dataset of code changes and corresponding commit messages that was used by Liu et al.
 arXiv  Detail & Related papers  (2025-04-17T15:08:05Z)
- Training of Scaffolded Language Models with Language Supervision: A Survey [62.59629932720519]
 This survey organizes the literature on the design and optimization of emerging structures around post-trained LMs.
We refer to this overarching structure as scaffolded LMs and focus on LMs that are integrated into multi-step processes with tools.
 arXiv  Detail & Related papers  (2024-10-21T18:06:25Z)
- iGAiVA: Integrated Generative AI and Visual Analytics in a Machine Learning Workflow for Text Classification [2.0094862015890245]
 We present a solution for using visual analytics (VA) to guide the generation of synthetic data using large language models.
We discuss different types of data deficiency, describe different VA techniques for supporting their identification, and demonstrate the effectiveness of targeted data synthesis.
 arXiv  Detail & Related papers  (2024-09-24T08:19:45Z)
- PUB: Plot Understanding Benchmark and Dataset for Evaluating Large Language Models on Synthetic Visual Data Interpretation [2.1184929769291294]
 This paper presents a novel synthetic dataset designed to evaluate the proficiency of large language models in interpreting data visualizations.
Our dataset is generated using controlled parameters to ensure comprehensive coverage of potential real-world scenarios.
We employ multimodal text prompts with questions related to visual data in images to benchmark several state-of-the-art models.
 arXiv  Detail & Related papers  (2024-09-04T11:19:17Z)
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
 We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
 arXiv  Detail & Related papers  (2024-03-26T04:27:56Z)
- Deciphering AutoML Ensembles: cattleia's Assistance in Decision-Making [0.0]
 Cattleia is an application that deciphers the ensembles for regression, multiclass, and binary classification tasks.
It works with models built by three AutoML packages: auto-sklearn, AutoGluon, and FLAML.
 arXiv  Detail & Related papers  (2024-03-19T11:56:21Z)
- Quantitative Assurance and Synthesis of Controllers from Activity
  Diagrams [4.419843514606336]
 Probabilistic model checking is a widely used formal verification technique to automatically verify qualitative and quantitative properties.
This makes it inaccessible to researchers and engineers who may not have the required knowledge.
We propose a comprehensive verification framework for ADs, including a new profile for probability time, quality annotations, a semantics interpretation of ADs in three Markov models, and a set of transformation rules from activity diagrams to the PRISM language.
Most importantly, we developed algorithms for transformation and implemented them in a tool, called QASCAD, using model-based techniques, for fully automated verification.
 arXiv  Detail & Related papers  (2024-02-29T22:40:39Z)
- Voila-A: Aligning Vision-Language Models with User's Gaze Attention [56.755993500556734]
 We introduce gaze information as a proxy for human attention to guide Vision-Language Models (VLMs).
We propose a novel approach, Voila-A, for gaze alignment to enhance the interpretability and effectiveness of these models in real-world applications.
 arXiv  Detail & Related papers  (2023-12-22T17:34:01Z)
- The Devil is in the Errors: Leveraging Large Language Models for
  Fine-grained Machine Translation Evaluation [93.01964988474755]
 AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
 arXiv  Detail & Related papers  (2023-08-14T17:17:21Z)
- AutoML-GPT: Automatic Machine Learning with GPT [74.30699827690596]
 We propose developing task-oriented prompts and automatically utilizing large language models (LLMs) to automate the training pipeline.
We present AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters.
This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas.
 arXiv  Detail & Related papers  (2023-05-04T02:09:43Z)
- ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented
  Visual Models [102.63817106363597]
 We build ELEVATER, the first benchmark to compare and evaluate pre-trained language-augmented visual models.
It consists of 20 image classification datasets and 35 object detection datasets, each of which is augmented with external knowledge.
We will release our toolkit and evaluation platforms for the research community.
 arXiv  Detail & Related papers  (2022-04-19T10:23:42Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
 We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
 arXiv  Detail & Related papers  (2020-10-24T11:55:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     