Data Formulator: AI-powered Concept-driven Visualization Authoring
- URL: http://arxiv.org/abs/2309.10094v2
- Date: Fri, 27 Oct 2023 18:24:24 GMT
- Title: Data Formulator: AI-powered Concept-driven Visualization Authoring
- Authors: Chenglong Wang, John Thompson, Bongshin Lee
- Abstract summary: We present a new visualization paradigm, concept binding, that separates high-level visualization intents from low-level data transformation steps.
With Data Formulator, authors first define the data concepts they plan to visualize using natural language or examples, and then bind them to visual channels.
Data Formulator dispatches its AI agent to automatically transform the input data to surface these concepts and generate the desired visualizations.
- Score: 31.45748186186275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With most modern visualization tools, authors need to transform their data
into tidy formats to create visualizations they want. Because this requires
experience with programming or separate data processing tools, data
transformation remains a barrier in visualization authoring. To address this
challenge, we present a new visualization paradigm, concept binding, that
separates high-level visualization intents from low-level data transformation
steps, leveraging an AI agent. We realize this paradigm in Data Formulator, an
interactive visualization authoring tool. With Data Formulator, authors first
define the data concepts they plan to visualize using natural language or
examples, and then bind them to visual channels. Data Formulator then
dispatches its AI agent to automatically transform the input data to surface
these concepts and generate desired visualizations. When presenting the results
(transformed table and output visualizations) from the AI agent, Data
Formulator provides feedback to help authors inspect and understand them. A
user study with 10 participants shows that participants could learn and use
Data Formulator to create visualizations that involve challenging data
transformations; the study also surfaces interesting directions for future research.
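The concept-binding workflow described above (define a concept in natural language, bind it to a visual channel, and let an AI agent derive the missing column) can be sketched in a few lines of Python. This is a minimal illustration only, not the actual Data Formulator API: the names (DataConcept, derive_concept), the hard-coded derivation standing in for the LLM call, and the Vega-Lite-style output spec are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class DataConcept:
    """A high-level field the author wants to plot; it may not yet
    exist as a column and must be derived from the input data."""
    name: str
    description: str                               # natural-language definition
    examples: list = field(default_factory=list)   # optional example values

def derive_concept(rows, concept):
    """Stand-in for the AI agent. A real agent would read
    concept.description and synthesize this transformation; here one
    derivation (share of each year's total sales) is hard-coded."""
    totals = {}
    for r in rows:
        totals[r["year"]] = totals.get(r["year"], 0) + r["sales"]
    return [dict(r, pct_of_year=100 * r["sales"] / totals[r["year"]])
            for r in rows]

# 1. The author defines a data concept in natural language.
pct = DataConcept("pct_of_year",
                  "each product's share of that year's total sales")

# 2. The author binds concepts (existing or derived) to visual channels.
binding = {"x": "year", "y": pct.name, "color": "product"}

# 3. The agent transforms the input table so the concept surfaces.
data = [{"year": 2022, "product": "A", "sales": 30},
        {"year": 2022, "product": "B", "sales": 70}]
table = derive_concept(data, pct)

# 4. Bindings plus the transformed table yield a chart specification.
spec = {"mark": "bar",
        "encoding": {ch: {"field": f} for ch, f in binding.items()},
        "data": {"values": table}}
print(spec["data"]["values"])  # rows now carry the derived pct_of_year column
```

Per the abstract, the key design point is that the author never writes the transformation in step 3; the AI agent synthesizes it from the concept's natural-language definition or examples, and the transformed table is then shown back to the author for inspection.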
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z) - PUB: Plot Understanding Benchmark and Dataset for Evaluating Large Language Models on Synthetic Visual Data Interpretation [2.1184929769291294]
This paper presents a novel synthetic dataset designed to evaluate the proficiency of large language models in interpreting data visualizations.
Our dataset is generated using controlled parameters to ensure comprehensive coverage of potential real-world scenarios.
We employ multimodal text prompts with questions related to visual data in images to benchmark several state-of-the-art models.
arXiv Detail & Related papers (2024-09-04T11:19:17Z) - Data Formulator 2: Iteratively Creating Rich Visualizations with AI [65.48447317310442]
We present Data Formulator 2, an LLM-powered visualization system to address these challenges.
With Data Formulator 2, users describe their visualization intent with blended UI and natural language inputs, and data transformations are delegated to the AI.
To support iteration, Data Formulator 2 lets users navigate their iteration history and reuse previous designs toward new ones, so they don't need to start from scratch every time.
arXiv Detail & Related papers (2024-08-28T20:12:17Z) - BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation [57.40024206484446]
We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models.
BVS supports a large number of adjustable parameters at the scene level.
We showcase three example application scenarios.
arXiv Detail & Related papers (2024-05-15T17:57:56Z) - AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z) - Accountable Textual-Visual Chat Learns to Reject Human Instructions in
Image Re-creation [26.933683814025475]
We introduce two novel multimodal datasets: the synthetic CLEVR-ATVC dataset (620K) and the manually pictured Fruit-ATVC dataset (50K).
These datasets incorporate both visual and text-based inputs and outputs.
To facilitate the accountability of multimodal systems in rejecting human requests, similar to language-based ChatGPT conversations, we introduce specific rules as supervisory signals within the datasets.
arXiv Detail & Related papers (2023-03-10T15:35:11Z) - TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z) - Shuffler: A Large Scale Data Management Tool for ML in Computer Vision [0.0]
We present Shuffler, an open source tool that makes it easy to manage large computer vision datasets.
Shuffler defines over 40 operations on annotated data that are commonly useful in supervised learning applied to computer vision.
arXiv Detail & Related papers (2021-04-11T22:27:28Z) - Visual Distant Supervision for Scene Graph Generation [66.10579690929623]
Scene graph models usually require supervised learning on large quantities of labeled data with intensive human annotation.
We propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.
Comprehensive experimental results show that our distantly supervised model outperforms strong weakly supervised and semi-supervised baselines.
arXiv Detail & Related papers (2021-03-29T06:35:24Z) - Advancing Visual Specification of Code Requirements for Graphs [0.0]
This paper focuses on producing meaningful visualizations of data using machine learning.
We allow the user to visually specify their code requirements in order to lower the barrier for humanities researchers to learn how to program visualizations.
We use a hybrid model, combining a neural network and optical character recognition to generate the code to create the visualization.
arXiv Detail & Related papers (2020-07-29T17:01:53Z)