Fits and Starts: Enterprise Use of AutoML and the Role of Humans in the
Loop
- URL: http://arxiv.org/abs/2101.04296v1
- Date: Tue, 12 Jan 2021 04:52:48 GMT
- Authors: Anamaria Crisan, Brittany Fiore-Gartland
- Abstract summary: AutoML systems can speed up routine data science work and make machine learning available to those without expertise in statistics and computer science.
We conduct interviews with 29 individuals from organizations of different sizes to characterize how they currently use, or intend to use, AutoML systems.
Our findings have implications for the design and implementation of human-in-the-loop visual analytics approaches.
- Score: 4.468952886990851
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AutoML systems can speed up routine data science work and make machine
learning available to those without expertise in statistics and computer
science. These systems have gained traction in enterprise settings where pools
of skilled data workers are limited. In this study, we conduct interviews with
29 individuals from organizations of different sizes to characterize how they
currently use, or intend to use, AutoML systems in their data science work. Our
investigation also captures how data visualization is used in conjunction with
AutoML systems. Our findings identify three usage scenarios for AutoML that
resulted in a framework summarizing the level of automation desired by data
workers with different levels of expertise. We surfaced the tension between
speed and human oversight and found that data visualization can do a poor job
balancing the two. Our findings have implications for the design and
implementation of human-in-the-loop visual analytics approaches.
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent work has started exploiting large language models (LLMs) to lessen the burden of these tasks.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering.
Spider2-V features real-world tasks in authentic computer environments and incorporates 20 enterprise-level professional applications.
These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z)
- DriveLM: Driving with Graph Visual Question Answering [57.51930417790141]
We study how vision-language models (VLMs) trained on web-scale data can be integrated into end-to-end driving systems.
We propose a VLM-based baseline approach (DriveLM-Agent) for jointly performing Graph VQA and end-to-end driving.
arXiv Detail & Related papers (2023-12-21T18:59:12Z)
- Assessing the Use of AutoML for Data-Driven Software Engineering [10.40771687966477]
AutoML promises to automate the building of end-to-end AI/ML pipelines.
Despite the growing interest and high expectations, there is a dearth of information about the extent to which AutoML is currently adopted.
arXiv Detail & Related papers (2023-07-20T11:14:24Z)
- Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System [48.62158108517576]
We introduce InsightPilot, an automated data exploration system designed to simplify the data exploration process.
InsightPilot automatically selects appropriate analysis intents, such as understanding, summarizing, and explaining.
In brief, an IQuery is an abstraction and automation of data analysis operations, which mimics the approach of data analysts.
arXiv Detail & Related papers (2023-04-02T07:27:49Z)
- OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques.
We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
arXiv Detail & Related papers (2023-03-01T13:35:22Z)
- Man versus Machine: AutoML and Human Experts' Role in Phishing Detection [4.124446337711138]
This paper compares the performances of six well-known, state-of-the-art AutoML frameworks on ten different phishing datasets.
Our results indicate that AutoML-based models are able to outperform manually developed machine learning models in complex classification tasks.
arXiv Detail & Related papers (2021-08-27T09:26:20Z)
- Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results, but in the presence of concept drift, detection or adaptation techniques must be applied to maintain predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
- AutoDS: Towards Human-Centered Automation of Data Science [20.859067294445985]
This paper introduces AutoDS, an automated machine learning (AutoML) system to support data science projects.
As expected, AutoDS improves productivity; yet surprisingly, we find that the models produced by the AutoDS group have higher quality and fewer errors, but lower human confidence scores.
arXiv Detail & Related papers (2021-01-13T08:35:14Z)
- AutoML to Date and Beyond: Challenges and Opportunities [30.60364966752454]
AutoML tools aim to make machine learning accessible for non-machine learning experts.
We introduce a new classification system for AutoML systems.
We lay out a roadmap for the future, pinpointing the research required to further automate the end-to-end machine learning pipeline.
arXiv Detail & Related papers (2020-10-21T06:08:21Z)
- Testing the Robustness of AutoML Systems [5.942234058526296]
We investigate the robustness of machine learning pipelines generated with three AutoML systems, TPOT, H2O, and AutoKeras.
In particular, we study the influence of dirty data on accuracy, and consider how using dirty training data may help create more robust solutions.
arXiv Detail & Related papers (2020-05-06T08:20:03Z)