Testing the Robustness of AutoML Systems
- URL: http://arxiv.org/abs/2005.02649v2
- Date: Thu, 23 Jul 2020 01:32:38 GMT
- Title: Testing the Robustness of AutoML Systems
- Authors: Tuomas Halvari, Jukka K. Nurminen, Tommi Mikkonen
- Abstract summary: We investigate the robustness of machine learning pipelines generated with three AutoML systems, TPOT, H2O, and AutoKeras.
In particular, we study the influence of dirty data on accuracy, and consider how using dirty training data may help create more robust solutions.
- Score: 5.942234058526296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated machine learning (AutoML) systems aim at finding the best machine
learning (ML) pipeline that automatically matches the task and data at hand. We
investigate the robustness of machine learning pipelines generated with three
AutoML systems, TPOT, H2O, and AutoKeras. In particular, we study the influence
of dirty data on accuracy, and consider how using dirty training data may help
create more robust solutions. Furthermore, we also analyze how the structure of
the generated pipelines differs in different cases.
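The kind of dirty-data experiment the abstract describes can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' setup: it uses synthetic two-class data and a simple nearest-centroid classifier as a stand-in for a pipeline produced by TPOT, H2O, or AutoKeras, and every function name here (`make_blobs`, `add_dirt`, `robustness_report`, and so on) is hypothetical.

```python
import random


def make_blobs(n_per_class, rng):
    """Synthetic stand-in data: two Gaussian classes in 2D (an assumption)."""
    rows, labels = [], []
    for label, center in enumerate([(0.0, 0.0), (4.0, 4.0)]):
        for _ in range(n_per_class):
            rows.append([rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)])
            labels.append(label)
    return rows, labels


def add_dirt(rows, fraction, rng, scale=5.0):
    """Return a copy in which roughly `fraction` of the cells get Gaussian noise."""
    return [
        [v + rng.gauss(0.0, scale) if rng.random() < fraction else v for v in row]
        for row in rows
    ]


def fit_centroids(rows, labels):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Pick the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
    )


def accuracy(centroids, rows, labels):
    hits = sum(predict(centroids, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


def robustness_report(seed=0, fraction=0.3):
    """Compare clean-trained and dirty-trained models on clean and dirty test data."""
    rng = random.Random(seed)
    train_x, train_y = make_blobs(200, rng)
    test_x, test_y = make_blobs(200, rng)
    dirty_test_x = add_dirt(test_x, fraction, rng)

    clean_model = fit_centroids(train_x, train_y)
    dirty_model = fit_centroids(add_dirt(train_x, fraction, rng), train_y)

    return {
        "clean_model_clean_test": accuracy(clean_model, test_x, test_y),
        "clean_model_dirty_test": accuracy(clean_model, dirty_test_x, test_y),
        "dirty_model_clean_test": accuracy(dirty_model, test_x, test_y),
        "dirty_model_dirty_test": accuracy(dirty_model, dirty_test_x, test_y),
    }


if __name__ == "__main__":
    for name, value in robustness_report().items():
        print(f"{name}: {value:.3f}")
```

Comparing the four accuracies gives a rough picture of how much dirty test data hurts and whether training on dirty data narrows the gap. In a real study the `fit_centroids` step would be replaced by a pipeline search in one of the AutoML systems.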
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent work has started exploiting large language models (LLMs) to lessen this burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It does not rely on large-scale annotated data and synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
- Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions.
Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part.
In a case study on the industrial use case of price forecasting, we show that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
- XAutoML: A Visual Analytics Tool for Understanding and Validating Automated Machine Learning [5.633209323925663]
XAutoML is an interactive visual analytics tool for explaining arbitrary AutoML optimization procedures and ML pipelines constructed by AutoML.
XAutoML combines interactive visualizations with established techniques from explainable artificial intelligence (XAI) to make the complete AutoML procedure transparent and explainable.
arXiv Detail & Related papers (2022-02-24T08:18:25Z)
- Towards Green Automated Machine Learning: Status Quo and Future Directions [71.86820260846369]
AutoML is being criticised for its high resource consumption.
This paper proposes Green AutoML, a paradigm to make the whole AutoML process more environmentally friendly.
arXiv Detail & Related papers (2021-11-10T18:57:27Z)
- Man versus Machine: AutoML and Human Experts' Role in Phishing Detection [4.124446337711138]
This paper compares the performances of six well-known, state-of-the-art AutoML frameworks on ten different phishing datasets.
Our results indicate that AutoML-based models are able to outperform manually developed machine learning models in complex classification tasks.
arXiv Detail & Related papers (2021-08-27T09:26:20Z)
- Detecting Faults during Automatic Screwdriving: A Dataset and Use Case of Anomaly Detection for Automatic Screwdriving [80.6725125503521]
Data-driven approaches that use machine learning (ML) to detect faults have recently gained increasing interest.
We present a use case of using ML models for detecting faults during automated screwdriving operations.
arXiv Detail & Related papers (2021-07-05T11:46:00Z)
- Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results, but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
- Fits and Starts: Enterprise Use of AutoML and the Role of Humans in the Loop [4.468952886990851]
AutoML systems can speed up routine data science work and make machine learning available to those without expertise in statistics and computer science.
We conduct interviews with 29 individuals from organizations of different sizes to characterize how they currently use, or intend to use, AutoML systems.
Our findings have implications for the design and implementation of human-in-the-loop visual analytics approaches.
arXiv Detail & Related papers (2021-01-12T04:52:48Z)
- AutoML to Date and Beyond: Challenges and Opportunities [30.60364966752454]
AutoML tools aim to make machine learning accessible for non-machine learning experts.
We introduce a new classification system for AutoML systems.
We lay out a roadmap for the future, pinpointing the research required to further automate the end-to-end machine learning pipeline.
arXiv Detail & Related papers (2020-10-21T06:08:21Z)
- Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution [1.5224436211478214]
This paper describes a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines.
The experimental results include a comparison of AutoML-DSGE with another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE).
arXiv Detail & Related papers (2020-04-01T09:31:34Z)