Robusta: Robust AutoML for Feature Selection via Reinforcement Learning
- URL: http://arxiv.org/abs/2101.05950v1
- Date: Fri, 15 Jan 2021 03:12:29 GMT
- Title: Robusta: Robust AutoML for Feature Selection via Reinforcement Learning
- Authors: Xiaoyang Wang, Bo Li, Yibo Zhang, Bhavya Kailkhura, Klara Nahrstedt
- Abstract summary: We propose Robusta, the first robust AutoML framework, based on reinforcement learning (RL).
We show that the framework is able to improve the model robustness by up to 22% while maintaining competitive accuracy on benign samples.
- Score: 24.24652530951966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several AutoML approaches have been proposed to automate the machine learning
(ML) process, such as searching for the ML model architectures and
hyper-parameters. However, these AutoML pipelines only focus on improving the
learning accuracy of benign samples while ignoring the ML model robustness
under adversarial attacks. As ML systems are increasingly being used in a
variety of mission-critical applications, improving the robustness of ML
systems has become of utmost importance. In this paper, we propose the first
robust AutoML framework, Robusta--based on reinforcement learning (RL)--to
perform feature selection, aiming to select features that lead to both accurate
and robust ML systems. We show that a variation of the 0-1 robust loss can be
directly optimized via an RL-based combinatorial search in the feature
selection scenario. In addition, we employ heuristics to accelerate the search
procedure based on feature scoring metrics, which are mutual information
scores, tree-based classifier feature importance scores, F scores, and
Integrated Gradient (IG) scores, as well as their combinations. We conduct
extensive experiments and show that the proposed framework is able to improve
the model robustness by up to 22% while maintaining competitive accuracy on
benign samples compared with other feature selection methods.
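The scoring heuristics listed in the abstract rank features before the RL search prunes the combinatorial space. As a minimal sketch of one of them, the snippet below computes empirical mutual information between each discrete feature and the labels and ranks features by it; the toy data and the pure-Python implementation are illustrative assumptions, not the paper's code (the paper's other metrics, such as Integrated Gradients, would need a trained model).

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """Empirical mutual information I(X; Y) between a discrete feature and labels, in nats."""
    n = len(feature)
    px = Counter(feature)
    py = Counter(labels)
    pxy = Counter(zip(feature, labels))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_xy = c / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Toy binary dataset: column 0 perfectly predicts y, column 1 is noise.
X = [[0, 1], [0, 0], [1, 1], [1, 0], [0, 1], [1, 0]]
y = [0, 0, 1, 1, 0, 1]

scores = [mutual_information([row[j] for row in X], y) for j in range(2)]
ranked = sorted(range(2), key=lambda j: -scores[j])  # best feature first
```

A perfectly predictive binary feature attains the label entropy, here log 2 nats, so it is ranked ahead of the noise column.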
Related papers
- SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning [14.702694298483445]
Tree-Search Enhanced LLM Agents (SELA) is an agent-based system that leverages Monte Carlo Tree Search (MCTS) to optimize the AutoML process.
SELA represents pipeline configurations as trees, enabling agents to conduct experiments intelligently and iteratively refine their strategies.
In an extensive evaluation across 20 machine learning datasets, we compare the performance of traditional and agent-based AutoML methods.
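SELA's tree search relies on the standard MCTS trade-off between exploiting configurations that scored well and exploring under-visited ones. The following sketch shows only that core mechanism, the UCB1 selection rule, applied to a toy one-level "tree" of two hypothetical pipeline configs with noisy validation scores; the config names and reward values are invented for illustration.

```python
import math
import random

def ucb1(total_value, visits, parent_visits, c=1.4):
    """UCB1 score: average reward plus an exploration bonus for rarely tried nodes."""
    if visits == 0:
        return float("inf")  # always try an unvisited node first
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)

random.seed(0)
rewards = {"cfg_a": 0.8, "cfg_b": 0.3}       # hypothetical mean validation accuracies
stats = {k: [0.0, 0] for k in rewards}       # per-config [total value, visit count]

for t in range(1, 201):
    choice = max(stats, key=lambda k: ucb1(stats[k][0], stats[k][1], t))
    r = rewards[choice] + random.uniform(-0.05, 0.05)  # noisy evaluation
    stats[choice][0] += r
    stats[choice][1] += 1

best = max(stats, key=lambda k: stats[k][1])  # MCTS convention: most-visited child
```

Over 200 rounds the search concentrates its evaluation budget on the stronger configuration while still sampling the weaker one occasionally.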
arXiv Detail & Related papers (2024-10-22T17:56:08Z)
- Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML)
VML constrains the parameter space to be human-interpretable natural language.
We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z)
- Position: A Call to Action for a Human-Centered AutoML Paradigm [83.78883610871867]
Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML) workflows.
We argue that a key to unlocking AutoML's full potential lies in addressing the currently underexplored aspect of user interaction with AutoML systems.
arXiv Detail & Related papers (2024-06-05T15:05:24Z)
- Self-Exploring Language Models: Active Preference Elicitation for Online Alignment [88.56809269990625]
We propose a bilevel objective optimistically biased towards potentially high-reward responses to actively explore out-of-distribution regions.
Our experimental results demonstrate that when fine-tuned on Zephyr-7B-SFT and Llama-3-8B-Instruct models, Self-Exploring Language Models (SELM) significantly boosts the performance on instruction-following benchmarks.
arXiv Detail & Related papers (2024-05-29T17:59:07Z)
- Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms [1.4469725791865982]
This work describes the selection approach and analysis of existing AutoML frameworks regarding their capability of incorporating Quantum Machine Learning (QML) algorithms.
For that, available open-source tools are condensed into a market overview and suitable frameworks are systematically selected on a multi-phase, multi-criteria approach.
We build an extended Automated Quantum Machine Learning (AutoQML) framework with QC-specific pipeline steps and decision characteristics for hardware and software constraints.
arXiv Detail & Related papers (2023-10-06T13:21:16Z)
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability.
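The IFD metric compares how much of a sample's answer loss remains once the instruction is given as context: a ratio near 1 means the instruction barely helps, marking the sample as difficult and therefore informative for tuning. The sketch below assumes hypothetical per-sample cross-entropy losses from some language model rather than computing them; the helper name `ifd_score` and the sample values are illustrative, not the paper's implementation.

```python
def ifd_score(loss_with_instruction, loss_answer_only):
    """Instruction-Following Difficulty: loss on the answer conditioned on the
    instruction, divided by the loss of generating the answer alone."""
    return loss_with_instruction / loss_answer_only

# Hypothetical per-sample cross-entropy losses from a language model:
samples = [
    {"id": "easy", "l_cond": 0.4, "l_plain": 2.0},  # instruction helps a lot -> low IFD
    {"id": "hard", "l_cond": 1.8, "l_plain": 2.0},  # instruction barely helps -> high IFD
]

# Rank by IFD, descending, and keep the hardest samples as "cherry" data.
scored = sorted(samples, key=lambda s: ifd_score(s["l_cond"], s["l_plain"]), reverse=True)
cherry = [s["id"] for s in scored[:1]]
```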
arXiv Detail & Related papers (2023-08-23T09:45:29Z)
- The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z)
- On Taking Advantage of Opportunistic Meta-knowledge to Reduce Configuration Spaces for Automated Machine Learning [11.670797168818773]
The key research question is whether it is possible and practical to preemptively avoid costly evaluations of poorly performing ML pipelines.
Numerous experiments with the AutoWeka4MCPS package suggest that opportunistic/systematic meta-knowledge can improve ML outcomes.
We observe strong sensitivity to the 'challenge' of a dataset, i.e. whether specificity in choosing a predictor leads to significantly better performance.
arXiv Detail & Related papers (2022-08-08T19:22:24Z)
- Automatic Componentwise Boosting: An Interpretable AutoML System [1.1709030738577393]
We propose an AutoML system that constructs an interpretable additive model that can be fitted using a highly scalable componentwise boosting algorithm.
Our system provides tools for easy model interpretation such as visualizing partial effects and pairwise interactions.
Despite its restriction to an interpretable model space, our system is competitive in terms of predictive performance on most data sets.
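Componentwise boosting stays interpretable because each boosting round fits every single-feature base learner to the current residuals and keeps only the best one, so the final model is a sparse additive sum of per-feature effects. The snippet below is a minimal L2-boosting sketch under simplifying assumptions (univariate linear base learners without intercepts, plain Python lists); the function name and toy data are illustrative, not the system described in the paper.

```python
def fit_componentwise_boosting(X, y, n_iter=50, nu=0.1):
    """Componentwise L2 boosting: each round, regress the residuals on every
    single feature and update only the best-fitting one, shrunk by nu."""
    p = len(X[0])
    coef = [0.0] * p
    resid = list(y)
    for _ in range(n_iter):
        best_j, best_beta, best_sse = None, 0.0, float("inf")
        for j in range(p):
            xj = [row[j] for row in X]
            denom = sum(v * v for v in xj)
            if denom == 0:
                continue
            beta = sum(v * r for v, r in zip(xj, resid)) / denom  # least-squares slope
            sse = sum((r - beta * v) ** 2 for r, v in zip(resid, xj))
            if sse < best_sse:
                best_j, best_beta, best_sse = j, beta, sse
        coef[best_j] += nu * best_beta
        resid = [r - nu * best_beta * row[best_j] for r, row in zip(resid, X)]
    return coef

# Toy data: y = 2 * x0 exactly; x1 is an uninformative column.
coef = fit_componentwise_boosting([[1, 0.1], [2, -0.2], [3, 0.3], [4, -0.1]], [2, 4, 6, 8])
```

Because only the informative feature is ever selected, the noise column keeps a zero coefficient, which is exactly the sparsity that makes the fitted model easy to read off.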
arXiv Detail & Related papers (2021-09-12T18:34:33Z)
- Interpret-able feedback for AutoML systems [5.5524559605452595]
Automated machine learning (AutoML) systems aim to enable training machine learning (ML) models for non-ML experts.
A shortcoming of these systems is that when they fail to produce a model with high accuracy, the user has no path to improve the model.
We introduce an interpretable data feedback solution for AutoML.
arXiv Detail & Related papers (2021-02-22T18:54:26Z)
- Optimization-driven Machine Learning for Intelligent Reflecting Surfaces Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surface (IRS) has been employed to reshape the wireless channels by controlling individual scattering elements' phase shifts.
Due to the large size of scattering elements, the passive beamforming is typically challenged by the high computational complexity.
In this article, we focus on machine learning (ML) approaches for improving performance in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.