Large Language Model Agent for Hyper-Parameter Optimization
- URL: http://arxiv.org/abs/2402.01881v2
- Date: Tue, 6 Feb 2024 15:03:09 GMT
- Title: Large Language Model Agent for Hyper-Parameter Optimization
- Authors: Siyi Liu, Chen Gao, Yong Li
- Abstract summary: We introduce a novel paradigm leveraging Large Language Models (LLMs) to automate hyperparameter optimization across diverse machine learning tasks.
AgentHPO processes the task information autonomously, conducts experiments with specific hyperparameters, and iteratively optimizes them.
This human-like optimization process largely reduces the number of required trials, simplifies the setup process, and enhances interpretability and user trust.
- Score: 30.560250427498243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperparameter optimization is critical in modern machine learning, requiring
expert knowledge, numerous trials, and high computational and human resources.
Despite the advancements in Automated Machine Learning (AutoML), challenges in
terms of trial efficiency, setup complexity, and interoperability still
persist. To address these issues, we introduce a novel paradigm leveraging
Large Language Models (LLMs) to automate hyperparameter optimization across
diverse machine learning tasks, which is named AgentHPO (short for LLM
Agent-based Hyperparameter Optimization). Specifically, AgentHPO processes the
task information autonomously, conducts experiments with specific
hyperparameters (HPs), and iteratively optimizes them based on historical
trials. This human-like optimization process largely reduces the number of
required trials, simplifies the setup process, and enhances interpretability
and user trust, compared to traditional AutoML methods. Extensive empirical
experiments conducted on 12 representative machine-learning tasks indicate that
AgentHPO not only matches but also often surpasses the best human trials in
terms of performance while simultaneously providing explainable results.
Further analysis sheds light on the strategies employed by the LLM in
optimizing these tasks, highlighting its effectiveness and adaptability in
various scenarios.
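The propose, evaluate, and refine cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `propose_next` is a hypothetical stand-in for the LLM agent (here it only perturbs the best configuration seen so far), `toy_objective` is an invented validation loss, and the hyperparameter names `lr` and `dropout` are assumptions chosen for illustration. The one point it is meant to show is that each new trial is conditioned on the history of earlier trials.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
Trial = Tuple[Config, float]  # (hyperparameters, validation loss)


def toy_objective(config: Config) -> float:
    """Hypothetical validation loss, minimized at lr=0.01 and dropout=0.2."""
    return (math.log10(config["lr"]) + 2.0) ** 2 + (config["dropout"] - 0.2) ** 2


def propose_next(history: List[Trial], rng: random.Random) -> Config:
    """Stand-in for the agent: suggest a config based on historical trials.

    A real agent would read the task description and trial log; this
    placeholder simply perturbs the best configuration found so far.
    """
    if not history:
        # No trials yet: start from a fixed (assumed) default configuration.
        return {"lr": 0.1, "dropout": 0.5}
    best_config, _ = min(history, key=lambda trial: trial[1])
    # Perturb lr on a log scale and dropout on a linear scale, then clamp.
    lr = best_config["lr"] * 10 ** rng.uniform(-0.5, 0.5)
    dropout = best_config["dropout"] + rng.uniform(-0.1, 0.1)
    return {
        "lr": min(1.0, max(1e-5, lr)),
        "dropout": min(0.9, max(0.0, dropout)),
    }


def optimize(
    objective: Callable[[Config], float], n_trials: int, seed: int = 0
) -> List[Trial]:
    """Run the iterative loop: propose, evaluate, record, repeat."""
    rng = random.Random(seed)
    history: List[Trial] = []
    for _ in range(n_trials):
        config = propose_next(history, rng)
        loss = objective(config)
        history.append((config, loss))
    return history


def best_trial(history: List[Trial]) -> Trial:
    """Return the trial with the lowest recorded loss."""
    return min(history, key=lambda trial: trial[1])


if __name__ == "__main__":
    trials = optimize(toy_objective, n_trials=20)
    config, loss = best_trial(trials)
    print(f"best config: {config}, loss: {loss:.4f}")
```

Swapping `propose_next` for a call to a language model that receives the trial log as text would keep the same loop structure; how AgentHPO actually builds its prompts and parses responses is not specified in this listing.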
Related papers
- AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval [93.96463520716759]
Large language model (LLM) agents have demonstrated impressive capability in utilizing external tools and knowledge to boost accuracy and reduce hallucinations.
Here, we introduce AvaTaR, a novel framework that optimizes an LLM agent to effectively use the provided tools and improve its performance on a given task/domain.
We find AvaTaR consistently outperforms state-of-the-art approaches across all four challenging tasks and exhibits strong generalization ability when applied to novel cases.
arXiv Detail & Related papers (2024-06-17T04:20:02Z)
- Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning [56.82041895921434]
Open-source pre-trained Large Language Models (LLMs) exhibit strong language understanding and generation capabilities.
When used as agents for dealing with complex problems in the real world, their performance is far inferior to large commercial models such as ChatGPT and GPT-4.
arXiv Detail & Related papers (2024-03-29T03:48:12Z)
- LLM can Achieve Self-Regulation via Hyperparameter Aware Generation [88.69052513433603]
Large Language Models (LLMs) employ diverse decoding strategies to control the generated text.
Are LLMs conscious of the existence of these decoding strategies and capable of regulating themselves?
We propose a novel text generation paradigm termed Hyperparameter Aware Generation (HAG).
arXiv Detail & Related papers (2024-02-17T11:18:22Z)
- Can LLMs Configure Software Tools [0.76146285961466]
In software engineering, the meticulous configuration of software tools is crucial in ensuring optimal performance within intricate systems.
In this study, we embark on an exploration of leveraging Large-Language Models (LLMs) to streamline the software configuration process.
Our work presents a novel approach that employs LLMs, such as Chat-GPT, to identify starting conditions and narrow down the search space, improving configuration efficiency.
arXiv Detail & Related papers (2023-12-11T05:03:02Z)
- AutoML-GPT: Automatic Machine Learning with GPT [74.30699827690596]
We propose developing task-oriented prompts and automatically utilizing large language models (LLMs) to automate the training pipeline.
We present AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters.
This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas.
arXiv Detail & Related papers (2023-05-04T02:09:43Z)
- Deep Ranking Ensembles for Hyperparameter Optimization [9.453554184019108]
We present a novel method that meta-learns neural network surrogates optimized for ranking the configurations' performances while modeling their uncertainty via ensembling.
In a large-scale experimental protocol comprising 12 baselines, 16 HPO search spaces and 86 datasets/tasks, we demonstrate that our method achieves new state-of-the-art results in HPO.
arXiv Detail & Related papers (2023-03-27T13:52:40Z)
- Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset [0.15420205433587747]
We present a two-step HPO method as a strategic solution to curbing computational demands and wait times.
We present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation.
arXiv Detail & Related papers (2023-02-08T02:38:26Z)
- Speeding Up Multi-Objective Hyperparameter Optimization by Task Similarity-Based Meta-Learning for the Tree-Structured Parzen Estimator [37.553558410770314]
In this paper, we extend TPE's acquisition function to the meta-learning setting using a task similarity defined by the overlap of top domains between tasks.
In the experiments, we demonstrate that our method speeds up MO-TPE on tabular HPO benchmarks and attains state-of-the-art performance.
arXiv Detail & Related papers (2022-12-13T17:33:02Z)
- Multi-objective hyperparameter optimization with performance uncertainty [62.997667081978825]
This paper presents results on multi-objective hyperparameter optimization with uncertainty on the evaluation of Machine Learning algorithms.
We combine the sampling strategy of Tree-structured Parzen Estimators (TPE) with the metamodel obtained after training a Gaussian Process Regression (GPR) with heterogeneous noise.
Experimental results on three analytical test functions and three ML problems show the improvement over multi-objective TPE and GPR.
arXiv Detail & Related papers (2022-09-09T14:58:43Z)
- Multi-Objective Hyperparameter Optimization in Machine Learning -- An Overview [10.081056751778712]
We introduce the basics of multi-objective hyperparameter optimization and motivate its usefulness in applied ML.
We provide an extensive survey of existing optimization strategies, both from the domain of evolutionary algorithms and Bayesian optimization.
We illustrate the utility of MOO in several specific ML applications, considering objectives such as operating conditions, prediction time, sparseness, fairness, interpretability and robustness.
arXiv Detail & Related papers (2022-06-15T10:23:19Z)
- Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for Hyperparameter Recommendation [83.85021205445662]
We propose an instantiation--amortized auto-tuning (AT2) to speed up tuning of machine learning models.
We conduct a thorough analysis of the multi-task multi-fidelity Bayesian optimization framework, which leads to the best instantiation--amortized auto-tuning (AT2).
arXiv Detail & Related papers (2021-06-17T00:01:18Z)