Fugu-MT paper translation (abstract): LLMSYS-HPOBench: Hyperparameter Optimization Benchmark Suite for Real-World LLM Systems

Paper overview: LLMSYS-HPOBench: Hyperparameter Optimization Benchmark Suite for Real-World LLM Systems

arxiv url: http://arxiv.org/abs/2605.08305v1
Date: Fri, 08 May 2026 12:53:03 GMT
Status: translation complete
Last updated in system: 2026-05-12 23:28:49.551954
Title: LLMSYS-HPOBench: Hyperparameter Optimization Benchmark Suite for Real-World LLM Systems
Title (reference translation): LLMSYS-HPOBench: Hyperparameter Optimization Benchmark Suite for Real-World LLM Systems
Authors: Siyu Wu, Yulong Ye, Zezhen Xiang, Pengzhou Chen, Gangda Xiong, Tao Chen
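Abstract summary: Large language model (LLM) systems are the frontier of AI in many application domains. This paper presents a benchmark suite and datasets for HPO of real-world LLM systems.
Reference score (independently computed attention score): 6.176867520244386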
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Model (LLM) systems have been the frontier of AI in many application domains, leading to new challenges and opportunities for hyperparameter optimization (HPO) for the AutoML community. However, this type of system exhibits an unprecedented compound space of hyperparameter configuration from both the AI and non-AI components; rich and nonlinear implications from the fidelity factors; and diverse costs of measuring hyperparameter configurations, none of which have been fully captured in existing benchmarks. This paper presents the first (live) benchmark suite and datasets for HPO of real-world LLM systems, dubbed LLMSYS-HPOBench, covering data related to the inference objective values of hyperparameter configurations profiled from running the LLM systems. Currently, LLMSYS-HPOBench contains 364,450 hyperparameter configurations with a dimensionality of 12-23, 3-5 dimensions of fidelity factor leading to 932 settings, 3-9 inference objective metrics, and 2-10 cost metrics, together with generated logs from measuring the LLM systems. What we seek to advocate is not only a revalidation of the existing HPO algorithms over the frontier LLM systems, but also to provide an evolving platform for the AutoML community to explore new directions of research in this regard. The benchmark suite has been made available at: https://github.com/ideas-labo/llmsys-hpobench
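Abstract (reference translation): Large language model (LLM) systems are the frontier of AI in many application domains, creating new challenges and opportunities for hyperparameter optimization (HPO) in the AutoML community. However, systems of this type exhibit an unprecedented compound space of hyperparameter configurations spanning both AI and non-AI components, rich and nonlinear implications from the fidelity factors, and diverse costs of measuring hyperparameter configurations, none of which is fully captured by existing benchmarks. This paper presents the first (live) benchmark suite and datasets for HPO of real-world LLM systems, called LLMSYS-HPOBench. Currently, LLMSYS-HPOBench contains 364,450 hyperparameter configurations with a dimensionality of 12-23, 3-5 dimensions of fidelity factors leading to 932 settings, 3-9 inference objective metrics, and 2-10 cost metrics, together with logs generated from measuring the LLM systems. What we aim to advocate is not only a revalidation of existing HPO algorithms on frontier LLM systems but also an evolving platform for the AutoML community to explore new research directions in this area. The benchmark suite is available at https://github.com/ideas-labo/llmsys-hpobench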
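
The abstract describes pre-measured hyperparameter configurations, each profiled under a fidelity setting and carrying objective and cost metrics. As a rough illustration only, the sketch below shows how an HPO algorithm might be run against a lookup table of that general shape, here a cost-budgeted random search. The table layout, the `Measurement` type, the metric names, and all values are hypothetical toy assumptions and are not taken from the LLMSYS-HPOBench repository or its API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record for one measured configuration (not the real schema)."""
    objectives: dict  # e.g. {"latency_ms": 120.0}
    costs: dict       # e.g. {"wall_clock_s": 30.0}


def random_search(table, fidelity, objective, cost_key, budget, seed=0):
    """Minimize `objective` by visiting pre-measured configurations in random
    order until the next measurement would exceed the cost budget."""
    rng = random.Random(seed)
    candidates = [cfg for (cfg, fid) in table if fid == fidelity]
    rng.shuffle(candidates)
    best_cfg, best_val, spent = None, float("inf"), 0.0
    for cfg in candidates:
        m = table[(cfg, fidelity)]
        cost = m.costs[cost_key]
        if spent + cost > budget:
            break  # stop before overspending the measurement budget
        spent += cost
        val = m.objectives[objective]
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val, spent


# Toy data: keys are (configuration, fidelity setting); all values are made up.
TOY_FIDELITY = (("input_len", 512),)
OTHER_FIDELITY = (("input_len", 2048),)
TOY_TABLE = {
    ((("batch_size", 8), ("temperature", 0.2)), TOY_FIDELITY):
        Measurement({"latency_ms": 140.0}, {"wall_clock_s": 30.0}),
    ((("batch_size", 16), ("temperature", 0.2)), TOY_FIDELITY):
        Measurement({"latency_ms": 95.0}, {"wall_clock_s": 45.0}),
    ((("batch_size", 32), ("temperature", 0.7)), TOY_FIDELITY):
        Measurement({"latency_ms": 110.0}, {"wall_clock_s": 25.0}),
    ((("batch_size", 8), ("temperature", 0.2)), OTHER_FIDELITY):
        Measurement({"latency_ms": 400.0}, {"wall_clock_s": 90.0}),
}

if __name__ == "__main__":
    print(random_search(TOY_TABLE, TOY_FIDELITY, "latency_ms", "wall_clock_s", budget=1000.0))
```

Because lookups replace live measurements, the search charges each configuration's recorded cost against the budget, which is one simple way to account for the diverse measurement costs the abstract mentions.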

Related papers