Fugu-MT Paper Translation (Abstract): Towards Robust Training in NNGPT AutoML Pipeline: A Loss-Optimizer Pairing Selection Study

Paper summary: Towards Robust Training in NNGPT AutoML Pipeline: A Loss-Optimizer Pairing Selection Study

arxiv url: http://arxiv.org/abs/2606.20933v1
Date: Thu, 18 Jun 2026 20:51:42 GMT
Status: Retrieving information
Last updated in system: 2026-06-23 11:29:00.081093
Title: Towards Robust Training in NNGPT AutoML Pipeline: A Loss-Optimizer Pairing Selection Study
Title (reference translation): Towards Robust Training in the NNGPT AutoML Pipeline: A Loss-Optimizer Pairing Selection Study
Authors: Anton Abramochkin, Radu Timofte, Dmitry Ignatov
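Abstract summary: This paper investigates whether a single recipe is sufficient for heterogeneous architecture pools, or whether the optimal pairing varies across structurally diverse models. The authors compare Cross-Entropy (CEL), Negative Log-Likelihood (NLL), and the recently introduced genetically evolved NGL loss across the base models of the LEMUR heterogeneous architecture pool on six image classification datasets. The results confirm that no single pairing is universally optimal. Cross-Entropy with Adam or AdamW is the most robust choice.
Reference score (independently computed attention score): 48.83701310501069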
License:
Abstract: The choice of loss function and optimizer is an important decision that shapes subsequent model training. Yet automated architecture search (AutoML) pipelines benefit significantly more from optimal pairing selection, and vice versa. This paper investigates whether a single recipe is sufficient for heterogeneous architecture pools, or whether the optimal pairing varies across structurally diverse models. We conduct a systematic empirical study of all $3 \times 6 = 18$ combinations of three loss functions, Cross-Entropy (CEL), Negative Log-Likelihood (NLL), and the recently introduced genetically evolved NGL loss, with six optimizers (SGD+Momentum, Adam, AdamW, RMSprop, Adagrad, Adadelta). The study covers the base models of the LEMUR heterogeneous architecture pool on six image classification datasets (CelebA-Gender, CIFAR-10, CIFAR-100, ImageNette, MNIST, SVHN). The 18 loss-optimizer configurations are applied to each of the 33 compatible base architectures taken from the LEMUR pool, resulting in $18 \times 33 = 594$ variants that were generated fully automatically by a source-level injection pipeline and evaluated under fixed hyperparameters, so that observed accuracy differences are attributable solely to the loss-optimizer pairing. Our results confirm that no single pairing is universally optimal. Cross-Entropy with Adam or AdamW is the most robust choice across architecture families and datasets. NGL is a competitive alternative to CEL on standard convolutional classifiers, but only when paired with adaptive optimizers; it degrades substantially with SGD or accumulation-based methods. Adagrad and Adadelta consistently underperform under fixed hyperparameters regardless of loss function, highlighting their sensitivity to learning rate tuning. These findings provide actionable guidance for loss-optimizer selection within the NNGPT framework.
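Abstract (reference translation): The choice of loss function and optimizer is an important decision that shapes subsequent model training. Yet automated architecture search (AutoML) pipelines benefit greatly from optimal pairing selection, and vice versa. This paper investigates whether a single recipe is sufficient for heterogeneous architecture pools, or whether the optimal pairing varies across structurally diverse models. We systematically study combinations of six optimizers (SGD+Momentum, Adam, AdamW, RMSprop, Adagrad, Adadelta) with three loss functions, Cross-Entropy (CEL), Negative Log-Likelihood (NLL), and the genetically evolved NGL loss, across the base models of the LEMUR heterogeneous architecture pool on six image classification datasets (CelebA-Gender, CIFAR-10, CIFAR-100, ImageNette, MNIST, SVHN). The resulting variants were generated automatically by a source-level injection pipeline and evaluated under fixed hyperparameters, ensuring that observed accuracy differences are attributable solely to the loss-optimizer pairing. The results confirm that no single pairing is universally optimal. Cross-Entropy with Adam or AdamW is the most robust choice across architecture families and datasets. NGL is a competitive alternative to CEL on standard convolutional classifiers, but only when paired with adaptive optimizers. Adagrad and Adadelta consistently underperform under fixed hyperparameters regardless of loss function, highlighting their sensitivity to learning rate tuning. These findings provide practical guidance for loss-optimizer selection within the NNGPT framework.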
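
The experimental grid described in the abstract can be sketched in a few lines of Python. This is only an illustration of the counting, not the authors' source-level injection pipeline: the loss and optimizer names come from the abstract, while the string identifiers and the helper functions `pairing_grid` and `count_variants` are hypothetical names introduced here.

```python
from itertools import product

# Loss and optimizer names as listed in the abstract; the string
# identifiers themselves are illustrative, not NNGPT or LEMUR API names.
LOSSES = ["CEL", "NLL", "NGL"]
OPTIMIZERS = ["SGD+Momentum", "Adam", "AdamW", "RMSprop", "Adagrad", "Adadelta"]
NUM_ARCHITECTURES = 33  # compatible LEMUR base architectures, per the abstract


def pairing_grid(losses, optimizers):
    """Return every (loss, optimizer) pairing as a list of tuples."""
    return list(product(losses, optimizers))


def count_variants(losses, optimizers, num_architectures):
    """Count the variants when each pairing is applied to each architecture."""
    return len(pairing_grid(losses, optimizers)) * num_architectures


grid = pairing_grid(LOSSES, OPTIMIZERS)
total_variants = count_variants(LOSSES, OPTIMIZERS, NUM_ARCHITECTURES)
print(len(grid), total_variants)
```

Enumerating the full Cartesian product, rather than a hand-picked subset, matches the abstract's setup in which every loss is paired with every optimizer and applied to every compatible architecture.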


Related papers