Fugu-MT Paper Translation (Abstract): Current Challenges of Symbolic Regression: Optimization, Selection, Model Simplification, and Benchmarking

Paper overview: Current Challenges of Symbolic Regression: Optimization, Selection, Model Simplification, and Benchmarking

arxiv url: http://arxiv.org/abs/2512.01682v1
Date: Mon, 01 Dec 2025 13:48:07 GMT
Status: Translation complete
System update date: 2025-12-02 19:46:34.878313
Title: Current Challenges of Symbolic Regression: Optimization, Selection, Model Simplification, and Benchmarking
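Title (reference translation): Recent Challenges in Symbolic Regression: Optimization, Selection, Model Simplification, and Benchmarking
Authors: Guilherme Seidyo Imai Aldeia
Abstract summary: Symbolic regression (SR) aims to find mathematical expressions that describe the relationships between variables. Current methods must be constantly re-evaluated to understand the SR landscape. This thesis addresses these challenges through a series of studies conducted throughout the doctorate.
Reference score (independently computed attention score): 0.0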
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Symbolic Regression (SR) is a regression method that aims to discover mathematical expressions that describe the relationship between variables, and it is often implemented through Genetic Programming, a metaphor for the process of biological evolution. Its appeal lies in combining predictive accuracy with interpretable models, but its promise is limited by several long-standing challenges: parameters are difficult to optimize, the selection of solutions can affect the search, and models often grow unnecessarily complex. In addition, current methods must be constantly re-evaluated to understand the SR landscape. This thesis addresses these challenges through a sequence of studies conducted throughout the doctorate, each focusing on an important aspect of the SR search process. First, I investigate parameter optimization, obtaining insights into its role in improving predictive accuracy, albeit with trade-offs in runtime and expression size. Next, I study parent selection, exploring $ε$-lexicase to select parents more likely to generate well-performing offspring. The focus then turns to simplification, where I introduce a novel method based on memoization and locality-sensitive hashing that reduces redundancy and yields simpler, more accurate models. All of these contributions are implemented in a multi-objective evolutionary SR library, which achieves Pareto-optimal performance in terms of accuracy and simplicity on benchmarks of real-world and synthetic problems, outperforming several contemporary SR approaches. The thesis concludes by proposing changes to a famous large-scale symbolic regression benchmark suite, then running the experiments to assess the symbolic regression landscape, demonstrating that an SR method with the contributions presented in this thesis achieves Pareto-optimal performance.
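Abstract (reference translation): Symbolic Regression (SR) is a regression method that aims to discover mathematical expressions describing the relationships between variables, and it is often implemented through Genetic Programming, a metaphor for the process of biological evolution. Its appeal lies in combining predictive accuracy with interpretable models, but its promise is limited: parameters are difficult to optimize, the selection of solutions can affect the search, and models grow more complex than necessary. In addition, current methods must be constantly re-evaluated to understand the SR landscape. This thesis addresses these challenges through a series of studies conducted throughout the doctorate, each focusing on an important aspect of the SR search process. First, I investigate parameter optimization and examine its role in improving predictive accuracy. Next, I study parent selection, exploring $ε$-lexicase for selecting parents. I then introduce a novel method based on memoization and locality-sensitive hashing that reduces redundancy and yields simpler, more accurate models. All of these contributions are implemented in a multi-objective evolutionary SR library, which achieves Pareto-optimal performance in terms of accuracy and simplicity on benchmarks of real-world and synthetic problems, outperforming contemporary SR approaches. The thesis concludes by proposing changes to a well-known large-scale symbolic regression benchmark suite and running the experiments to assess the symbolic regression landscape, demonstrating that an SR method with the contributions presented in this thesis achieves Pareto-optimal performance.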
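
Illustrative note: the abstract names $ε$-lexicase parent selection without describing it. The following is a minimal sketch of the general technique, not the thesis implementation. It assumes the commonly described variant in which ε for each training case is the median absolute deviation (MAD) of the population's errors on that case, and an individual survives a case if its error is within ε of the best error in the current pool. The function names and the `errors[i][j]` layout are hypothetical.

```python
import random
import statistics


def mad(values):
    """Median absolute deviation of a list of numbers."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def epsilon_lexicase_select(errors, rng=random):
    """Select one parent index with epsilon-lexicase selection.

    errors[i][j] is the absolute error of individual i on training case j
    (an assumed layout, chosen for this sketch).
    """
    n_cases = len(errors[0])
    # Assumption: epsilon per case is the MAD over the whole population.
    eps = [mad([row[j] for row in errors]) for j in range(n_cases)]

    pool = list(range(len(errors)))
    cases = list(range(n_cases))
    rng.shuffle(cases)  # each selection event uses a fresh random case order

    for j in cases:
        if len(pool) == 1:
            break
        best = min(errors[i][j] for i in pool)
        # Keep individuals whose error is within epsilon of the pool's best.
        pool = [i for i in pool if errors[i][j] <= best + eps[j]]

    # Ties that survive every case are broken at random.
    return rng.choice(pool)
```

Calling `epsilon_lexicase_select` once per required parent gives each parent its own random ordering of training cases, which is what lets individuals that do well on different subsets of the data be selected.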
