Fugu-MT Paper Translation (Abstract): LATTEArena: An Evaluation Framework for LLM-powered Tabular Feature Engineering (Extended Version)

Paper overview: LATTEArena: An Evaluation Framework for LLM-powered Tabular Feature Engineering (Extended Version)

arxiv url: http://arxiv.org/abs/2606.09004v2
Date: Tue, 16 Jun 2026 11:38:16 GMT
Status: Translation complete
Last updated in system: 2026-06-17 15:01:46.428931
Title: LATTEArena: An Evaluation Framework for LLM-powered Tabular Feature Engineering (Extended Version)
Authors: Ankai Hao, Ke Chen, Huan Li, Lidan Shou
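Abstract summary: LATTEArena is a standardized, modular benchmarking framework for automated feature engineering. By enabling controlled component-level comparisons, LATTEArena shifts the paradigm from ad-hoc prompt engineering to systematic context management. All code, datasets, and over 4,000 execution logs are publicly available to foster a dynamic, community-driven benchmark.
Reference score (independently computed attention level): 21.873402103599588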
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Feature engineering remains a cornerstone of tabular data analysis, and Large Language Models (LLMs) have emerged as a promising paradigm for its automation, giving rise to LLM-powered Automated Tabular Feature Engineering (LATTE). However, the field lacks standardized, cost-aware evaluation platforms, and the combinatorial explosion of design choices obscures true algorithmic progress. To bridge these gaps, we systematically deconstruct 15 representative LATTE methods into a unified 6-dimensional taxonomy. Based on this abstraction, we introduce LATTEArena, a standardized, modular, and extensible benchmarking framework that decouples monolithic pipelines into reusable execution blocks. By distilling the massive combinatorial space, we evaluate 24 core LATTE configurations across 7 research questions. Our head-to-head benchmarking goes beyond predictive accuracy to quantify token efficiency and execution robustness, yielding 17 empirical findings on cost-effectiveness trade-offs. Furthermore, we provide 3 concrete recommendations for optimal real-world deployment. By enabling controlled component-level comparisons, LATTEArena shifts the paradigm from ad-hoc prompt engineering to systematic context management. All code, datasets, and over 4,000 execution logs are publicly available to foster a dynamic, community-driven benchmark. Our framework, leaderboard, and all artifacts are hosted on the LATTEArena project website at https://goodenhak.github.io/LATTEArena.
