Fugu-MT Paper Translation (Abstract): Can AI Make Energy Retrofit Decisions? An Evaluation of Large Language Models

Paper overview: Can AI Make Energy Retrofit Decisions? An Evaluation of Large Language Models

arxiv url: http://arxiv.org/abs/2509.06307v1
Date: Mon, 08 Sep 2025 03:13:47 GMT
Status: Translation complete
System update date: 2025-09-09 14:07:03.954999
Title: Can AI Make Energy Retrofit Decisions? An Evaluation of Large Language Models
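Title (reference translation): Can AI Make Energy-Saving Decisions? An Evaluation of Large Language Models
Authors: Lei Shu, Dong Zhao
Abstract summary: Generative AI, especially large language models (LLMs), may help by processing contextual information and producing practitioner-readable recommendations. We evaluate seven LLMs on residential retrofit decisions under two objectives: maximizing CO2 reduction (technical) and minimizing payback period (sociotechnical). LLMs generate effective recommendations in many cases, reaching up to 54.5 percent top-1 match and 92.8 percent within the top 5 without fine-tuning.
Reference score (independently computed attention score): 6.392935342375115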
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Conventional approaches to building energy retrofit decision-making suffer from limited generalizability and low interpretability, hindering adoption in diverse residential contexts. With the growth of Smart and Connected Communities, generative AI, especially large language models (LLMs), may help by processing contextual information and producing practitioner-readable recommendations. We evaluate seven LLMs (ChatGPT, DeepSeek, Gemini, Grok, Llama, and Claude) on residential retrofit decisions under two objectives: maximizing CO2 reduction (technical) and minimizing payback period (sociotechnical). Performance is assessed on four dimensions: accuracy, consistency, sensitivity, and reasoning, using a dataset of 400 homes across 49 US states. LLMs generate effective recommendations in many cases, reaching up to 54.5 percent top-1 match and 92.8 percent within the top 5 without fine-tuning. Performance is stronger for the technical objective, while sociotechnical decisions are limited by economic trade-offs and local context. Agreement across models is low, and higher-performing models tend to diverge from others. LLMs are sensitive to location and building geometry but less sensitive to technology and occupant behavior. Most models show step-by-step, engineering-style reasoning, but it is often simplified and lacks deeper contextual awareness. Overall, LLMs are promising assistants for energy retrofit decision-making, but improvements in accuracy, consistency, and context handling are needed for reliable practice.
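Abstract (reference translation): Conventional approaches to energy retrofit decision-making suffer from limited generalizability and low interpretability, which hinders their adoption in diverse residential settings. With the growth of Smart and Connected Communities, generative AI, especially large language models (LLMs), can help process contextual information and generate practical recommendations. We evaluated seven LLMs (ChatGPT, DeepSeek, Gemini, Grok, Llama, Claude) on residential retrofit decisions under two objectives: maximizing CO2 reduction (technical) and minimizing payback period (sociotechnical). Performance is evaluated on four dimensions (accuracy, consistency, sensitivity, and reasoning) using a dataset of 400 homes across 49 states. LLMs generate effective recommendations in many cases, reaching up to 54.5 percent top-1 match and 92.8 percent within the top 5 without fine-tuning. Performance is stronger for the technical objective, while sociotechnical decisions are limited by economic trade-offs and local context. Agreement across models is low, and higher-performing models tend to differ from the others. LLMs are sensitive to location and building geometry but less sensitive to technology and occupant behavior. Most models show step-by-step, engineering-style reasoning, but it is often simplified and lacks deeper contextual awareness. Overall, LLMs are promising assistants for energy retrofit decision-making, but improvements in accuracy, consistency, and context handling are needed for reliable practice.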
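The abstract reports accuracy as a top-1 match rate and a within-top-5 rate. It does not define these metrics precisely, so the following is only a minimal sketch of one plausible reading: a recommendation counts as a hit if it appears among the k best-ranked reference measures for that home. The function name `top_k_match_rate` and the measure labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def top_k_match_rate(
    recommended: Sequence[str],
    ranked_reference: Sequence[Sequence[str]],
    k: int,
) -> float:
    """Fraction of homes whose recommended measure is among the k
    best-ranked reference measures (assumed metric definition)."""
    if len(recommended) != len(ranked_reference):
        raise ValueError("one reference ranking is required per home")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not recommended:
        return 0.0
    hits = sum(
        1
        for rec, ranking in zip(recommended, ranked_reference)
        if rec in ranking[:k]
    )
    return hits / len(recommended)


# Hypothetical data: one model recommendation per home, plus a reference
# ranking of measures per home (best first) for a single objective.
recs = ["heat_pump", "attic_insulation", "led_lighting", "air_sealing"]
reference = [
    ["heat_pump", "attic_insulation", "air_sealing"],
    ["air_sealing", "attic_insulation", "heat_pump"],
    ["heat_pump", "air_sealing", "attic_insulation"],
    ["air_sealing", "heat_pump", "led_lighting"],
]

top1 = top_k_match_rate(recs, reference, k=1)
top2 = top_k_match_rate(recs, reference, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

Under this reading, the rate can only stay the same or rise as k grows, which is consistent with the reported gap between the top-1 and top-5 figures.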


Related papers