Fugu-MT Paper Translation (Abstract): Babbling Suppression: Making LLMs Greener One Token at a Time

Paper abstract: Babbling Suppression: Making LLMs Greener One Token at a Time

arxiv url: http://arxiv.org/abs/2604.06755v1
Date: Wed, 08 Apr 2026 07:21:02 GMT
Status: Translation complete
System update date: 2026-04-09 17:30:51.39208
Title: Babbling Suppression: Making LLMs Greener One Token at a Time
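Title (reference translation): "Babbling Suppression": Making LLMs Greener One Token at a Time
Authors: Lola Solovyeva, Fernando Castor
Abstract summary: Large language models (LLMs) are increasingly used in modern software development. They often produce extraneous output, referred to as "babbling", which incurs additional cognitive, economic, and energy costs. This work proposes a practical, model-agnostic approach to reduce unnecessary output without compromising solution accuracy.
Reference score (independently calculated attention level): 46.879983975894135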
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Context: Large Language Models (LLMs) are increasingly used in modern software development, aiding in code generation, code completion, and refactoring through AI-powered assistants. While they accelerate development workflows, they often produce extraneous output, referred to as "babbling", which incurs additional cognitive, economic, and energy costs. Objective: This work investigates the problem of babbling in LLM-based code generation and proposes a practical, model-agnostic approach to reduce unnecessary output without compromising solution accuracy. Method: We introduce Babbling Suppression (BS), a method that integrates test execution into the LLM generation process by evaluating intermediate outputs and terminating generation once a solution passes all tests. This prevents excessive token generation while having no impact on model accuracy. An empirical study was conducted across two Python and two Java benchmarks, targeting four 3-4B parameter models and six 6-7B parameter models. Results: Our findings show that babbling occurs across all tested models, with higher frequency in Java than in Python. Applying BS significantly reduces energy consumption by up to 65% for Python and 62% for Java in models prone to babbling. Across 40 model-benchmark pairs, 29 showed reduced mean energy consumption, with reductions exceeding 20% in 22 cases. Generated token count decreased in 35 pairs, while the GPU energy-per-token overhead of BS remained below 10% for 26 pairs, decreased for 2, and reached a maximum of 24%, yielding net energy savings in most cases. Implications: BS can make AI-assisted programming more efficient and sustainable by reducing energy consumption and minimizing cognitive effort by developers. Its model-agnostic design allows easy integration, suggesting broad applicability.
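Abstract (reference translation): Context: Large language models (LLMs) are increasingly used in modern software development, including code generation, code completion, and refactoring through AI-powered assistants. While they accelerate development workflows, they often generate extraneous output called "babbling", which increases cognitive, economic, and energy costs. Objective: This work investigates the problem of babbling in LLM-based code generation and proposes a practical, model-agnostic approach to reduce unnecessary output without compromising solution accuracy. Method: We introduce Babbling Suppression (BS), a method that integrates test execution into the LLM generation process, evaluating intermediate outputs and stopping generation once a solution passes all tests. This prevents excessive token generation without affecting model accuracy. An empirical study was conducted on two Python and two Java benchmarks, targeting four 3-4B parameter models and six 6-7B parameter models. Results: Babbling occurs in all tested models, and more frequently in Java than in Python. In models prone to babbling, applying BS reduces energy consumption by up to 65% for Python and 62% for Java. Of 40 model-benchmark pairs, 29 showed reduced mean energy consumption, with reductions exceeding 20% in 22 cases. Generated token count decreased in 35 pairs, while the GPU energy-per-token overhead of BS stayed below 10% for 26 pairs, decreased for 2, and reached a maximum of 24%, yielding net energy savings in most cases. Implications: BS can make AI-assisted programming more efficient and sustainable by reducing energy consumption and minimizing the cognitive effort required of developers. Its model-agnostic design allows easy integration, suggesting broad applicability.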
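
The Method described in the abstract (run the tests on intermediate outputs and stop generating once a solution passes all of them) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `passes_tests`, `generate_with_suppression`, `fake_model`, and `add_tests` are hypothetical, the "model" is a hard-coded stream of text chunks, and checking at every newline is an assumed granularity that the abstract does not specify.

```python
from typing import Callable, Iterable, Iterator, Tuple


def passes_tests(code: str, tests: Callable[[dict], None]) -> bool:
    """Run the candidate code, then the tests; any exception counts as a failure.

    Assumption for illustration only: exec() in-process. Real generated code
    should be run in a sandboxed subprocess with a timeout.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)  # incomplete code usually raises SyntaxError here
        tests(namespace)       # tests signal failure by raising AssertionError
        return True
    except Exception:
        return False


def generate_with_suppression(
    chunks: Iterable[str], tests: Callable[[dict], None]
) -> Tuple[str, int]:
    """Consume chunks until the accumulated output passes all tests.

    Returns the accepted output and the number of chunks consumed.
    If the tests never pass, the full output is returned unchanged.
    """
    output = ""
    consumed = 0
    for chunk in chunks:
        output += chunk
        consumed += 1
        # Only check at line boundaries to limit the test-execution overhead.
        if chunk.endswith("\n") and passes_tests(output, tests):
            break  # stop early: everything after this point would be babbling
    return output, consumed


def fake_model() -> Iterator[str]:
    """Hypothetical stand-in for an LLM: a solution followed by extra output."""
    yield "def add(a, b):\n"
    yield "    return a + b\n"
    yield "\n# Example usage:\n"
    yield "print(add(1, 2))\n"
    yield "# This function adds two numbers and returns the result.\n"


def add_tests(ns: dict) -> None:
    """Hypothetical benchmark tests for the task 'implement add(a, b)'."""
    assert ns["add"](1, 2) == 3
    assert ns["add"](-1, 1) == 0


if __name__ == "__main__":
    code, used = generate_with_suppression(fake_model(), add_tests)
    print(f"stopped after {used} of 5 chunks")
    print(code)
```

In this toy run, generation stops after the second chunk, so the three trailing chunks are never requested. Because the loop only ever truncates output that already passes the tests, a solution that would have passed without suppression still passes with it, which matches the abstract's claim that accuracy is unaffected.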
