Fugu-MT paper translation (abstract): TACT: Mitigating Overthinking and Overacting in Coding Agents via Activation Steering

Paper overview: TACT: Mitigating Overthinking and Overacting in Coding Agents via Activation Steering

arxiv url: http://arxiv.org/abs/2605.05980v1
Date: Thu, 07 May 2026 10:24:27 GMT
Status: Translation complete
Last updated in system: 2026-05-08 22:27:11.699005
Title: TACT: Mitigating Overthinking and Overacting in Coding Agents via Activation Steering
Title (reference translation): TACT: Mitigating Overthinking and Overacting in Coding Agents via Activation Steering
Authors: Yuan Sui, Yulin Chen, Yibo Li, Xue Jiang, Yufei He, Yihong Dong, Xiaoxin He, Tianyu Gao, Bryan Hooi
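Abstract summary: We focus on two recurring failure modes: one where the agent repeatedly reasons over information it already has, and one where it issues tool calls without integrating recent observations or acquiring new evidence. We introduce TACT (Think-Act Calibration via activation Steering) to detect and mitigate agent drift in the residual stream before it surfaces as a behavioral failure. Specifically, we label trajectory steps as overthinking, overacting, or calibrated, and find that their hidden states separate linearly along two *drift axes* that point from calibrated behavior toward each failure mode.
Reference score (independently computed attention score): 70.99933391739154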
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: When language model agents tackle complex software engineering tasks, they often degrade over long trajectories, which we define as *agent drift*. We focus on two recurring failure modes, *overthinking* and *overacting*: in the first, the agent repeatedly reasons over information it already has; in the second, it issues tool calls without integrating recent observations or acquiring new evidence. In this paper, we introduce TACT (Think-Act Calibration via activation Steering) to detect and mitigate agent drift in the residual stream before it surfaces as a behavioral failure. Specifically, we label trajectory steps as overthinking, overacting, or calibrated, and find that their hidden states separate linearly along two *drift axes*, pointing from calibrated behavior toward each failure mode (AUC $\approx$ 0.9). To mitigate agent drift, we project each step's activation onto these axes at test time and pull drifted ones back toward the calibrated region. Experiments show that TACT outperforms unsteered baselines across SWE-bench Verified, Terminal-Bench 2.0, and CLAW-Eval, lifting average resolve rate by $+5.8$ pp on Qwen3.5-27B and $+4.8$ pp on Gemma-4-26B-A4B-it while cutting steps-to-resolve by up to $26\%$. These gains frame agent drift as a steerable direction in the residual stream, and position TACT as a viable handle for reliable long-horizon agents.
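Abstract (reference translation): Language model agents working on complex software engineering tasks often degrade over long trajectories, a phenomenon the authors define as *agent drift*. The paper concentrates on two recurring failure modes, *overthinking* and *overacting*: the agent either reasons repeatedly over information it already has, or issues tool calls without integrating recent observations or acquiring new evidence. The paper introduces TACT (Think-Act Calibration via activation Steering), which detects and mitigates agent drift in the residual stream before it appears as a behavioral failure. Trajectory steps are labeled as overthinking, overacting, or calibrated, and their hidden states are found to separate linearly along two *drift axes* that point from calibrated behavior toward each failure mode (AUC $\approx$ 0.9). To mitigate drift, each step's activation is projected onto these axes at test time, and drifted activations are pulled back toward the calibrated region. In experiments, TACT outperforms unsteered baselines on SWE-bench Verified, Terminal-Bench 2.0, and CLAW-Eval, raising the average resolve rate by $+5.8$ pp on Qwen3.5-27B and $+4.8$ pp on Gemma-4-26B-A4B-it while reducing steps-to-resolve by up to $26\%$. These gains frame agent drift as a steerable direction in the residual stream and position TACT as a viable handle for reliable long-horizon agents.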
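
The project-and-pull-back step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the drift axes are estimated or how strongly activations are corrected, so everything concrete here is an assumption made for illustration: the axis is taken as a normalized difference of class means, and `threshold` and `strength` are hypothetical parameters. Plain Python lists stand in for residual-stream activations; this is not the authors' implementation.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a set of hidden-state vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def drift_axis(calibrated: Sequence[Sequence[float]],
               drifted: Sequence[Sequence[float]]) -> Vector:
    """Unit vector pointing from the calibrated mean toward the drifted mean.

    Assumption: a difference-of-means direction; the paper may use another estimator.
    """
    mu_c = mean_vector(calibrated)
    mu_d = mean_vector(drifted)
    diff = [d - c for d, c in zip(mu_d, mu_c)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("calibrated and drifted means coincide; no axis defined")
    return [x / norm for x in diff]


def projection(hidden: Sequence[float], axis: Sequence[float],
               origin: Sequence[float]) -> float:
    """Signed distance of `hidden` from `origin` along the unit-length `axis`."""
    return sum((h - o) * a for h, o, a in zip(hidden, origin, axis))


def steer(hidden: Sequence[float], axis: Sequence[float],
          origin: Sequence[float], threshold: float,
          strength: float = 1.0) -> Vector:
    """Pull an activation back along the axis if it lies beyond `threshold`.

    `threshold` and `strength` are hypothetical knobs, not values from the paper.
    Activations at or below the threshold are returned unchanged.
    """
    excess = projection(hidden, axis, origin) - threshold
    if excess <= 0.0:
        return list(hidden)
    return [h - strength * excess * a for h, a in zip(hidden, axis)]


# Toy labeled hidden states (3 dimensions) for one failure mode.
calibrated_states = [[0.0, 0.0, 0.0], [0.2, 0.0, 0.1], [-0.2, 0.0, -0.1]]
overthinking_states = [[3.0, 4.0, 0.0], [3.2, 4.0, 0.1], [2.8, 4.0, -0.1]]

axis = drift_axis(calibrated_states, overthinking_states)
origin = mean_vector(calibrated_states)
steered = steer([3.0, 4.0, 1.0], axis, origin, threshold=2.0)
```

With `strength=1.0`, a drifted activation is moved so that its projection lands exactly on the threshold, while components orthogonal to the axis are left untouched. Handling both failure modes would mean repeating this with a second axis estimated from overacting steps.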


Related papers