Fugu-MT Paper Translation (Abstract): VORTEX: Aligning Task Utility and Human Preferences through LLM-Guided Reward Shaping

Paper overview: VORTEX: Aligning Task Utility and Human Preferences through LLM-Guided Reward Shaping

arxiv url: http://arxiv.org/abs/2509.16399v1
Date: Fri, 19 Sep 2025 20:22:13 GMT
Status: Translation complete
Last updated in system: 2025-09-23 18:58:15.774014
Title: VORTEX: Aligning Task Utility and Human Preferences through LLM-Guided Reward Shaping
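Title (reference translation): VORTEX: Aligning Task Utility and Human Preferences through LLM-Guided Reward Shaping
Authors: Guojun Xiong, Milind Tambe
Abstract summary: In social impact optimization, AI decision systems often rely on solvers that optimize well-calibrated mathematical objectives. Recent approaches use large language models to generate new reward functions from preference descriptions. We propose VORTEX, a language-guided reward shaping framework that preserves established optimization goals while adaptively incorporating human feedback.
Reference score (independently computed attention score): 40.48402462300208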
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In social impact optimization, AI decision systems often rely on solvers that optimize well-calibrated mathematical objectives. However, these solvers cannot directly accommodate evolving human preferences, typically expressed in natural language rather than formal constraints. Recent approaches address this by using large language models (LLMs) to generate new reward functions from preference descriptions. While flexible, they risk sacrificing the system's core utility guarantees. In this paper, we propose VORTEX, a language-guided reward shaping framework that preserves established optimization goals while adaptively incorporating human feedback. By formalizing the problem as multi-objective optimization, we use LLMs to iteratively generate shaping rewards based on verbal reinforcement and text-gradient prompt updates. This allows stakeholders to steer decision behavior via natural language without modifying solvers or specifying trade-off weights. We provide theoretical guarantees that VORTEX converges to Pareto-optimal trade-offs between utility and preference satisfaction. Empirical results in real-world allocation tasks demonstrate that VORTEX outperforms baselines in satisfying human-aligned coverage goals while maintaining high task performance. This work introduces a practical and theoretically grounded paradigm for human-AI collaborative optimization guided by natural language.
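Abstract (reference translation): In social impact optimization, AI decision systems often rely on solvers that optimize well-calibrated mathematical objectives. However, these solvers cannot directly adapt to evolving human preferences, which are typically expressed in natural language rather than as formal constraints. Recently, methods have been proposed that use large language models (LLMs) to generate new reward functions from preference descriptions. Although flexible, they risk sacrificing the system's core utility guarantees. This paper proposes VORTEX, a language-guided reward shaping framework that preserves established optimization goals while adaptively incorporating human feedback. By formulating the problem as multi-objective optimization, LLMs iteratively generate shaping rewards based on verbal reinforcement and text-gradient prompt updates. This lets stakeholders steer decision behavior through natural language without modifying solvers or specifying trade-off weights. The paper provides theoretical guarantees that VORTEX converges to Pareto-optimal trade-offs between utility and preference satisfaction. Empirical results on real-world allocation tasks show that VORTEX outperforms baselines in satisfying human-aligned coverage goals while maintaining high task performance. This work introduces a practical and theoretically grounded paradigm for human-AI collaborative optimization guided by natural language.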
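
The idea of steering an unchanged solver with an additive shaping reward can be illustrated with a toy allocation example. This is a minimal sketch under stated assumptions, not the VORTEX algorithm: the top-k solver, the coverage preference, and the fixed-step update (a placeholder for the LLM-generated shaping rewards the abstract describes) are all hypothetical, as are the names `solve_top_k` and `shape_and_solve`.

```python
from typing import List, Tuple


def solve_top_k(scores: List[float], k: int) -> List[int]:
    """Stand-in solver (assumption): pick the k highest-scoring items.

    Ties are broken by lower index. The solver itself is never modified;
    only the scores it receives change.
    """
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(order[:k])


def shape_and_solve(
    utility: List[float],
    group: List[str],
    target_group: str,
    min_count: int,
    k: int,
    step: float = 0.5,
    max_rounds: int = 20,
) -> Tuple[List[int], List[float]]:
    """Iteratively add a shaping reward until a coverage preference is met.

    The base utility is kept intact; the shaping term is added on top of it.
    """
    shaping = [0.0] * len(utility)
    chosen: List[int] = []
    for _ in range(max_rounds):
        shaped = [u + s for u, s in zip(utility, shaping)]
        chosen = solve_top_k(shaped, k)
        covered = sum(1 for i in chosen if group[i] == target_group)
        if covered >= min_count:
            break
        # Placeholder for the feedback step: a fixed increment here,
        # where the paper's framework would use an LLM-generated update.
        for i, g in enumerate(group):
            if g == target_group:
                shaping[i] += step
    return chosen, shaping


utility = [9.0, 8.0, 7.0, 6.5, 6.0]
group = ["a", "a", "a", "b", "b"]
chosen, shaping = shape_and_solve(utility, group, target_group="b", min_count=1, k=3)
print(chosen, shaping, sum(utility[i] for i in chosen))
```

In this toy run the preference "select at least one item from group b" is satisfied after two shaping updates, and the base utility of the selection drops from 24.0 to 23.5, which illustrates the kind of utility versus preference trade-off the abstract refers to.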

Related papers