Fugu-MT Paper Translation (Abstract): ViSA: Visited-State Augmentation for Generalized Goal-Space Contrastive Reinforcement Learning

Paper overview: ViSA: Visited-State Augmentation for Generalized Goal-Space Contrastive Reinforcement Learning

arxiv url: http://arxiv.org/abs/2603.14887v2
Date: Wed, 18 Mar 2026 03:06:20 GMT
Status: Translation complete
Last updated in system: 2026-03-21 18:33:56.848854
Title: ViSA: Visited-State Augmentation for Generalized Goal-Space Contrastive Reinforcement Learning
Title (reference translation): ViSA: Visited-State Augmentation for Generalized Goal-Space Contrastive Reinforcement Learning
Authors: Issa Nakamura, Tomoya Yamanokuchi, Yuki Kadokawa, Jia Qu, Shun Otsub, Ken Miyamoto, Shotaro Miwa, Takamitsu Matsubara
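Abstract summary: We propose a new data augmentation method for Contrastive Reinforcement Learning (CRL) called ViSA (Visited-State Augmentation). ViSA consists of two components: 1) generating augmented state samples and 2) learning a consistent embedding space. It improves goal-space generalization, enabling accurate value estimation for hard-to-visit goals.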
Reference score (independently computed attention metric): 3.554237279000473
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Goal-Conditioned Reinforcement Learning (GCRL) is a framework for learning a policy that can reach arbitrarily given goals. In particular, Contrastive Reinforcement Learning (CRL) provides a framework for policy updates using an approximation of the value function estimated via contrastive learning, achieving higher sample efficiency than conventional methods. However, since CRL treats visited states as pseudo-goals during learning, it can accurately estimate the value function only for a limited set of goals. To address this issue, we propose a novel data augmentation approach for CRL called ViSA (Visited-State Augmentation). ViSA consists of two components: 1) generating augmented state samples, with the aim of augmenting hard-to-visit state samples during on-policy exploration, and 2) learning a consistent embedding space, which uses an augmented state as auxiliary information to regularize the embedding space by reformulating the objective function of the embedding space based on mutual information. We evaluate ViSA in simulation and real-world robotic tasks and show improved goal-space generalization, which permits accurate value estimation for hard-to-visit goals. Further details can be found on the project page: https://issa-n.github.io/projectPage_ViSA/
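Abstract (reference translation): GCRL (Goal-Conditioned Reinforcement Learning) is a framework for learning a policy that reaches arbitrary goals. In particular, Contrastive Reinforcement Learning (CRL) provides a framework for performing policy updates using an approximation of the value function estimated by contrastive learning. However, because CRL treats visited states as pseudo-goals during learning, it can estimate the value function accurately only for a limited set of goals. We therefore propose a new data augmentation method for CRL called ViSA (Visited-State Augmentation). ViSA consists of two components: 1) generating augmented state samples, with the aim of augmenting hard-to-visit state samples during on-policy exploration, and 2) learning a consistent embedding space, in which the augmented state is used as auxiliary information to regularize the embedding space by reformulating its objective function based on mutual information. We evaluate ViSA in simulation and real-world robotic tasks and show improved goal-space generalization, which enables accurate value estimation for hard-to-visit goals. Details are available on the project page: https://issa-n.github.io/projectPage_ViSA/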
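The abstract's point that CRL "treats the visited state as a pseudo-goal" can be illustrated with a minimal sketch of a generic contrastive critic objective. This is not the ViSA method and not the authors' code: the function names (`sample_pseudo_goal`, `info_nce_loss`), the geometric offset with `gamma`, and the inner-product critic are all assumptions chosen for illustration. Each state is paired with a state visited later in the same trajectory (the positive), and the goals paired with the other states in the batch act as negatives.

```python
import math
import random


def sample_pseudo_goal(trajectory, t, rng, gamma=0.9):
    """Return a state visited after step t to serve as the goal for step t.

    Assumption: the offset is drawn geometrically and clipped to the last
    index. If t is already the last index, the last state itself is returned.
    """
    last = len(trajectory) - 1
    offset = 1
    while rng.random() < gamma and t + offset < last:
        offset += 1
    return trajectory[min(t + offset, last)]


def dot(u, v):
    """Inner product of two equal-length vectors (the assumed critic)."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_loss(state_embs, goal_embs):
    """Mean cross-entropy over a batch.

    Row i's positive is goal i; every other goal in the batch is a negative.
    """
    n = len(state_embs)
    total = 0.0
    for i in range(n):
        logits = [dot(state_embs[i], goal_embs[j]) for j in range(n)]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / n
```

Because the positives are drawn only from states the policy actually visited, goals far from the visited set never appear as positives in this objective, which is the limitation the abstract says ViSA targets with augmented state samples.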


Related papers