Fugu-MT Paper Translation (Summary): Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions

Paper summary: Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions

arxiv url: http://arxiv.org/abs/2402.13777v4
Date: Mon, 26 Feb 2024 03:11:41 GMT
Status: Translation complete
Last updated in system: 2024-02-27 18:18:46.507814
Title: Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions
Title (reference translation): Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions
Authors: Jiayu Chen, Bhargav Ganguly, Yang Xu, Yongsheng Mei, Tian Lan, Vaneet Aggarwal
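Abstract summary: We provide the first systematic review of the applications of deep generative models in offline policy learning. We cover five mainstream deep generative models: Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models. For each type of DGM-based offline policy learning, we distill its fundamental scheme, categorize related works based on the usage of the DGM, and sort out the development process of the algorithms.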
Reference score (independently computed attention score): 42.91411985417056
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also necessitate learning a generator function from the offline data to serve as the strategy or policy. In this case, applying deep generative models in offline policy learning exhibits great potential, and numerous studies have explored this direction. However, this field still lacks a comprehensive review and so developments of different branches are relatively independent. Thus, we provide the first systematic review on the applications of deep generative models for offline policy learning. In particular, we cover five mainstream deep generative models, including Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models, and their applications in both offline reinforcement learning (offline RL) and imitation learning (IL). Offline RL and IL are two main branches of offline policy learning and are widely adopted techniques for sequential decision-making. Specifically, for each type of DGM-based offline policy learning, we distill its fundamental scheme, categorize related works based on the usage of the DGM, and sort out the development process of algorithms in that field. Subsequent to the main content, we provide in-depth discussions on deep generative models and offline policy learning as a summary, based on which we present our perspectives on future research directions. This work offers a hands-on reference for the research progress in deep generative models for offline policy learning, and aims to inspire improved DGM-based offline RL or IL algorithms. For convenience, we maintain a paper list on https://github.com/LucasCJYSDL/DGMs-for-Offline-Policy-Learning.
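Abstract (reference translation): Deep generative models (DGMs) have achieved great success across various domains by generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control need to learn a generator function from offline data to serve as the strategy or policy. In this case, applying deep generative models to offline policy learning shows great potential, and many studies have pursued this direction. However, the field lacks a comprehensive review, so the development of different branches is relatively independent. We therefore provide the first systematic review of the applications of deep generative models in offline policy learning. In particular, we cover five mainstream deep generative models (Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models) and their applications in offline reinforcement learning (offline RL) and imitation learning (IL). Offline RL and IL are the two main branches of offline policy learning and are widely adopted techniques for sequential decision-making. Specifically, for each type of DGM-based offline policy learning, we distill the fundamental scheme, categorize related works based on the usage of the DGM, and sort out the development process of the algorithms in that field. Following the main content, we present in-depth discussions on deep generative models and offline policy learning as a summary, and on that basis give our perspectives on future research directions. This work offers a hands-on reference for research progress in deep generative models for offline policy learning and aims to inspire improved DGM-based offline RL or IL algorithms. For convenience, we maintain a paper list at https://github.com/LucasCJYSDL/DGMs-for-Offline-Policy-Learning.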

Related papers