Decomposed Mutual Information Optimization for Generalized Context in
Meta-Reinforcement Learning
- URL: http://arxiv.org/abs/2210.04209v1
- Date: Sun, 9 Oct 2022 09:44:23 GMT
- Title: Decomposed Mutual Information Optimization for Generalized Context in
Meta-Reinforcement Learning
- Authors: Yao Mu, Yuzheng Zhuang, Fei Ni, Bin Wang, Jianyu Chen, Jianye Hao,
Ping Luo
- Abstract summary: Multiple confounders can influence the transition dynamics, making it challenging to infer accurate context for decision-making.
This paper addresses this challenge with Decomposed Mutual INformation Optimization (DOMINO) for context learning.
Our theoretical analysis shows that DOMINO can overcome the underestimation of mutual information caused by multiple confounders.
- Score: 35.87062321504049
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adapting to the changes in transition dynamics is essential in robotic
applications. By learning a conditional policy with a compact context,
context-aware meta-reinforcement learning provides a flexible way to adjust
behavior according to dynamics changes. However, in real-world applications,
the agent may encounter complex dynamics changes. Multiple confounders can
influence the transition dynamics, making it challenging to infer accurate
context for decision-making. This paper addresses this challenge with
Decomposed Mutual INformation Optimization (DOMINO) for context learning, which
explicitly learns a disentangled context that maximizes the mutual information
between the context and historical trajectories while minimizing the
state-transition prediction error. Our theoretical analysis shows that, by
learning a disentangled context, DOMINO overcomes the underestimation of mutual
information caused by multiple confounders and reduces the number of samples
that must be collected across environments. Extensive experiments show that the
context learned by DOMINO benefits both model-based and model-free
reinforcement learning algorithms for dynamics generalization, in terms of both
sample efficiency and performance in unseen environments.
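For intuition, the objective described in the abstract can be sketched as follows: a trajectory encoder that outputs several context components, a per-component contrastive (InfoNCE-style) lower bound on the mutual information between context and trajectory, and a dynamics head trained to minimize state-transition prediction error. The module names, sizes, and the choice of an InfoNCE-style estimator below are assumptions for illustration, not the authors' reference implementation.

```python
# Minimal sketch of a decomposed mutual-information objective for context
# learning (illustrative only; not the DOMINO reference implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextEncoder(nn.Module):
    """Encodes a trajectory segment into K separate context components."""
    def __init__(self, traj_dim, num_components=4, comp_dim=8):
        super().__init__()
        self.num_components = num_components
        self.comp_dim = comp_dim
        self.backbone = nn.Sequential(
            nn.Linear(traj_dim, 128), nn.ReLU(),
            nn.Linear(128, num_components * comp_dim),
        )

    def forward(self, traj):                                   # traj: [B, traj_dim]
        z = self.backbone(traj)
        return z.view(-1, self.num_components, self.comp_dim)  # [B, K, comp_dim]

def decomposed_infonce(z, z_pos):
    """Sum of per-component InfoNCE terms: contexts from matching trajectory
    segments are positives, other segments in the batch are negatives, so each
    component's MI with the trajectory is bounded separately rather than jointly."""
    B, K, _ = z.shape
    total = 0.0
    for k in range(K):
        logits = z[:, k] @ z_pos[:, k].t()                     # [B, B] similarities
        labels = torch.arange(B)
        total = total + F.cross_entropy(logits, labels)
    return total / K

class DynamicsHead(nn.Module):
    """Predicts the next state from (state, action, concatenated context)."""
    def __init__(self, state_dim, action_dim, num_components, comp_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + num_components * comp_dim, 128),
            nn.ReLU(),
            nn.Linear(128, state_dim),
        )

    def forward(self, s, a, z):
        return self.net(torch.cat([s, a, z.flatten(1)], dim=-1))

# One illustrative update: maximize the decomposed MI bound between contexts
# inferred from two segments of the same trajectory, and minimize the
# state-transition prediction error.
B, traj_dim, state_dim, action_dim = 32, 64, 17, 6
enc = ContextEncoder(traj_dim)
dyn = DynamicsHead(state_dim, action_dim, enc.num_components, enc.comp_dim)
opt = torch.optim.Adam(list(enc.parameters()) + list(dyn.parameters()), lr=3e-4)

seg_a, seg_b = torch.randn(B, traj_dim), torch.randn(B, traj_dim)   # placeholder data
s, a = torch.randn(B, state_dim), torch.randn(B, action_dim)
s_next = torch.randn(B, state_dim)

z_a, z_b = enc(seg_a), enc(seg_b)
loss = decomposed_infonce(z_a, z_b) + F.mse_loss(dyn(s, a, z_a), s_next)
opt.zero_grad()
loss.backward()
opt.step()
```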
Related papers
- Exploring Contextual Flux in Large Language Models: A Novel Approach to Self-Modulating Semantic Networks [0.0]
Self-modulating mechanisms introduce dynamic adaptation capabilities within language models.
Contextual realignment strategies influence token embedding trajectories across extended sequences.
Self-regulation enhances text generation consistency while preserving generative flexibility.
Findings suggest that while adaptive embedding updates improve certain aspects of coherence, their impact remains contingent on model capacity and input complexity.
arXiv Detail & Related papers (2025-02-16T01:08:19Z)
- Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining [55.262510814326035]
Existing reweighting strategies primarily focus on group-level data importance.
We introduce novel algorithms for dynamic, instance-level data reweighting.
Our framework allows us to devise reweighting strategies that deprioritize redundant or uninformative data, as in the sketch below.
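As a rough illustration of instance-level reweighting, here is a minimal sketch; the softmax-over-losses rule is an assumption for illustration, not the paper's algorithm.

```python
# Illustrative sketch of dynamic, instance-level reweighting: each sample is
# weighted by a softmax over its own (detached) loss, so low-loss, likely
# redundant examples contribute less to the update. Not the paper's method.
import torch
import torch.nn.functional as F

def reweighted_loss(logits, targets, temperature=1.0):
    per_sample = F.cross_entropy(logits, targets, reduction="none")    # [B]
    weights = torch.softmax(per_sample.detach() / temperature, dim=0)  # [B], sums to 1
    return (weights * per_sample).sum()

logits = torch.randn(8, 100, requires_grad=True)   # placeholder model outputs
targets = torch.randint(0, 100, (8,))
loss = reweighted_loss(logits, targets)
loss.backward()
```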
arXiv Detail & Related papers (2025-02-10T17:57:15Z)
- On-the-fly Modulation for Balanced Multimodal Learning [53.616094855778954]
Multimodal learning is expected to boost model performance by integrating information from different modalities.
However, the widely used joint training strategy leads to imbalanced and under-optimized uni-modal representations.
We propose On-the-fly Prediction Modulation (OPM) and On-the-fly Gradient Modulation (OGM) strategies to modulate the optimization of each modality.
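A minimal sketch of gradient modulation for a two-modality late-fusion model follows; the confidence measure and scaling rule are simplified assumptions, not the exact OPM/OGM procedure.

```python
# Illustrative sketch of on-the-fly gradient modulation: the gradient of the
# currently dominant modality branch is scaled down by a confidence ratio.
import torch
import torch.nn as nn
import torch.nn.functional as F

audio_branch = nn.Linear(32, 10)    # toy uni-modal branches for a 10-class task
video_branch = nn.Linear(64, 10)
params = list(audio_branch.parameters()) + list(video_branch.parameters())
opt = torch.optim.SGD(params, lr=0.1)

x_a, x_v = torch.randn(16, 32), torch.randn(16, 64)
y = torch.randint(0, 10, (16,))

logits_a, logits_v = audio_branch(x_a), video_branch(x_v)
loss = F.cross_entropy(logits_a + logits_v, y)      # late-fusion joint loss
opt.zero_grad()
loss.backward()

# Per-modality "confidence": mean probability assigned to the true class.
with torch.no_grad():
    idx = torch.arange(16)
    conf_a = F.softmax(logits_a, dim=1)[idx, y].mean()
    conf_v = F.softmax(logits_v, dim=1)[idx, y].mean()
    k_a = (conf_v / conf_a).clamp(max=1.0)          # < 1 when audio dominates
    k_v = (conf_a / conf_v).clamp(max=1.0)          # < 1 when video dominates
    for p in audio_branch.parameters():
        p.grad.mul_(k_a)
    for p in video_branch.parameters():
        p.grad.mul_(k_v)
opt.step()
```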
arXiv Detail & Related papers (2024-10-15T13:15:50Z)
- Text-centric Alignment for Multi-Modality Learning [3.6961400222746748]
We propose the Text-centric Alignment for Multi-Modality Learning (TAMML) approach.
By leveraging the unique properties of text as a unified semantic space, TAMML demonstrates significant improvements in handling unseen, diverse, and unpredictable modality combinations.
This study contributes to the field by offering a flexible, effective solution for real-world applications where modality availability is dynamic and uncertain.
arXiv Detail & Related papers (2024-02-12T22:07:43Z)
- Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning [54.67880602409801]
In this paper, we study the problem of pre-training world models with abundant in-the-wild videos for efficient learning of visual control tasks.
We introduce Contextualized World Models (ContextWM) that explicitly separate context and dynamics modeling.
Our experiments show that in-the-wild video pre-training equipped with ContextWM can significantly improve the sample efficiency of model-based reinforcement learning.
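A toy sketch of separating context and dynamics modeling in a world model: a context vector is extracted from a reference frame and conditions only the decoder, while the recurrent latent dynamics stay context-agnostic. The architecture and sizes are assumptions for illustration, not ContextWM itself.

```python
# Illustrative separation of a context branch from latent dynamics modelling.
import torch
import torch.nn as nn

class TinyWorldModel(nn.Module):
    def __init__(self, obs_dim=128, act_dim=4, latent_dim=32, ctx_dim=16):
        super().__init__()
        self.context_enc = nn.Linear(obs_dim, ctx_dim)         # context branch
        self.obs_enc = nn.Linear(obs_dim, latent_dim)          # dynamics branch
        self.rnn = nn.GRUCell(latent_dim + act_dim, latent_dim)
        self.decoder = nn.Linear(latent_dim + ctx_dim, obs_dim)

    def forward(self, first_obs, obs_seq, act_seq):
        ctx = self.context_enc(first_obs)                      # [B, ctx_dim]
        h = torch.zeros(obs_seq.size(1), self.rnn.hidden_size)
        recon = []
        for o, a in zip(obs_seq, act_seq):                     # time-major loop
            h = self.rnn(torch.cat([self.obs_enc(o), a], -1), h)
            recon.append(self.decoder(torch.cat([h, ctx], -1)))
        return torch.stack(recon)                              # [T, B, obs_dim]

wm = TinyWorldModel()
obs = torch.randn(10, 8, 128)                                  # T=10 steps, B=8
act = torch.randn(10, 8, 4)
recon = wm(obs[0], obs, act)
loss = ((recon - obs) ** 2).mean()                             # reconstruction loss
loss.backward()
```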
arXiv Detail & Related papers (2023-05-29T14:29:12Z)
- Meta-learning using privileged information for dynamics [66.32254395574994]
We extend the Neural ODE Process model to use additional information within the Learning Using Privileged Information setting.
We validate our extension with experiments showing improved accuracy and calibration on simulated dynamics tasks.
arXiv Detail & Related papers (2021-04-29T12:18:02Z)
- Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning [124.9856253431878]
We decompose the task of learning a global dynamics model into two stages: (a) learning a context latent vector that captures the local dynamics, then (b) predicting the next state conditioned on it.
In order to encode dynamics-specific information into the context latent vector, we introduce a novel loss function that encourages the context latent vector to be useful for predicting both forward and backward dynamics.
The proposed method achieves superior generalization ability across various simulated robotics and control tasks, compared to existing RL schemes.
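A minimal sketch of this two-stage idea: a context encoder over recent transitions, and forward/backward prediction heads conditioned on the resulting context so that the context must carry dynamics-specific information. Names and sizes are assumptions, not the paper's implementation.

```python
# Illustrative context encoder with forward and backward dynamics losses.
import torch
import torch.nn as nn
import torch.nn.functional as F

state_dim, action_dim, ctx_dim, K = 11, 3, 10, 5   # K past transitions

context_enc = nn.Sequential(
    nn.Linear(K * (2 * state_dim + action_dim), 64), nn.ReLU(),
    nn.Linear(64, ctx_dim),
)
forward_head = nn.Linear(state_dim + action_dim + ctx_dim, state_dim)   # predicts s_{t+1}
backward_head = nn.Linear(state_dim + action_dim + ctx_dim, state_dim)  # predicts s_t

B = 32
past = torch.randn(B, K * (2 * state_dim + action_dim))  # placeholder (s, a, s') history
s, a = torch.randn(B, state_dim), torch.randn(B, action_dim)
s_next = torch.randn(B, state_dim)

ctx = context_enc(past)
loss = (F.mse_loss(forward_head(torch.cat([s, a, ctx], -1)), s_next)
        + F.mse_loss(backward_head(torch.cat([s_next, a, ctx], -1)), s))
loss.backward()
```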
arXiv Detail & Related papers (2020-05-14T08:10:54Z)
- Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts [24.489002406693128]
We introduce a novel mixture-of-experts formulation for learning state-dependent beliefs over source task dynamics.
We show how this model can be incorporated into standard policy reuse frameworks.
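A toy sketch of a state-dependent mixture over source-task dynamics experts; the gating and expert forms are assumptions for illustration only.

```python
# Illustrative mixture-of-experts: each expert models one source task's
# dynamics, and a state-dependent gate outputs a belief used to mix them.
import torch
import torch.nn as nn
import torch.nn.functional as F

state_dim, action_dim, n_experts = 8, 2, 3

experts = nn.ModuleList(
    [nn.Linear(state_dim + action_dim, state_dim) for _ in range(n_experts)]
)
gate = nn.Linear(state_dim, n_experts)          # belief over source tasks, per state

def predict(s, a):
    belief = F.softmax(gate(s), dim=-1)                                       # [B, E]
    preds = torch.stack([e(torch.cat([s, a], -1)) for e in experts], dim=1)   # [B, E, S]
    return (belief.unsqueeze(-1) * preds).sum(dim=1), belief

s, a = torch.randn(16, state_dim), torch.randn(16, action_dim)
s_next = torch.randn(16, state_dim)
pred, belief = predict(s, a)
loss = F.mse_loss(pred, s_next)                 # fit the mixture to observed transitions
loss.backward()
```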
arXiv Detail & Related papers (2020-02-29T07:58:36Z)