DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
- URL: http://arxiv.org/abs/2405.14790v1
- Date: Thu, 23 May 2024 17:00:15 GMT
- Title: DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation
- Authors: Jinxin Liu, Xinghong Guo, Zifeng Zhuang, Donglin Wang
- Abstract summary: We propose a novel approach called DIffusion-guided DIversity (DIDI) for offline behavioral generation.
The goal of DIDI is to learn a diverse set of skills from a mixture of label-free offline data.
- Score: 25.904918670006587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel approach called DIffusion-guided DIversity (DIDI) for offline behavioral generation. The goal of DIDI is to learn a diverse set of skills from a mixture of label-free offline data. We achieve this by leveraging diffusion probabilistic models as priors to guide the learning process and regularize the policy. By optimizing a joint objective that incorporates diversity and diffusion-guided regularization, we encourage the emergence of diverse behaviors while maintaining the similarity to the offline data. Experimental results in four decision-making domains (Push, Kitchen, Humanoid, and D4RL tasks) show that DIDI is effective in discovering diverse and discriminative skills. We also introduce skill stitching and skill interpolation, which highlight the generalist nature of the learned skill space. Further, by incorporating an extrinsic reward function, DIDI enables reward-guided behavior generation, facilitating the learning of diverse and optimal behaviors from sub-optimal data.
Related papers
- Robust Policy Learning via Offline Skill Diffusion [6.876580618014666]
We present a novel offline skill learning framework, DuSkill.
DuSkill employs a guided Diffusion model to generate versatile skills extended from the limited skills in datasets.
We show that DuSkill outperforms other skill-based imitation learning and RL algorithms for several long-horizon tasks.
arXiv Detail & Related papers (2024-03-01T02:00:44Z) - Offline Diversity Maximization Under Imitation Constraints [23.761620064055897]
We propose a principled offline algorithm for unsupervised skill discovery.
Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery.
We demonstrate the effectiveness of our method on the standard offline benchmark D4RL.
arXiv Detail & Related papers (2023-07-21T06:12:39Z) - Generalizable Low-Resource Activity Recognition with Diverse and Discriminative Representation Learning [24.36351102003414]
Human activity recognition (HAR) is a time series classification task that focuses on identifying the motion patterns from human sensor readings.
We propose a novel approach called Diverse and Discriminative representation Learning (DDLearn) for generalizable low-resource HAR.
Our method significantly outperforms state-of-the-art methods by an average accuracy improvement of 9.5%.
arXiv Detail & Related papers (2023-05-25T08:24:22Z) - System Neural Diversity: Measuring Behavioral Heterogeneity in Multi-Agent Learning [8.280943341629161]
We introduce System Neural Diversity (SND): a measure of behavioral heterogeneity in multi-agent systems.
We show how SND allows us to measure latent resilience skills acquired by the agents, while other proxies, such as task performance (reward), fail.
We demonstrate how this paradigm can be used to bootstrap the exploration phase, finding optimal policies faster.
arXiv Detail & Related papers (2023-05-03T13:58:13Z) - Source-free Domain Adaptation Requires Penalized Diversity [60.04618512479438]
Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data.
In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor.
We propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors.
arXiv Detail & Related papers (2023-04-06T00:20:19Z) - Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation [51.21190751266442]
Domain adaptation (DA) tries to tackle the scenarios when the test data does not fully follow the same distribution of the training data.
By learning from large-scale unlabeled samples, self-supervised learning has now become a new trend in deep learning.
We propose a novel Self-Supervised Graph Neural Network (SSG) to enable more effective inter-task information exchange and knowledge sharing.
arXiv Detail & Related papers (2022-04-08T03:37:56Z) - Latent-Variable Advantage-Weighted Policy Optimization for Offline RL [70.01851346635637]
Offline reinforcement learning methods hold the promise of learning policies from pre-collected datasets without the need to query the environment for new transitions.
In practice, offline datasets are often heterogeneous, i.e., collected in a variety of scenarios.
We propose to leverage latent-variable policies that can represent a broader class of policy distributions.
Our method improves the average performance of the next best-performing offline reinforcement learning methods by 49% on heterogeneous datasets.
arXiv Detail & Related papers (2022-03-16T21:17:03Z) - Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z) - PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi \Phi$-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z) - Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
arXiv Detail & Related papers (2020-06-30T04:39:36Z) - Effective Diversity in Population Based Reinforcement Learning [38.62641968788987]
We introduce an approach to optimize all members of a population simultaneously.
Rather than using pairwise distance, we measure the volume of the entire population in a behavioral manifold.
Our algorithm Diversity via Determinants (DvD) adapts the degree of diversity during training using online learning techniques.
arXiv Detail & Related papers (2020-02-03T10:09:16Z)