Meta Generative Flow Networks with Personalization for Task-Specific Adaptation
- URL: http://arxiv.org/abs/2306.09742v1
- Date: Fri, 16 Jun 2023 10:18:38 GMT
- Title: Meta Generative Flow Networks with Personalization for Task-Specific Adaptation
- Authors: Xinyuan Ji, Xu Zhang, Wei Xi, Haozhi Wang, Olga Gadyatskaya, Yinchuan Li
- Abstract summary: Multi-task reinforcement learning and meta-reinforcement learning tend to focus on tasks with higher rewards and more frequent occurrences.
GFlowNets can be integrated into meta-learning algorithms (GFlowMeta) by leveraging the advantages of GFlowNets on tasks with sparse rewards.
This paper proposes a personalized approach named pGFlowMeta, which combines task-specific personalized policies with a meta policy.
- Score: 8.830531142309733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task reinforcement learning and meta-reinforcement learning have been
developed to quickly adapt to new tasks, but they tend to focus on tasks with
higher rewards and more frequent occurrences, leading to poor performance on
tasks with sparse rewards. To address this issue, GFlowNets can be integrated
into meta-learning algorithms (GFlowMeta) by leveraging the advantages of
GFlowNets on tasks with sparse rewards. However, GFlowMeta suffers from
performance degradation when encountering heterogeneous transitions from
distinct tasks. To overcome this challenge, this paper proposes a personalized
approach named pGFlowMeta, which combines task-specific personalized policies
with a meta policy. Each personalized policy balances the loss on its
personalized task and the difference from the meta policy, while the meta
policy aims to minimize the average loss of all tasks. The theoretical analysis
shows that the algorithm converges at a sublinear rate. Extensive experiments
demonstrate that the proposed algorithm outperforms state-of-the-art
reinforcement learning algorithms in discrete environments.
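The abstract describes the two coupled objectives only at a high level: each personalized policy trades off its task loss against distance from the meta policy, and the meta policy minimizes the average loss. A minimal PyTorch-style sketch of that structure follows, assuming an L2 proximal term with weight lam and plain gradient steps; all names and the exact penalty form are illustrative, not the paper's formulation.

```python
import torch

def pgflowmeta_step(meta_policy, personal_policies, tasks, task_loss,
                    lam=0.1, lr=1e-3):
    """One outer iteration in the spirit of pGFlowMeta (sketch only).

    `task_loss(policy, task)` is assumed to return a scalar GFlowNet
    training loss (e.g. trajectory balance) for one task.
    """
    meta_params = [p.detach().clone() for p in meta_policy.parameters()]

    # Personalized updates: balance the task loss against an L2 penalty
    # that keeps each personalized policy close to the meta policy.
    for policy, task in zip(personal_policies, tasks):
        prox = sum(((p - m) ** 2).sum()
                   for p, m in zip(policy.parameters(), meta_params))
        total = task_loss(policy, task) + 0.5 * lam * prox
        total.backward()
        with torch.no_grad():
            for p in policy.parameters():
                if p.grad is not None:
                    p -= lr * p.grad
                    p.grad = None

    # Meta update: take a gradient step on the average loss of all tasks.
    avg = sum(task_loss(meta_policy, t) for t in tasks) / len(tasks)
    avg.backward()
    with torch.no_grad():
        for p in meta_policy.parameters():
            if p.grad is not None:
                p -= lr * p.grad
                p.grad = None
```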
Related papers
- Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization [4.158255103170876]
GFlowNets are a family of generative models that learn to sample objects proportional to a given reward function.
Recent results show a close relationship between GFlowNet training and entropy-regularized reinforcement learning problems.
We introduce a simple backward policy optimization algorithm that involves direct maximization of the value function in an entropy-regularized Markov Decision Process.
arXiv Detail & Related papers (2024-10-20T19:12:14Z)
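A rough sketch of the trajectory-likelihood idea named in the title above: fit the backward policy by maximizing the likelihood of trajectories sampled from the current forward policy. The `log_prob` interface is hypothetical.

```python
def backward_policy_nll(p_b, trajectories):
    """Negative log-likelihood of forward-sampled trajectories under the
    backward policy p_b.  Hypothetical interface: p_b.log_prob(prev, nxt)
    returns log P_B(prev | nxt).  Minimizing this fits P_B to the
    trajectory distribution induced by the forward policy."""
    nll = 0.0
    for traj in trajectories:                      # traj = [s_0, ..., s_T]
        for prev, nxt in zip(traj[:-1], traj[1:]):
            nll = nll - p_b.log_prob(prev, nxt)
    return nll / len(trajectories)
```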
- Distributional GFlowNets with Quantile Flows [73.73721901056662]
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers in which an agent learns a policy for generating complex structures through a series of decision-making steps.
In this work, we adopt a distributional paradigm for GFlowNets, turning each flow function into a distribution, thus providing more informative learning signals during training.
Our proposed quantile matching GFlowNet learning algorithm is able to learn a risk-sensitive policy, an essential component for handling scenarios with risk uncertainty.
arXiv Detail & Related papers (2023-02-11T22:06:17Z)
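The summary above does not spell out the quantile-matching objective; the standard pinball (quantile regression) loss below is a plausible stand-in for learning quantiles of a flow value, under that assumption.

```python
import torch

def quantile_matching_loss(pred_quantiles, target, taus):
    """Pinball (quantile regression) loss, the standard objective for
    learning quantiles of a distribution; used here as an assumed
    stand-in for quantile matching on flow values.

    pred_quantiles: (batch, n_quantiles) predicted flow quantiles
    target:         (batch, 1) sampled target values
    taus:           (n_quantiles,) quantile levels in (0, 1)
    """
    diff = target - pred_quantiles                  # (batch, n_quantiles)
    return torch.where(diff >= 0, taus * diff, (taus - 1) * diff).mean()
```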
- On the Convergence Theory of Meta Reinforcement Learning with Personalized Policies [26.225293232912716]
This paper proposes a novel personalized Meta-RL (pMeta-RL) algorithm.
It aggregates task-specific personalized policies to update a meta-policy used for all tasks, while maintaining personalized policies to maximize the average return of each task.
Experimental results show that the proposed algorithms outperform previous Meta-RL algorithms on the Gym and MuJoCo suites.
arXiv Detail & Related papers (2022-09-21T02:27:56Z)
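The aggregation step described above might look like the following parameter average; the uniform weighting is an assumption.

```python
def aggregate(personalized_params, weights=None):
    """Hypothetical meta-policy aggregation: set the meta parameters to a
    (weighted) average of the task-specific personalized parameters.
    `personalized_params` is a list of per-task parameter lists."""
    n = len(personalized_params)
    if weights is None:
        weights = [1.0 / n] * n
    n_layers = len(personalized_params[0])
    return [sum(w * params[i] for w, params in zip(weights, personalized_params))
            for i in range(n_layers)]
```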
- Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
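One way such an action-translator objective could be instantiated, as a guess consistent with the summary rather than the paper's actual objective: translated actions should reproduce the source task's transitions under the target dynamics. `h`, `f_src`, and `f_tgt` are hypothetical learned models.

```python
# Hypothetical training signal for an action translator h: taking the
# translated action in the target task should reproduce the transition
# the original action causes in the source task (f_src, f_tgt are
# assumed learned dynamics models; all names are illustrative).
def translator_loss(h, f_src, f_tgt, states, actions):
    translated = h(states, actions)
    return ((f_tgt(states, translated) - f_src(states, actions)) ** 2).mean()
```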
- Learning to generate imaginary tasks for improving generalization in meta-learning [12.635773307074022]
The success of meta-learning on existing benchmarks is predicated on the assumption that the distribution of meta-training tasks covers meta-testing tasks.
Recent solutions have pursued augmentation of meta-training tasks, but generating tasks that are both correct and sufficiently "imaginary" remains an open question.
In this paper, we seek an approach that up-samples meta-training tasks from the task representation via a task up-sampling network. The resulting approach, named Adversarial Task Up-sampling (ATU), suffices to generate tasks that can maximally contribute to the latest meta-learner by maximizing an adversarial loss.
arXiv Detail & Related papers (2022-06-09T08:21:05Z)
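A heavily simplified sketch of the adversarial up-sampling loop suggested by the summary above; `up_sampler`, `meta_learner.loss`, and the noise input are all hypothetical stand-ins.

```python
# Illustrative ATU-style step (all names hypothetical): a task
# up-sampling network proposes task representations from noise, trained
# adversarially to maximize the current meta-learner's loss.
def atu_step(up_sampler, meta_learner, noise, optimizer):
    tasks = up_sampler(noise)             # synthesized task representations
    loss = -meta_learner.loss(tasks)      # ascend the meta-learner's loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```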
- Set-based Meta-Interpolation for Few-Task Meta-Learning [79.4236527774689]
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning across various domains.
arXiv Detail & Related papers (2022-05-20T06:53:03Z)
- Meta-learning with an Adaptive Task Scheduler [93.63502984214918]
Existing meta-learning algorithms randomly sample meta-training tasks with a uniform probability.
Given a limited number of meta-training tasks, some sampled tasks are likely to be detrimental, e.g., noisy or imbalanced.
We propose an adaptive task scheduler (ATS) for the meta-training process.
arXiv Detail & Related papers (2021-10-26T22:16:35Z)
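The core scheduling idea above, replacing uniform task sampling with a learned distribution over the task pool, can be sketched as follows; the scheduler network producing the logits is hypothetical.

```python
import torch

def sample_meta_batch(scheduler_logits, k):
    """Draw k meta-training tasks from a learned categorical distribution
    instead of uniformly (sketch of the adaptive-scheduler idea)."""
    probs = torch.softmax(scheduler_logits, dim=0)
    return torch.multinomial(probs, num_samples=k, replacement=False)
```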
- Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
Our approach, Meta-Learning with Task Interpolation (MLTI), effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels.
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
arXiv Detail & Related papers (2021-06-04T20:15:34Z)
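MLTI's pair-wise interpolation is essentially mixup applied across tasks; a minimal sketch, with the Beta(alpha, alpha) mixing coefficient assumed from the usual mixup convention.

```python
import numpy as np

def interpolate_task(x_i, y_i, x_j, y_j, alpha=0.5):
    """Synthesize a new task by blending the features and labels of two
    sampled tasks (mixup-style, in the spirit of MLTI)."""
    lam = np.random.beta(alpha, alpha)
    x_new = lam * x_i + (1.0 - lam) * x_j
    y_new = lam * y_i + (1.0 - lam) * y_j
    return x_new, y_new
```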
- Improving Generalization in Meta-learning via Task Augmentation [69.83677015207527]
We propose two task augmentation methods, including MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
arXiv Detail & Related papers (2020-07-26T01:50:42Z)
- Curriculum in Gradient-Based Meta-Reinforcement Learning [10.447238563837173]
We show that gradient-based meta-learners are sensitive to task distributions.
With the wrong curriculum, agents suffer the effects of meta-overfitting, shallow adaptation, and adaptation instability.
arXiv Detail & Related papers (2020-02-19T01:40:45Z)