Self-Sustaining Multiple Access with Continual Deep Reinforcement
Learning for Dynamic Metaverse Applications
- URL: http://arxiv.org/abs/2309.10177v1
- Date: Mon, 18 Sep 2023 22:02:47 GMT
- Title: Self-Sustaining Multiple Access with Continual Deep Reinforcement
Learning for Dynamic Metaverse Applications
- Authors: Hamidreza Mazandarani, Masoud Shokrnezhad, Tarik Taleb, and Richard Li
- Abstract summary: The Metaverse is a new paradigm that aims to create a virtual environment consisting of numerous worlds, each of which will offer a different set of services.
To deal with such a dynamic and complex scenario, one potential approach is to adopt self-sustaining strategies.
This paper investigates the problem of multiple access in multi-channel environments to maximize the throughput of the intelligent agent.
- Score: 17.436875530809946
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Metaverse is a new paradigm that aims to create a virtual environment
consisting of numerous worlds, each of which will offer a different set of
services. To deal with such a dynamic and complex scenario, considering the
stringent quality of service requirements aimed at the 6th generation of
communication systems (6G), one potential approach is to adopt self-sustaining
strategies, which can be realized by employing Adaptive Artificial Intelligence
(Adaptive AI) where models are continually re-trained with new data and
conditions. One aspect of self-sustainability is the management of multiple
access to the frequency spectrum. Although several innovative methods have been
proposed to address this challenge, mostly using Deep Reinforcement Learning
(DRL), the problem of adapting agents to a non-stationary environment has not
yet been precisely addressed. This paper fills in the gap in the current
literature by investigating the problem of multiple access in multi-channel
environments to maximize the throughput of the intelligent agent when the
number of active User Equipments (UEs) may fluctuate over time. To solve the
problem, a Double Deep Q-Learning (DDQL) technique empowered by Continual
Learning (CL) is proposed to overcome the non-stationary situation, while the
environment is unknown. Numerical simulations demonstrate that, compared to
other well-known methods, the CL-DDQL algorithm achieves significantly higher
throughputs with a considerably shorter convergence time in highly dynamic
scenarios.
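As a rough illustration of the double Q-learning mechanics behind DDQL, the sketch below shows only the core update rule (online table selects the action, target table evaluates it) in a tabular form. The deep networks, the continual-learning component, and the channel/UE dynamics from the paper are out of scope here; the state and reward model below is an invented stand-in, not the paper's environment.

```python
import numpy as np

rng = np.random.default_rng(0)

n_channels = 3   # actions: which channel to transmit on
n_states = 4     # coarse environment state, e.g. recent ACK/collision history (assumption)
gamma, lr, eps = 0.9, 0.1, 0.2

# Double Q-learning keeps two value estimates to reduce maximization bias;
# the paper uses deep networks (DDQL) plus continual learning, so this
# tabular version is only a skeleton of the update rule.
q_online = np.zeros((n_states, n_channels))
q_target = np.zeros((n_states, n_channels))

def ddql_update(s, a, r, s_next):
    """Double-Q target: the online table selects the next action,
    the target table evaluates it."""
    a_star = int(np.argmax(q_online[s_next]))
    td_target = r + gamma * q_target[s_next, a_star]
    q_online[s, a] += lr * (td_target - q_online[s, a])

for step in range(500):
    s = int(rng.integers(n_states))
    # epsilon-greedy channel choice
    a = int(np.argmax(q_online[s])) if rng.random() > eps else int(rng.integers(n_channels))
    # stand-in reward: 1 if the chosen channel happens to be idle (not the paper's model)
    r = 1.0 if a == s % n_channels else 0.0
    s_next = int(rng.integers(n_states))
    ddql_update(s, a, r, s_next)
    if step % 50 == 0:
        q_target[:] = q_online   # periodic hard sync of the target table
```

In the paper's setting, continual learning would additionally re-train the online network as the number of active UEs drifts, so that the policy tracks the non-stationary environment rather than converging once and going stale.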
Related papers
- A Semantic-Aware Multiple Access Scheme for Distributed, Dynamic 6G-Based Applications [14.51946231794179]
This paper introduces a novel formulation for the problem of multiple access to the wireless spectrum.
It aims to optimize the utilization-fairness trade-off, using the $\alpha$-fairness metric.
A Semantic-Aware Multi-Agent Double and Dueling Deep Q-Learning (SAMA-D3QL) technique is proposed.
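The $\alpha$-fairness metric mentioned above has a standard closed form: $U_\alpha(x) = x^{1-\alpha}/(1-\alpha)$ for $\alpha \neq 1$ and $U_1(x) = \log x$, summed over per-user rates. A minimal sketch (the function name and rate values are illustrative, not taken from the paper):

```python
import math

def alpha_fairness(rates, alpha):
    """Aggregate alpha-fair utility over per-user rates (all rates > 0).
    alpha=0 recovers sum-throughput, alpha=1 proportional fairness,
    and alpha -> infinity approaches max-min fairness."""
    if alpha == 1.0:
        return sum(math.log(x) for x in rates)
    return sum(x ** (1.0 - alpha) / (1.0 - alpha) for x in rates)

# Under proportional fairness (alpha=1), an equal split beats a skewed one
# with the same total rate:
equal = alpha_fairness([1.0, 1.0], alpha=1.0)
skewed = alpha_fairness([1.9, 0.1], alpha=1.0)
```

Tuning $\alpha$ is what lets a scheme like the one summarized above trade spectrum utilization against fairness with a single knob.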
arXiv Detail & Related papers (2024-01-12T00:32:38Z)
- Constrained Meta-Reinforcement Learning for Adaptable Safety Guarantee with Differentiable Convex Programming [4.825619788907192]
This paper studies the unique challenges of ensuring safety in non-stationary environments by solving constrained problems through the lens of the meta-learning approach (learning-to-learn).
We first employ successive convex-constrained policy updates across multiple tasks with differentiable convex programming, which allows meta-learning in constrained scenarios by enabling end-to-end differentiation.
arXiv Detail & Related papers (2023-09-04T17:30:21Z)
- Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with Online Learning [60.17407932691429]
Open Radio Access Network (O-RAN) systems, with their virtualized base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability.
We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging" environments.
We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z)
- Addressing the issue of stochastic environments and local decision-making in multi-objective reinforcement learning [0.0]
Multi-objective reinforcement learning (MORL) is a relatively new field which builds on conventional Reinforcement Learning (RL).
This thesis focuses on what factors influence the frequency with which value-based MORL Q-learning algorithms learn the optimal policy for an environment.
arXiv Detail & Related papers (2022-11-16T04:56:42Z)
- Learning to Walk Autonomously via Reset-Free Quality-Diversity [73.08073762433376]
Quality-Diversity algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills.
Existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions.
This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments.
arXiv Detail & Related papers (2022-04-07T14:07:51Z)
- IQ-Learn: Inverse soft-Q Learning for Imitation [95.06031307730245]
Imitation learning from a small amount of expert data can be challenging in high-dimensional environments with complex dynamics.
Behavioral cloning is a simple method that is widely used due to its simplicity of implementation and stable convergence.
We introduce a method for dynamics-aware IL which avoids adversarial training by learning a single Q-function.
arXiv Detail & Related papers (2021-06-23T03:43:10Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Learning to Continuously Optimize Wireless Resource In Episodically Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment.
We propose to build the notion of continual learning into the modeling process of learning wireless systems.
Our design is based on a novel min-max formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2020-11-16T08:24:34Z)
- Simultaneously Evolving Deep Reinforcement Learning Models using Multifactorial Optimization [18.703421169342796]
This work proposes a framework capable of simultaneously evolving several DQL models towards solving interrelated Reinforcement Learning tasks.
A thorough experimentation is presented and discussed so as to assess the performance of the framework.
arXiv Detail & Related papers (2020-02-25T10:36:57Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.