Self-evolving Autoencoder Embedded Q-Network
- URL: http://arxiv.org/abs/2402.11604v1
- Date: Sun, 18 Feb 2024 14:42:47 GMT
- Title: Self-evolving Autoencoder Embedded Q-Network
- Authors: J. Senthilnath, Bangjian Zhou, Zhen Wei Ng, Deeksha Aggarwal, Rajdeep
Dutta, Ji Wei Yoon, Aye Phyu Phyu Aung, Keyu Wu, Min Wu, Xiaoli Li
- Abstract summary: We propose SAQN, a self-evolving autoencoder embedded with a Q-Network.
In SAQN, the autoencoder architecture adapts and evolves as the agent explores the environment.
We show that the proposed SAQN significantly outperforms state-of-the-art counterparts.
- Score: 9.414875682358085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the realm of sequential decision-making tasks, the exploration capability
of a reinforcement learning (RL) agent is paramount for achieving high rewards
through interactions with the environment. To enhance this crucial ability, we
propose SAQN, a novel approach wherein a self-evolving autoencoder (SA) is
embedded with a Q-Network (QN). In SAQN, the self-evolving autoencoder
architecture adapts and evolves as the agent explores the environment. This
evolution enables the autoencoder to capture a diverse range of raw
observations and represent them effectively in its latent space. By leveraging
the disentangled states extracted from the encoder-generated latent space, the
QN is trained to determine optimal actions that improve rewards. During the
evolution of the autoencoder architecture, a bias-variance regulatory strategy
is employed to elicit the optimal response from the RL agent. This strategy
involves two key components: (i) fostering the growth of nodes to retain
previously acquired knowledge, ensuring a rich representation of the
environment, and (ii) pruning the least contributing nodes to maintain a more
manageable and tractable latent space. Extensive experimental evaluations
conducted on three distinct benchmark environments and a real-world molecular
environment demonstrate that the proposed SAQN significantly outperforms
state-of-the-art counterparts. The results highlight the effectiveness of the
self-evolving autoencoder and its collaboration with the Q-Network in tackling
sequential decision-making tasks.
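To make the growth-and-pruning idea concrete, below is a minimal, self-contained sketch of an autoencoder whose latent layer can grow new nodes (retaining previously learned weights) or prune weak ones, with a Q-head reading the latent state. This is an illustration assuming plain PyTorch linear layers; the authors' actual architecture, growth/pruning criteria, and thresholds are not given in the abstract, and all names here are placeholders.

```python
# Illustrative sketch of the SAQN idea (NOT the authors' implementation):
# an autoencoder with a growable/prunable latent layer and a Q-head.
import torch
import torch.nn as nn

class EvolvingAutoencoderQ(nn.Module):
    """Autoencoder with a growable/prunable latent layer plus a Q-head."""

    def __init__(self, obs_dim: int, latent_dim: int, n_actions: int):
        super().__init__()
        self.enc = nn.Linear(obs_dim, latent_dim)
        self.dec = nn.Linear(latent_dim, obs_dim)
        self.q = nn.Linear(latent_dim, n_actions)

    def forward(self, obs):
        z = torch.tanh(self.enc(obs))      # latent state fed to the Q-head
        return z, self.dec(z), self.q(z)   # latent, reconstruction, Q-values

    @torch.no_grad()
    def grow_latent(self, n_new: int = 1):
        """Add latent nodes while retaining previously acquired weights."""
        old = self.enc.out_features
        enc = nn.Linear(self.enc.in_features, old + n_new)
        enc.weight[:old] = self.enc.weight
        enc.bias[:old] = self.enc.bias
        self.enc = enc
        for name in ("dec", "q"):   # widen consumers; new inputs start at 0
            layer = getattr(self, name)
            wide = nn.Linear(old + n_new, layer.out_features)
            wide.weight[:, :old] = layer.weight
            wide.weight[:, old:] = 0.0
            wide.bias.copy_(layer.bias)
            setattr(self, name, wide)

    @torch.no_grad()
    def prune_latent(self, keep: torch.Tensor):
        """Keep only the latent nodes indexed by `keep`; the contribution
        score used to pick them is left unspecified in this sketch."""
        enc = nn.Linear(self.enc.in_features, keep.numel())
        enc.weight.copy_(self.enc.weight[keep])
        enc.bias.copy_(self.enc.bias[keep])
        self.enc = enc
        for name in ("dec", "q"):
            layer = getattr(self, name)
            slim = nn.Linear(keep.numel(), layer.out_features)
            slim.weight.copy_(layer.weight[:, keep])
            slim.bias.copy_(layer.bias)
            setattr(self, name, slim)

net = EvolvingAutoencoderQ(obs_dim=8, latent_dim=4, n_actions=3)
net.grow_latent(2)                          # strategy step (i): grow nodes
net.prune_latent(torch.tensor([0, 2, 4]))   # step (ii): prune weak nodes
```

Device placement and optimizer re-initialization after each architectural change are omitted for brevity; in practice the optimizer state must be rebuilt whenever parameter tensors are replaced.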
Related papers
- Multi-agent Reinforcement Learning with Deep Networks for Diverse Q-Vectors [3.9801926395657325]
This paper proposes a deep Q-network (DQN) algorithm capable of learning various Q-vectors using Max, Nash, and Maximin strategies.
The effectiveness of this approach is demonstrated in an environment where dual robotic arms collaborate to lift a pot.
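The summary names three selection strategies (Max, Nash, Maximin) without spelling out their update rules. As a generic illustration only, the sketch below shows Max and Maximin joint-action selection over a hypothetical two-agent Q-matrix; Nash selection would additionally require an equilibrium solver, and none of the names here come from the paper.

```python
# Generic strategy-dependent action selection over a joint Q-matrix
# Q[a1, a2] for two agents; hypothetical, not the paper's algorithm.
import numpy as np

def select_max(q: np.ndarray) -> tuple:
    """Max strategy: jointly greedy action pair over Q[a1, a2]."""
    i, j = np.unravel_index(np.argmax(q), q.shape)
    return int(i), int(j)

def select_maximin(q: np.ndarray) -> int:
    """Maximin strategy for the row agent: maximize the worst-case
    value over the other agent's possible responses."""
    return int(np.argmax(q.min(axis=1)))

q = np.array([[3.0, 0.0],
              [2.0, 2.0]])
print(select_max(q))      # (0, 0): joint greedy optimum
print(select_maximin(q))  # 1: row 1 guarantees a value of at least 2.0
```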
arXiv Detail & Related papers (2024-06-12T03:30:10Z)
- Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
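The exact ETPO objective is not reproduced in this summary; the sketch below only illustrates the general shape of an entropy-augmented, per-token policy-gradient loss, with all tensor names and the advantage estimate assumed for illustration.

```python
# Generic entropy-regularized token-level policy-gradient loss; ETPO's
# actual objective may differ. All names and shapes are assumptions.
import torch
import torch.nn.functional as F

def token_level_loss(logits, actions, advantages, beta: float = 0.01):
    """logits: (B, T, V) LM outputs; actions: (B, T) sampled token ids;
    advantages: (B, T) per-token credit estimates."""
    logp = F.log_softmax(logits, dim=-1)
    logp_act = logp.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    entropy = -(logp.exp() * logp).sum(dim=-1)
    # maximize advantage-weighted log-prob plus a per-token entropy bonus
    return -(advantages * logp_act + beta * entropy).mean()
```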
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
- Self Expanding Convolutional Neural Networks [1.4330085996657045]
We present a novel method for dynamically expanding Convolutional Neural Networks (CNNs) during training.
We employ a strategy where a single model is dynamically expanded, facilitating the extraction of checkpoints at various complexity levels.
arXiv Detail & Related papers (2024-01-11T06:22:40Z)
- Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets).
Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
arXiv Detail & Related papers (2023-11-05T12:03:58Z)
- Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z)
- Graded-Q Reinforcement Learning with Information-Enhanced State Encoder for Hierarchical Collaborative Multi-Vehicle Pursuit [11.195170949292496]
The multi-vehicle pursuit (MVP) is becoming a hot research topic in Intelligent Transportation Systems (ITS).
This paper proposes a graded-Q reinforcement learning with information-enhanced state encoder (GQRL-IESE) framework to address this hierarchical collaborative pursuit problem.
In the GQRL-IESE, a cooperative graded Q scheme is proposed to facilitate the decision-making of pursuing vehicles to improve pursuing efficiency.
arXiv Detail & Related papers (2022-10-24T16:35:34Z)
- Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-intelligent-device edge artificial intelligence (AI) system, which jointly exploits AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
- Learning to Walk Autonomously via Reset-Free Quality-Diversity [73.08073762433376]
Quality-Diversity (QD) algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills.
Existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions.
This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments.
arXiv Detail & Related papers (2022-04-07T14:07:51Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.