Deep W-Networks: Solving Multi-Objective Optimisation Problems With Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2211.04813v1
- Date: Wed, 9 Nov 2022 11:22:02 GMT
- Title: Deep W-Networks: Solving Multi-Objective Optimisation Problems With Deep
Reinforcement Learning
- Authors: Jernej Hribar and Luke Hackett and Ivana Dusparic
- Abstract summary: We build on advances introduced by the Deep Q-Networks (DQN) approach to extend the W-learning algorithm to large state spaces.
We evaluate the resulting Deep W-Networks (DWN) approach in two widely-accepted multi-objective RL benchmarks: deep sea treasure and multi-objective mountain car.
- Score: 2.65558931169264
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we build on advances introduced by the Deep Q-Networks (DQN)
approach to extend the multi-objective tabular Reinforcement Learning (RL)
algorithm W-learning to large state spaces. The W-learning algorithm naturally
resolves the competition between multiple single-objective policies in
multi-objective environments. However, its tabular version does not scale well
to environments with large state spaces. To address this issue, we replace the
underlying Q-tables with DQNs, and propose the addition of W-Networks as a
replacement for the tabular weight (W) representations. We evaluate the resulting Deep W-Networks (DWN)
approach in two widely-accepted multi-objective RL benchmarks: deep sea
treasure and multi-objective mountain car. We show that DWN solves the
competition between multiple policies while outperforming the baseline in the
form of a DQN solution. Additionally, we demonstrate that the proposed
algorithm can find the Pareto front in both tested environments.
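The arbitration step the abstract describes (per-objective Q-networks nominating actions, with W-values deciding which objective wins the state) can be sketched minimally. This is an illustrative numpy sketch, not the authors' implementation; the function name, shapes, and values below are assumptions, with network outputs stubbed as plain arrays:

```python
import numpy as np

def dwn_select_action(q_values, w_values):
    """Each policy nominates its greedy action; the policy with the
    highest W-value (strongest claim on this state) wins the competition."""
    # q_values[i]: policy i's Q-value per action (as a DQN head would output).
    # w_values[i]: policy i's scalar W-value for the current state
    # (as a W-Network head would output).
    nominations = [int(np.argmax(q)) for q in q_values]
    winner = int(np.argmax(w_values))
    return nominations[winner], winner

# Two competing objectives, three actions.
q_values = [np.array([0.1, 0.9, 0.2]),   # objective 0 prefers action 1
            np.array([0.8, 0.0, 0.3])]   # objective 1 prefers action 0
w_values = np.array([0.4, 0.7])          # objective 1 values this state more

action, winner = dwn_select_action(q_values, w_values)
print(action, winner)  # 0 1
```

In the tabular original, `q_values` and `w_values` come from per-objective Q- and W-tables; the paper's contribution is producing them from networks so the same arbitration works in large state spaces.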
Related papers
- Weakly Coupled Deep Q-Networks [5.76924666595801]
We propose a novel deep reinforcement learning algorithm that enhances performance in weakly coupled Markov decision processes (WCMDPs).
WCDQN employs a single network to train multiple DQN "subagents", one for each subproblem, and then combines their solutions to establish an upper bound on the optimal action value.
arXiv Detail & Related papers (2023-10-28T20:07:57Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Hierarchical Multi-Marginal Optimal Transport for Network Alignment [52.206006379563306]
Multi-network alignment is an essential prerequisite for joint learning on multiple networks.
We propose a hierarchical multi-marginal optimal transport framework named HOT for multi-network alignment.
Our proposed HOT achieves significant improvements over the state-of-the-art in both effectiveness and scalability.
arXiv Detail & Related papers (2023-10-06T02:35:35Z) - Multi Agent DeepRL based Joint Power and Subchannel Allocation in IAB
networks [0.0]
Integrated Access and Backhauling (IAB) is a viable approach for meeting the unprecedented need for higher data rates in future generations of networks.
In this paper, we show how we can use Deep Q-Learning Network to handle problems with huge action spaces associated with fractional nodes.
arXiv Detail & Related papers (2023-08-31T21:30:25Z) - Vector Quantized Wasserstein Auto-Encoder [57.29764749855623]
We study learning deep discrete representations from the generative viewpoint.
We endow discrete distributions over sequences of codewords and learn a deterministic decoder that transports the distribution over the sequences of codewords to the data distribution.
We develop further theories to connect it with the clustering viewpoint of WS distance, allowing us to have a better and more controllable clustering solution.
arXiv Detail & Related papers (2023-02-12T13:51:36Z) - Pareto Conditioned Networks [1.7188280334580197]
We propose a method that uses a single neural network to encompass all non-dominated policies.
PCN associates every past transition with its episode's return and trains the network such that, when conditioned on this same return, it should reenact said transition.
Our method is stable as it learns in a supervised fashion, thus avoiding moving target issues.
arXiv Detail & Related papers (2022-04-11T12:09:51Z) - Deep Policy Dynamic Programming for Vehicle Routing Problems [89.96386273895985]
We propose Deep Policy Dynamic Programming (DPDP) to combine the strengths of learned neural heuristics with those of dynamic programming algorithms.
DPDP prioritizes and restricts the DP state space using a policy derived from a deep neural network, which is trained to predict edges from example solutions.
We evaluate our framework on the travelling salesman problem (TSP) and the vehicle routing problem (VRP) and show that the neural policy improves the performance of (restricted) DP algorithms.
arXiv Detail & Related papers (2021-02-23T15:33:57Z) - DeepSlicing: Deep Reinforcement Learning Assisted Resource Allocation
for Network Slicing [20.723527476555574]
Network slicing enables multiple virtual networks to run on the same physical infrastructure to support various use cases in 5G and beyond.
These use cases have very diverse network resource demands, e.g., communication and computation, and various performance metrics such as latency and throughput.
We propose DeepSlicing, which integrates the alternating direction method of multipliers (ADMM) and deep reinforcement learning (DRL).
arXiv Detail & Related papers (2020-08-17T20:52:19Z) - Recursive Multi-model Complementary Deep Fusion forRobust Salient Object
Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a "wider" network architecture which consists of parallel sub-networks with totally different network architectures.
Experiments on several famous benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z) - SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep
Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
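SUNRISE's ingredient (b), selecting actions by upper-confidence bounds over a Q-ensemble, can be sketched minimally. This is an illustrative numpy sketch under assumptions, not the SUNRISE code; the ensemble values and the λ coefficient below are made up for the example:

```python
import numpy as np

def ucb_action(q_ensemble, lam=1.0):
    """Pick the action maximizing mean Q plus lam times ensemble std.

    q_ensemble: array of shape (n_members, n_actions) holding each
    ensemble member's Q-values for the current state. High std means
    the members disagree, so the bonus steers exploration there."""
    mean = q_ensemble.mean(axis=0)
    std = q_ensemble.std(axis=0)
    return int(np.argmax(mean + lam * std))

# Two ensemble members, two actions: both agree action 0 is worth 1.0,
# but disagree on action 1, so its UCB (1.0 + 0.5) wins.
q_ensemble = np.array([[1.0, 0.5],
                       [1.0, 1.5]])
print(ucb_action(q_ensemble, lam=1.0))  # 1
```

Ingredient (a), the weighted Bellman backup, similarly uses the ensemble's per-transition std, but to down-weight uncertain targets during training rather than to pick actions.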
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.