Related papers: Coverage Analysis for Digital Cousin Selection -- Improving Multi-Environment Q-Learning

URL: http://arxiv.org/abs/2411.08360v1
Date: Wed, 13 Nov 2024 06:16:12 GMT
Title: Coverage Analysis for Digital Cousin Selection -- Improving Multi-Environment Q-Learning
Authors: Talha Bozkus, Tara Javidi, Urbashi Mitra
Abstract summary: Recent advancements include multi-environment mixed Q-learning (MEMQ) algorithms. MEMQ algorithms outperform several state-of-the-art Q-learning algorithms in terms of accuracy, complexity, and robustness. We present a novel CC-based MEMQ algorithm to improve the accuracy and complexity of existing MEMQ algorithms.
Score: 24.212773534280387
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Q-learning is widely employed for optimizing various large-dimensional networks with unknown system dynamics. Recent advancements include multi-environment mixed Q-learning (MEMQ) algorithms, which utilize multiple independent Q-learning algorithms across multiple, structurally related but distinct environments and outperform several state-of-the-art Q-learning algorithms in terms of accuracy, complexity, and robustness. We herein conduct a comprehensive probabilistic coverage analysis to ensure optimal data coverage conditions for MEMQ algorithms. First, we derive upper and lower bounds on the expectation and variance of different coverage coefficients (CC) for MEMQ algorithms. Leveraging these bounds, we develop a simple way of comparing the utilities of multiple environments in MEMQ algorithms. This approach appears to be near optimal versus our previously proposed partial ordering approach. We also present a novel CC-based MEMQ algorithm to improve the accuracy and complexity of existing MEMQ algorithms. Numerical experiments are conducted using random network graphs with four different graph properties. Our algorithm can reduce the average policy error (APE) by 65% compared to partial ordering and is 95% faster than the exhaustive search. It also achieves 60% less APE than several state-of-the-art reinforcement learning and prior MEMQ algorithms. Additionally, we numerically verify the theoretical results and show their scalability with the action-space size.

Related papers

Generative Multi-Agent Q-Learning for Policy Optimization: Decentralized Wireless Networks [18.035417008213077]
We propose a novel multi-agent MEMQ (M-MEMQ) for cooperative decentralized wireless networks. In uncoordinated states, TXs act independently to minimize their individual costs and update local Q-functions. M-MEMQ achieves 55% lower average policy error (APE), 35% faster convergence, 50% reduced runtime complexity, and 45% less sample complexity.
arXiv Detail & Related papers (2025-03-07T22:48:35Z)
Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis [30.713243690224207]
In Markov decision processes (MDPs), quantile risk measures such as Value-at-Risk are a standard metric for modeling RL agents' preferences for certain outcomes. This paper proposes a new Q-learning algorithm for quantile optimization in MDPs with strong convergence and performance guarantees.
arXiv Detail & Related papers (2024-10-31T16:53:20Z)
Coverage Analysis of Multi-Environment Q-Learning Algorithms for Wireless Network Optimization [18.035417008213077]
Recent advancements include ensemble multi-environment hybrid Q-learning algorithms. We show that our algorithm can achieve 50% less policy error and 40% less runtime complexity than state-of-the-art reinforcement learning algorithms.
arXiv Detail & Related papers (2024-08-29T20:09:20Z)
Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization [21.30645601474163]
Original Q-learning suffers from performance and complexity challenges across very large networks. A new model-free ensemble reinforcement learning algorithm that adapts classical Q-learning is proposed to handle these challenges. Numerical results show that the proposed algorithm can achieve up to 55% less average policy error with up to 50% less runtime complexity.
arXiv Detail & Related papers (2024-02-08T08:08:23Z)
Multi-Dimensional Ability Diagnosis for Machine Learning Algorithms [88.93372675846123]
We propose a task-agnostic evaluation framework Camilla for evaluating machine learning algorithms. We use cognitive diagnosis assumptions and neural networks to learn the complex interactions among algorithms, samples and the skills of each sample. In our experiments, Camilla outperforms state-of-the-art baselines on the metric reliability, rank consistency and rank stability.
arXiv Detail & Related papers (2023-07-14T03:15:56Z)
SDQ: Stochastic Differentiable Quantization with Mixed Precision [46.232003346732064]
We present a novel Stochastic Differentiable Quantization (SDQ) method that can automatically learn the MPQ strategy. After the optimal MPQ strategy is acquired, we train our network with entropy-aware bin regularization and knowledge distillation. SDQ outperforms all state-of-the-art mixed or single precision quantization methods with a lower bitwidth.
arXiv Detail & Related papers (2022-06-09T12:38:18Z)
A survey on multi-objective hyperparameter optimization algorithms for Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms. We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both. We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
arXiv Detail & Related papers (2021-11-23T10:22:30Z)
Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity [0.0]
Maxmin and Ensemble Q-learning algorithms have used different estimates provided by the ensembles of learners to reduce the overestimation bias. Unfortunately, these learners can converge to the same point in the parametric or representation space, falling back to the classic single neural network DQN. We propose and compare five regularization functions inspired by economics theory and consensus optimization.
arXiv Detail & Related papers (2020-06-24T15:53:20Z)
Iterative Algorithm Induced Deep-Unfolding Neural Networks: Precoding Design for Multiuser MIMO Systems [59.804810122136345]
We propose a framework for deep-unfolding, where a general form of iterative algorithm induced deep-unfolding neural network (IAIDNN) is developed. An efficient IAIDNN based on the structure of the classic weighted minimum mean-square error (WMMSE) iterative algorithm is developed. We show that the proposed IAIDNN efficiently achieves the performance of the iterative WMMSE algorithm with reduced computational complexity.
arXiv Detail & Related papers (2020-06-15T02:57:57Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model. Our algorithm requires far fewer communication rounds in theory. Our experiments on several datasets show the effectiveness of the algorithm and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms. We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.