Related papers: GenCos' Behaviors Modeling Based on Q Learning Improved by Dichotomy

GenCos' Behaviors Modeling Based on Q Learning Improved by Dichotomy

URL: http://arxiv.org/abs/2008.01536v1
Date: Tue, 4 Aug 2020 13:48:09 GMT
Title: GenCos' Behaviors Modeling Based on Q Learning Improved by Dichotomy
Authors: Qiangang Jia, Zhaoyu Hu, Yiyan Li, Zheng Yan, Sijie Chen
Abstract summary: A novel Q learning algorithm is proposed in this paper. It modifies the update process of the Q table by dichotomizing the state space and the action space step by step. Simulation results in a repeated Cournot game show the effectiveness of the proposed algorithm.
Score: 3.14969586104215
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Q learning is widely used to simulate the behaviors of generation companies (GenCos) in an electricity market. However, existing Q learning method usually requires numerous iterations to converge, which is time-consuming and inefficient in practice. To enhance the calculation efficiency, a novel Q learning algorithm improved by dichotomy is proposed in this paper. This method modifies the update process of the Q table by dichotomizing the state space and the action space step by step. Simulation results in a repeated Cournot game show the effectiveness of the proposed algorithm.

Related papers

Preventing Local Pitfalls in Vector Quantization via Optimal Transport [77.15924044466976]
We introduce OptVQ, a novel vector quantization method that employs the Sinkhorn algorithm to optimize the optimal transport problem. Our experiments on image reconstruction tasks demonstrate that OptVQ achieves 100% codebook utilization and surpasses current state-of-the-art VQNs in reconstruction quality.
arXiv Detail & Related papers (2024-12-19T18:58:14Z)
Two-Step Q-Learning [0.0]
The paper proposes a novel off-policy two-step Q-learning algorithm, without importance sampling. Numerical experiments demonstrate the superior performance of both the two-step Q-learning and its smooth variants.
arXiv Detail & Related papers (2024-07-02T15:39:00Z)
Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE [68.6018458996143]
We propose a more general dynamic network that can combine both quantization and early exit dynamic network: QuEE. Our algorithm can be seen as a form of soft early exiting or input-dependent compression. The crucial factor of our approach is accurate prediction of the potential accuracy improvement achievable through further computation.
arXiv Detail & Related papers (2024-06-20T15:25:13Z)
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs [50.75812033462294]
We bridge the gap between practical success of Q-learning and pessimistic theoretical results. We present novel methods Q-Rex and Q-RexDaRe. We show that Q-Rex efficiently finds the optimal policy for linear MDPs.
arXiv Detail & Related papers (2021-10-16T01:47:41Z)
Smooth Q-learning: Accelerate Convergence of Q-learning Using Similarity [2.088376060651494]
The similarity between different states and actions is considered in the proposed method. During the training, a new updating mechanism is used, in which the Q value of the similar state-action pairs are updated synchronously.
arXiv Detail & Related papers (2021-06-02T13:05:24Z)
Self-correcting Q-Learning [14.178899938667161]
We introduce a new way to address the bias in the form of a "self-correcting algorithm" Applying this strategy to Q-learning results in Self-correcting Q-learning. We show theoretically that this new algorithm enjoys the same convergence guarantees as Q-learning while being more accurate.
arXiv Detail & Related papers (2020-12-02T11:36:24Z)
Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm, aim at alleviating the well-known overestimation problem in value-based reinforcement learning methods. Our algorithm builds on double Q-learning, by maintaining a set of parallel models and estimate the Q-value based on a randomly selected network.
arXiv Detail & Related papers (2020-09-29T04:58:17Z)
Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient [51.880464915253924]
Deep Q-learning algorithms often suffer from poor gradient estimations with an excessive variance. This paper introduces the framework for updating the gradient estimates in deep Q-learning, achieving a novel algorithm called SRG-DQN.
arXiv Detail & Related papers (2020-07-25T00:54:20Z)
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent [47.3692506462581]
We first characterize the convergence rate for Q-AMSGrad, which is the Q-learning algorithm with AMSGrad update. To further improve the performance, we propose to incorporate the momentum restart scheme to Q-AMSGrad, resulting in the so-called Q-AMSGradR algorithm. Our experiments on a linear quadratic regulator problem show that the two proposed Q-learning algorithms outperform the vanilla Q-learning with SGD updates.
arXiv Detail & Related papers (2020-07-15T01:11:43Z)
AdaS: Adaptive Scheduling of Stochastic Gradients [50.80697760166045]
We introduce the notions of textit"knowledge gain" and textit"mapping condition" and propose a new algorithm called Adaptive Scheduling (AdaS) Experimentation reveals that, using the derived metrics, AdaS exhibits: (a) faster convergence and superior generalization over existing adaptive learning methods; and (b) lack of dependence on a validation set to determine when to stop training.
arXiv Detail & Related papers (2020-06-11T16:36:31Z)
Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping [4.5497948012757865]
We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping. The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function. Our method outperforms the state-of-the-art algorithm sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and randomized environments.
arXiv Detail & Related papers (2020-01-15T19:13:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.