Pure Monte Carlo Counterfactual Regret Minimization
- URL: http://arxiv.org/abs/2309.03084v3
- Date: Fri, 13 Oct 2023 06:01:17 GMT
- Title: Pure Monte Carlo Counterfactual Regret Minimization
- Authors: Ju Qi, Ting Feng, Falun Hei, Zhemei Fang, Yunfeng Luo
- Abstract summary: This paper proposes a new algorithm called Pure CFR (PCFR) based on CFR.
It can be seen as a combination of CFR and Fictitious Play (FP), inheriting the concept of counterfactual regret (value) from CFR.
It can significantly reduce the time and space complexity of one iteration.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Counterfactual Regret Minimization (CFR) and its variants are the best
algorithms so far for solving large-scale incomplete information games.
However, we believe that there are two problems with CFR: First, matrix
multiplication is required in CFR iteration, and the time complexity of one
iteration is too high; Secondly, the game characteristics in the real world are
different. Just using one CFR algorithm will not be perfectly suitable for all
game problems.
For these two problems, this paper proposes a new algorithm called Pure CFR
(PCFR) based on CFR. PCFR can be seen as a combination of CFR and Fictitious
Play (FP), inheriting the concept of counterfactual regret (value) from CFR,
and using the best response strategy instead of the regret matching strategy
for the next iteration. This algorithm has three advantages. First, PCFR can be
combined with any CFR variant. The resulting Pure MCCFR (PMCCFR) can
significantly reduce the time and space complexity of one iteration. Secondly,
our experiments show that the convergence speed of the PMCCFR is 2$\sim$3 times
that of the MCCFR. Finally, there is a type of game that is very suitable for
PCFR. We call this type of game clear-game, which is characterized by a high
proportion of dominated strategies. Experiments show that in clear-game, the
convergence rate of PMCCFR is two orders of magnitude higher than that of
MCCFR.
Related papers
- Minimizing Weighted Counterfactual Regret with Optimistic Online Mirror Descent [44.080852682765276]
This work explores minimizing weighted counterfactual regret with optimistic Online Mirror Descent (OMD)
It integrates PCFR+ and Discounted CFR (DCFR) in a principled manner, swiftly mitigating negative effects of dominated actions.
Experimental results demonstrate PDCFR+'s fast convergence in common imperfect-information games.
arXiv Detail & Related papers (2024-04-22T05:37:22Z) - Equivalence Analysis between Counterfactual Regret Minimization and
Online Mirror Descent [67.60077332154853]
Counterfactual Regret Minimization (CFR) is a regret minimization algorithm that minimizes the total regret by minimizing the local counterfactual regrets.
Follow-the-Regularized-Lead (FTRL) and Online Mirror Descent (OMD) algorithms are regret minimization algorithms in Online Convex Optimization.
We provide a new way to analyze and extend CFRs, by proving that CFR with Regret Matching and CFR with Regret Matching+ are special forms of FTRL and OMD.
arXiv Detail & Related papers (2021-10-11T02:12:25Z) - Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z) - NNCFR: Minimize Counterfactual Regret with Neural Networks [4.418221583366099]
This paper introduces textitNeural Network Counterfactual Regret Minimization (NNCFR), an improved variant of textitDeep CFR.
The textitNNCFR converges faster and performs more stable than textitDeep CFR, and outperforms textitDeep CFR with respect to exploitability and head-to-head performance on test games.
arXiv Detail & Related papers (2021-05-26T04:58:36Z) - Model-free Neural Counterfactual Regret Minimization with Bootstrap
Learning [10.816436463322237]
Current CFR algorithms have to approximate the cumulative regrets with neural networks.
A new CFR variant, Recursive CFR, is proposed, in which the cumulative regrets are recovered by Recursive Substitute Values (RSVs)
It is proved the new Recursive CFR converges to a Nash equilibrium.
Experimental results show that the new algorithm can match the state-of-the-art neural CFR algorithms but with less training overhead.
arXiv Detail & Related papers (2020-12-03T12:26:50Z) - Recurrent Feature Reasoning for Image Inpainting [110.24760191732905]
Recurrent Feature Reasoning (RFR) network is mainly constructed by a plug-and-play Recurrent Feature Reasoning module and a Knowledge Consistent Attention (KCA) module.
RFR module recurrently infers the hole boundaries of the convolutional feature maps and then uses them as clues for further inference.
To capture information from distant places in the feature map for RFR, we further develop KCA and incorporate it in RFR.
arXiv Detail & Related papers (2020-08-09T14:40:04Z) - Faster Game Solving via Predictive Blackwell Approachability: Connecting
Regret Matching and Mirror Descent [119.5481797273995]
Follow-the-regularized-leader (FTRL) and online mirror descent (OMD) are the most prevalent regret minimizers in online convex optimization.
We show that RM and RM+ are the algorithms that result from running FTRL and OMD, respectively, to select the halfspace to force at all times in the underlying Blackwell approachability game.
In experiments across 18 common zero-sum extensive-form benchmark games, we show that predictive RM+ coupled with counterfactual regret minimization converges vastly faster than the fastest prior algorithms.
arXiv Detail & Related papers (2020-07-28T16:49:55Z) - Stochastic Regret Minimization in Extensive-Form Games [109.43344748069933]
Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full trees.
We develop a new framework for developing regret minimization methods.
We show extensive experiments on three games, where some variants of our methods outperform MCCFR.
arXiv Detail & Related papers (2020-02-19T23:05:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.