Formalizing the Generalization-Forgetting Trade-off in Continual
Learning
- URL: http://arxiv.org/abs/2109.14035v1
- Date: Tue, 28 Sep 2021 20:39:04 GMT
- Title: Formalizing the Generalization-Forgetting Trade-off in Continual
Learning
- Authors: Krishnan Raghavan, Prasanna Balaprakash
- Abstract summary: We model the trade-off between catastrophic forgetting and generalization as a two-player sequential game.
We show theoretically that a balance point between the two players exists for each task and that this point is stable.
Next, we introduce balanced continual learning (BCL), which is designed to attain balance between generalization and forgetting.
- Score: 1.370633147306388
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We formulate the continual learning (CL) problem via dynamic programming and
model the trade-off between catastrophic forgetting and generalization as a
two-player sequential game. In this approach, player 1 maximizes the cost due
to lack of generalization whereas player 2 minimizes the cost due to
catastrophic forgetting. We show theoretically that a balance point between the
two players exists for each task and that this point is stable (once the
balance is achieved, the two players stay at the balance point). Next, we
introduce balanced continual learning (BCL), which is designed to attain
balance between generalization and forgetting and empirically demonstrate that
BCL is comparable to or better than the state of the art.
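The two-player structure can be made concrete with a minimal sketch, assuming gradient-based alternating updates (the perturbation variable, cost definitions, and replay memory below are illustrative assumptions, not the authors' exact BCL algorithm):

```python
import torch
import torch.nn.functional as F

# Minimal sketch (not the authors' exact BCL update rules): treat the trade-off
# as a saddle-point problem min_w max_p [J_gen(w, p) + J_forget(w)], where
# player 1 controls a perturbation p that exposes lack of generalization and
# player 2 controls the network weights w.
torch.manual_seed(0)
model = torch.nn.Linear(10, 2)
p = torch.zeros(1, 10, requires_grad=True)            # player 1's variable
opt_w = torch.optim.SGD(model.parameters(), lr=1e-2)  # player 2's optimizer
opt_p = torch.optim.SGD([p], lr=1e-2)                 # player 1's optimizer

x_new, y_new = torch.randn(32, 10), torch.randint(0, 2, (32,))  # current task
x_mem, y_mem = torch.randn(32, 10), torch.randint(0, 2, (32,))  # replay memory

for _ in range(100):
    # Player 1 ascends: increase the generalization cost via the perturbation
    # (backward on the negated loss turns the SGD step into ascent).
    opt_p.zero_grad()
    (-F.cross_entropy(model(x_new + p), y_new)).backward()
    opt_p.step()

    # Player 2 descends: reduce forgetting plus the perturbed task cost.
    opt_w.zero_grad()
    loss = F.cross_entropy(model(x_new + p.detach()), y_new) \
         + F.cross_entropy(model(x_mem), y_mem)
    loss.backward()
    opt_w.step()
```

In this reading, the balance point described in the abstract would correspond to a saddle point of the combined cost, at which neither player can improve by deviating.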
Related papers
- No-regret learning in harmonic games: Extrapolation in the face of conflicting interests [45.94247914236653]
We show that learning converges to a Nash equilibrium from any initial condition, and all players are guaranteed at most $O(1)$ regret.
These results provide an in-depth understanding of no-regret learning in harmonic games.
arXiv Detail & Related papers (2024-12-28T16:28:13Z) - Corrupted Learning Dynamics in Games [62.73758165845971]
An equilibrium can be computed at a fast rate of $O(\log T)$ when all players follow optimistic follow-the-regularized-leader (OFTRL); a minimal sketch of the optimistic FTRL update appears after this list.
We present corrupted learning dynamics that adaptively find an equilibrium at a rate that depends on the extent to which each player deviates from the strategy suggested by the prescribed algorithm.
arXiv Detail & Related papers (2024-12-10T02:23:44Z) - Barriers to Welfare Maximization with No-Regret Learning [68.66209476382213]
We prove lower bounds for computing a near-optimal $T$-sparse CCE.
In particular, we show that the inapproximability of maximum clique precludes attaining any non-trivial sparsity in polynomial time.
arXiv Detail & Related papers (2024-11-04T00:34:56Z) - $\widetilde{O}(T^{-1})$ Convergence to (Coarse) Correlated Equilibria in Full-Information General-Sum Markov Games [8.215874655947335]
We show that an optimistic follow-the-regularized-leader algorithm can find $\widetilde{O}(T^{-1})$-approximate (coarse) correlated equilibria in full-information general-sum Markov games within $T$ iterations.
arXiv Detail & Related papers (2024-02-02T20:40:27Z) - Optimistic Policy Gradient in Multi-Player Markov Games with a Single
Controller: Convergence Beyond the Minty Property [89.96815099996132]
We develop a new framework to characterize optimistic policy gradient methods in multi-player games with a single controller.
Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.
arXiv Detail & Related papers (2023-12-19T11:34:10Z) - Differentiable Arbitrating in Zero-sum Markov Games [59.62061049680365]
We study how to perturb the reward in a zero-sum Markov game with two players to induce a desirable Nash equilibrium, namely arbitrating.
The lower level requires solving the Nash equilibrium under a given reward function, which makes the overall problem challenging to optimize in an end-to-end way.
We propose a backpropagation scheme that differentiates through the Nash equilibrium, which provides the gradient feedback for the upper level; a sketch of differentiating through an unrolled equilibrium solver appears after this list.
arXiv Detail & Related papers (2023-02-20T16:05:04Z) - How Bad is Selfish Driving? Bounding the Inefficiency of Equilibria in
Urban Driving Games [64.71476526716668]
We study the (in)efficiency of any equilibrium the players might agree to play.
We obtain guarantees that refine existing bounds on the Price of Anarchy.
Although the obtained guarantees concern open-loop trajectories, we observe efficient equilibria even when agents employ closed-loop policies.
arXiv Detail & Related papers (2022-10-24T09:32:40Z) - Learning Correlated Equilibria in Mean-Field Games [62.14589406821103]
We develop the concepts of Mean-Field correlated and coarse-correlated equilibria.
We show that they can be efficiently learnt in all games, without requiring any additional assumption on the structure of the game.
arXiv Detail & Related papers (2022-08-22T08:31:46Z) - Survival of the strictest: Stable and unstable equilibria under
regularized learning with partial information [32.384868685390906]
We examine the Nash equilibrium convergence properties of no-regret learning in general N-player games.
We establish a comprehensive equivalence between the stability of a Nash equilibrium and its support.
This equivalence provides a clear refinement criterion for predicting the day-to-day behavior of no-regret learning in games.
arXiv Detail & Related papers (2021-01-12T18:55:11Z) - Game-theoretic Models of Moral and Other-Regarding Agents [0.0]
We highlight a number of problems with such equilibria, including computational intractability, a high price of miscoordination, and expensive/problematic extension to general normal form games.
We propose some general, intuitive, computationally tractable, other-regarding equilibria related to Kantian equilibria, as well as a class of courses of action that interpolates between purely self-regarding and Kantian behavior.
arXiv Detail & Related papers (2020-12-17T17:16:50Z) - Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior.
We develop and advocate for this hindsight framing of learning in general sequential decision-making settings.
We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z) - No-regret learning and mixed Nash equilibria: They do not mix [64.37511607254115]
We study the dynamics of "follow-the-regularized-leader" (FTRL) learning in games.
We show that any Nash equilibrium which is not strict cannot be stable and attracting under FTRL.
This result has significant implications for predicting the outcome of a learning process.
arXiv Detail & Related papers (2020-10-19T13:49:06Z)
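Several entries above concern (optimistic) follow-the-regularized-leader dynamics. A minimal sketch of optimistic FTRL with entropy regularization in a two-player zero-sum matrix game follows; the random payoff matrix, step size, and horizon are illustrative assumptions, and the cited papers analyze more general settings:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

# Zero-sum matrix game: the row player receives A[i, j], the column player -A[i, j].
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

eta = 0.1
Gx, Gy = np.zeros(3), np.zeros(3)  # cumulative payoff vectors per player
gx, gy = np.zeros(3), np.zeros(3)  # last payoff vectors, used as predictions

for t in range(1000):
    # Optimistic FTRL with entropy regularization: best-respond (via softmax)
    # to the cumulative payoffs plus a one-step optimistic prediction.
    x = softmax(eta * (Gx + gx))
    y = softmax(eta * (Gy + gy))
    gx = A @ y        # row player's payoff for each pure action
    gy = -A.T @ x     # column player's payoff for each pure action
    Gx += gx
    Gy += gy
```

The optimistic step reuses the most recent payoff vector as a prediction of the next one; in suitable settings this extra term is what yields the fast rates cited above.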
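For the Differentiable Arbitrating entry, one simple way to differentiate through an equilibrium is to unroll an iterative solver and backpropagate through it; the 2x2 game, target strategy, and averaged mirror-descent solver below are illustrative assumptions, not the paper's exact backpropagation scheme:

```python
import torch

# Upper level: learn a reward perturbation theta so that the induced
# equilibrium of the perturbed zero-sum game matches a target mixed strategy.
A_base = torch.tensor([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies
theta = torch.zeros(2, 2, requires_grad=True)
target = torch.tensor([0.7, 0.3])                   # desired row-player play

def averaged_equilibrium(A, steps=300, eta=0.1):
    # Lower level: mirror-descent (exponentiated-gradient) play; in zero-sum
    # games the time-averaged iterates approximate a Nash equilibrium.
    lx, ly = torch.zeros(2), torch.zeros(2)
    avg_x = torch.zeros(2)
    for _ in range(steps):
        x, y = torch.softmax(lx, dim=0), torch.softmax(ly, dim=0)
        avg_x = avg_x + x / steps
        lx = lx + eta * (A @ y)      # row player ascends on its payoff
        ly = ly - eta * (A.T @ x)    # column player descends
    return avg_x

opt = torch.optim.SGD([theta], lr=0.5)
for _ in range(50):
    opt.zero_grad()
    x = averaged_equilibrium(A_base + theta)
    ((x - target) ** 2).sum().backward()  # gradient flows through the solver
    opt.step()
```

Unrolling trades memory for simplicity; an implicit-differentiation scheme would instead apply the implicit function theorem at the equilibrium.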
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.