Related papers: Forgetting is Everywhere

Forgetting is Everywhere

URL: http://arxiv.org/abs/2511.04666v1
Date: Thu, 06 Nov 2025 18:52:57 GMT
Title: Forgetting is Everywhere
Authors: Ben Sanati, Thomas L. Lee, Trevor McInroe, Aidan Scannell, Nikolay Malkin, David Abel, Amos Storkey,
Abstract summary: We propose an algorithm- and task-agnostic theory that characterises forgetting as a lack of self-consistency in a learner's predictive distribution over future experiences.<n>Our theory naturally yields a general measure of an algorithm's propensity to forget.<n>We empirically demonstrate how forgetting is present across all learning settings and plays a significant role in determining learning efficiency.
Score: 19.22572725623779
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A fundamental challenge in developing general learning algorithms is their tendency to forget past knowledge when adapting to new data. Addressing this problem requires a principled understanding of forgetting; yet, despite decades of study, no unified definition has emerged that provides insights into the underlying dynamics of learning. We propose an algorithm- and task-agnostic theory that characterises forgetting as a lack of self-consistency in a learner's predictive distribution over future experiences, manifesting as a loss of predictive information. Our theory naturally yields a general measure of an algorithm's propensity to forget. To validate the theory, we design a comprehensive set of experiments that span classification, regression, generative modelling, and reinforcement learning. We empirically demonstrate how forgetting is present across all learning settings and plays a significant role in determining learning efficiency. Together, these results establish a principled understanding of forgetting and lay the foundation for analysing and improving the information retention capabilities of general learning algorithms.

Related papers

Beyond Statistical Learning: Exact Learning Is Essential for General Intelligence [59.07578850674114]
Sound deductive reasoning is an indisputably desirable aspect of general intelligence.<n>It is well-documented that even the most advanced frontier systems regularly and consistently falter on easily-solvable reasoning tasks.<n>We argue that their unsound behavior is a consequence of the statistical learning approach powering their development.
arXiv Detail & Related papers (2025-06-30T14:37:50Z)
Learning principle and mathematical realization of the learning mechanism in the brain [0.0]
We call it learning principle, and it follows that all learning is equivalent to estimating the probability of input data. We show that conventional supervised learning is equivalent to estimating conditional probabilities, and succeeded in making supervised learning more effective and generalized. We propose a new method of defining the values of estimated probability using differentiation, and show that unsupervised learning can be performed on arbitrary dataset without any prior knowledge.
arXiv Detail & Related papers (2023-11-22T12:08:01Z)
Ticketed Learning-Unlearning Schemes [57.89421552780526]
We propose a new ticketed model for learning--unlearning. We provide space-efficient ticketed learning--unlearning schemes for a broad family of concept classes.
arXiv Detail & Related papers (2023-06-27T18:54:40Z)
To Compress or Not to Compress- Self-Supervised Learning and Information Theory: A Review [30.87092042943743]
Deep neural networks excel in supervised learning tasks but are constrained by the need for extensive labeled data. Self-supervised learning emerges as a promising alternative, allowing models to learn without explicit labels. Information theory, and notably the information bottleneck principle, has been pivotal in shaping deep neural networks.
arXiv Detail & Related papers (2023-04-19T00:33:59Z)
A Comprehensive Survey of Continual Learning: Theory, Method and Application [64.23253420555989]
We present a comprehensive survey of continual learning, seeking to bridge the basic settings, theoretical foundations, representative methods, and practical applications. We summarize the general objectives of continual learning as ensuring a proper stability-plasticity trade-off and an adequate intra/inter-task generalizability in the context of resource efficiency.
arXiv Detail & Related papers (2023-01-31T11:34:56Z)
Understanding Self-Predictive Learning for Reinforcement Learning [61.62067048348786]
We study the learning dynamics of self-predictive learning for reinforcement learning. We propose a novel self-predictive algorithm that learns two representations simultaneously.
arXiv Detail & Related papers (2022-12-06T20:43:37Z)
Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain. It tackles the problem from two aspects: extracting knowledge and memorizing knowledge. It is theoretically analyzed that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z)
Towards a General Pre-training Framework for Adaptive Learning in MOOCs [37.570119583573955]
We propose a unified framework based on data observation and learning style analysis, properly leveraging heterogeneous learning elements. We find that course structures, text, and knowledge are helpful for modeling and inherently coherent to student non-sequential learning behaviors.
arXiv Detail & Related papers (2022-07-18T13:18:39Z)
Towards a theory of out-of-distribution learning [23.878004729029644]
We propose a chronological approach to defining different learning tasks using the provably approximately correct (PAC) learning framework. We will start with in-distribution learning and progress to recently proposed lifelong or continual learning. Our hope is that this work will inspire a universally agreed-upon approach to quantifying different types of learning.
arXiv Detail & Related papers (2021-09-29T15:35:16Z)
Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms [91.3755431537592]
We analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression. We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice.
arXiv Detail & Related papers (2021-01-26T17:11:40Z)
A Theory of Universal Learning [26.51949485387526]
We show that there are only three possible rates of universal learning. We show that the learning curves of any given concept class decay either at an exponential, or arbitrarily slow rates.
arXiv Detail & Related papers (2020-11-09T15:10:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.