Memory Undone: Between Knowing and Not Knowing in Data Systems
- URL: http://arxiv.org/abs/2602.21180v1
- Date: Tue, 24 Feb 2026 18:29:17 GMT
- Title: Memory Undone: Between Knowing and Not Knowing in Data Systems
- Authors: Viktoriia Makovska, George Fletcher, Julia Stoyanovich, Tetiana Zakharchenko,
- Abstract summary: We show how forgetting can simultaneously protect rights and enable silencing. We propose reframing unlearning as a first-class capability in knowledge infrastructures.
- Score: 14.165847961943193
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning and data systems increasingly function as infrastructures of memory: they ingest, store, and operationalize traces of personal, political, and cultural life. Yet contemporary governance demands credible forms of forgetting, from GDPR-backed deletion to harm-mitigation and the removal of manipulative content, while technical infrastructures are optimized to retain, replicate, and reuse. This work argues that "forgetting" in computational systems cannot be reduced to a single operation (e.g., record deletion) and should instead be treated as a sociotechnical practice with distinct mechanisms and consequences. We clarify a vocabulary that separates erasure (removing or disabling access to data artifacts), unlearning (interventions that bound or remove a data point's influence on learned parameters and outputs), exclusion (upstream non-collection and omission), and forgetting as an umbrella term spanning agency, temporality, reversibility, and scale. Building on examples from machine unlearning, semantic dependencies in data management, participatory data modeling, and manipulation at scale, we show how forgetting can simultaneously protect rights and enable silencing. We propose reframing unlearning as a first-class capability in knowledge infrastructures, evaluated not only by compliance or utility retention, but by its governance properties: transparency, accountability, and epistemic justice.
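The vocabulary in the abstract (erasure, unlearning, exclusion) can be made concrete with a toy sketch. Everything below is an illustrative assumption and not taken from the paper: the `ToySystem` class, its method names, and the choice of a "model" that is simply the mean of the stored values. Unlearning is shown in its naive exact form, retraining on the remaining data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToySystem:
    """Hypothetical system: a record store plus a 'model' (the mean of stored values)."""
    store: Dict[str, float] = field(default_factory=dict)
    model_mean: Optional[float] = None

    def collect(self, key: str, value: float, consent: bool = True) -> None:
        # Exclusion: without consent the record is never collected at all.
        if consent:
            self.store[key] = value

    def train(self) -> None:
        # "Training" recomputes the mean over whatever is currently stored.
        values = list(self.store.values())
        self.model_mean = sum(values) / len(values) if values else None

    def erase(self, key: str) -> None:
        # Erasure: remove the data artifact; the trained model is left untouched.
        self.store.pop(key, None)

    def unlearn(self, key: str) -> None:
        # Unlearning (naive, exact): erase the record, then retrain on what remains.
        self.erase(key)
        self.train()


system = ToySystem()
system.collect("alice", 10.0)
system.collect("bob", 20.0)
system.collect("carol", 60.0)
system.collect("dave", 100.0, consent=False)  # excluded upstream
system.train()
print(system.model_mean)  # 30.0

system.erase("carol")
print(system.model_mean)  # still 30.0: the record is gone, its influence is not

system.unlearn("carol")
print(system.model_mean)  # 15.0: the influence is removed as well
```

In this sketch, erasure alone leaves the learned parameter unchanged, which is the gap between record deletion and unlearning that the abstract points to.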
Related papers
- Rethinking Benign Relearning: Syntax as the Hidden Driver of Unlearning Failures [6.583686018711596]
We study the phenomenon of benign relearning, in which forgotten information reemerges even from benign fine-tuning data. A common explanation attributes this effect to topical relevance, but we find this account insufficient. We introduce syntactic diversification, which paraphrases the original forget queries into heterogeneous structures prior to unlearning. This approach effectively suppresses benign relearning, accelerates forgetting, and substantially alleviates the trade-off between unlearning efficacy and model utility.
arXiv Detail & Related papers (2026-02-03T10:57:19Z) - Representation Unlearning: Forgetting through Information Compression [3.9189279162842854]
We introduce Representation Unlearning, a framework that performs unlearning directly in the model's representation space. We show that Representation Unlearning achieves more reliable forgetting, better utility retention, and greater computational efficiency than parameter-centric baselines.
arXiv Detail & Related papers (2026-01-29T11:28:02Z) - Pre-Forgettable Models: Prompt Learning as a Native Mechanism for Unlearning [9.512928441517811]
Foundation models have transformed multimedia analysis by enabling robust and transferable representations across diverse modalities and tasks. Traditional unlearning approaches, including retraining, activation editing, or distillation, are often expensive, fragile, and ill-suited for real-time or continuously evolving systems. We introduce a prompt-based learning framework that unifies knowledge acquisition and removal within a single training phase.
arXiv Detail & Related papers (2025-09-05T13:28:04Z) - MemOS: A Memory OS for AI System [116.87568350346537]
Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI). Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods. MemOS is a memory operating system that treats memory as a manageable system resource.
arXiv Detail & Related papers (2025-07-04T17:21:46Z) - GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models [17.83305806604326]
GUARD is a framework for guided unlearning and retention via data attribution. It assigns adaptive, nonuniform unlearning weights to samples, inversely proportional to their proxy attribution scores. We provide rigorous theoretical guarantees that GUARD significantly improves retention while maintaining forgetting metrics comparable to prior methods.
arXiv Detail & Related papers (2025-06-12T17:49:09Z) - Silver Linings in the Shadows: Harnessing Membership Inference for Machine Unlearning [7.557226714828334]
We present a novel unlearning mechanism designed to remove the impact of specific data samples from a neural network.
In achieving this goal, we crafted a novel loss function tailored to eliminate privacy-sensitive information from weights and activation values of the target model.
Our results showcase the superior performance of our approach in terms of unlearning efficacy and latency as well as the fidelity of the primary task.
arXiv Detail & Related papers (2024-07-01T00:20:26Z) - UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI [50.61495097098296]
We revisit the paradigm in which unlearning is used for Large Language Models (LLMs).
We introduce a concept of ununlearning, where unlearned knowledge gets reintroduced in-context.
We argue that content filtering for impermissible knowledge will be required and even exact unlearning schemes are not enough for effective content regulation.
arXiv Detail & Related papers (2024-06-27T10:24:35Z) - The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z) - Unlearn What You Want to Forget: Efficient Unlearning for LLMs [92.51670143929056]
Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data.
This process might suffer from privacy issues and violations of data protection regulations.
We propose an efficient unlearning framework that could efficiently update LLMs without having to retrain the whole model after data removals.
arXiv Detail & Related papers (2023-10-31T03:35:59Z) - Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening [51.34904967046097]
Selective Synaptic Dampening (SSD) is fast, performant, and does not require long-term storage of the training data.
We present a novel two-step, post hoc, retrain-free approach to machine unlearning which is fast, performant, and does not require long-term storage of the training data.
arXiv Detail & Related papers (2023-08-15T11:30:45Z) - Fair Machine Unlearning: Data Removal while Mitigating Disparities [5.724350004671127]
The right to be forgotten is a core principle outlined by the EU's General Data Protection Regulation.
"Forgetting" can be naively achieved by retraining on remaining data.
"Unlearning" impacts other properties critical to real-world applications such as fairness.
arXiv Detail & Related papers (2023-07-27T10:26:46Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first scenarios for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.