FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning
- URL: http://arxiv.org/abs/2601.13578v1
- Date: Tue, 20 Jan 2026 04:05:13 GMT
- Title: FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning
- Authors: Qian Feng, JiaHang Tu, Mintong Kang, Hanbin Zhao, Chao Zhang, Hui Qian
- Abstract summary: Existing methods suppress parameters or confuse knowledge without explicit constraints at both the feature and gradient levels. We propose FG-OrIU (Feature-Gradient Orthogonality for Incremental Unlearning). It decomposes feature spaces via Singular Value Decomposition (SVD), separating forgetting and remaining class features into distinct subspaces.
- Score: 24.195588298488314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Incremental unlearning (IU) is critical for pre-trained models to comply with sequential data deletion requests, yet existing methods primarily suppress parameters or confuse knowledge without explicit constraints at both the feature and gradient levels, resulting in superficial forgetting, where residual information remains recoverable. This incomplete forgetting risks security breaches and disrupts retention balance, especially in IU scenarios. We propose FG-OrIU (Feature-Gradient Orthogonality for Incremental Unlearning), the first framework to unify orthogonal constraints at both the feature and gradient levels to achieve deep forgetting, where the forgetting effect is irreversible. FG-OrIU decomposes feature spaces via Singular Value Decomposition (SVD), separating forgetting and remaining class features into distinct subspaces. It then enforces dual constraints: feature orthogonal projection on both forgetting and remaining classes, and gradient orthogonal projection that prevents the reintroduction of forgotten knowledge and disruption to remaining classes during updates. Additionally, dynamic subspace adaptation merges newly forgetting subspaces and contracts remaining subspaces, ensuring a stable balance between removal and retention across sequential unlearning tasks. Extensive experiments demonstrate the effectiveness of our method.
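The two core operations the abstract names, SVD-based subspace separation and orthogonal projection of features and gradients, can be pictured with a minimal PyTorch sketch. All function names, shapes, and the rank cutoff below are illustrative assumptions, not the authors' implementation.

```python
import torch

# Minimal sketch of subspace separation and orthogonal projection, assuming
# flattened features/gradients; not the FG-OrIU reference code.

def subspace_basis(features: torch.Tensor, rank: int) -> torch.Tensor:
    """features: (n_samples, dim) activations of one class group.
    Returns a (dim, rank) orthonormal basis via SVD of the feature matrix."""
    U, _, _ = torch.linalg.svd(features.T, full_matrices=False)
    return U[:, :rank]

def project_onto(x: torch.Tensor, basis: torch.Tensor) -> torch.Tensor:
    """Feature orthogonal projection: keep only the component in span(basis)."""
    return basis @ (basis.T @ x)

def project_out(grad: torch.Tensor, basis: torch.Tensor) -> torch.Tensor:
    """Gradient orthogonal projection: remove the component in span(basis),
    so updates cannot re-enter a forgetting subspace or disturb a remaining one."""
    return grad - basis @ (basis.T @ grad)
```

Under the abstract's dynamic subspace adaptation, the forgetting basis of each new task would be merged with previously forgotten subspaces and the remaining basis contracted, so the same projections stay valid across sequential deletion requests.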
Related papers
- FIT: Defying Catastrophic Forgetting in Continual LLM Unlearning [23.471857001200465]
FIT is a framework for continual unlearning that handles large numbers of deletion requests. FIT mitigates degradation through rigorous data Filtering, Importance-aware updates, and Targeted layer attribution. FIT surpasses existing methods on MMLU, CommonsenseQA, and GSM8K, and remains resistant against both relearning and quantization recovery attacks.
arXiv Detail & Related papers (2026-01-29T13:15:32Z)
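The acronym's three components suggest a pipeline: filter deletion requests, weight updates by parameter importance on retained data, and restrict edits to attributed layers. A hedged sketch of the importance-aware step, where the Fisher-style importance dictionary and layer-name matching are assumptions, not FIT's actual code:

```python
import torch

# Hypothetical importance-aware unlearning step; `importance` and `targeted`
# are assumed inputs, not FIT's interfaces.

def importance_weighted_step(model, forget_loss, importance, lr=1e-4, targeted=None):
    """importance: dict param_name -> per-weight importance on retain data.
    targeted: optional set of layer-name substrings attributed to the forget set."""
    model.zero_grad()
    forget_loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            if targeted is not None and not any(t in name for t in targeted):
                continue  # targeted layer attribution: skip unattributed layers
            # Damp updates on weights that matter for retained knowledge.
            scale = 1.0 / (1.0 + importance[name])
            p -= lr * scale * p.grad
```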
- Geometric-Disentanglement Unlearning [106.99160454669902]
Gradient ascent on forget samples often harms retained knowledge. We propose Geometric-Disentanglement Unlearning (GU), which decomposes any candidate forget gradient update into components tangential and normal to the retain space and executes only the normal component. Our method is plug-and-play and can be attached to existing gradient-based unlearning procedures to mitigate side effects.
arXiv Detail & Related papers (2025-11-21T09:58:25Z)
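The tangential/normal decomposition can be sketched as projecting the forget gradient off the subspace spanned by retain gradients. The basis construction below (QR on a batch of retain gradients) is an assumption for illustration, not necessarily GU's exact procedure.

```python
import torch

# Minimal sketch of the normal-component update described above.

def normal_component(forget_grad: torch.Tensor, retain_grads: torch.Tensor) -> torch.Tensor:
    """forget_grad: (dim,); retain_grads: (k, dim) gradients on retained data.
    Returns the part of forget_grad orthogonal (normal) to the retain space,
    so applying it leaves the retain loss unchanged to first order."""
    Q, _ = torch.linalg.qr(retain_grads.T)   # (dim, k) orthonormal retain basis
    tangential = Q @ (Q.T @ forget_grad)     # component inside the retain space
    return forget_grad - tangential          # execute only this normal part
```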
- Retrofit: Continual Learning with Bounded Forgetting for Security Applications [25.185616916987158]
We propose RETROFIT, a continual learning method that requires no retrospective data and achieves bounded forgetting for effective knowledge transfer. To mitigate interference, we apply low-rank and sparse updates that confine parameter changes to independent subspaces. In malware detection under temporal drift, it substantially improves the retention score, from 20.2% to 38.6% over CL baselines, and exceeds the oracle upper bound on new data.
arXiv Detail & Related papers (2025-11-14T16:07:03Z)
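The low-rank-plus-sparse update can be pictured as parameterizing the weight delta as a rank-r product plus a sparsely masked matrix. The shapes, rank, and magnitude-based sparsity rule below are illustrative assumptions, not RETROFIT's exact design.

```python
import torch

# Illustrative low-rank + sparse delta applied to a frozen weight.

def apply_low_rank_sparse(W: torch.Tensor, A: torch.Tensor, B: torch.Tensor,
                          S: torch.Tensor, sparsity: float = 0.99) -> torch.Tensor:
    """W: (out, in) frozen weight; B: (out, r), A: (r, in) low-rank factors;
    S: (out, in) dense candidate for the sparse component."""
    k = int(S.numel() * (1.0 - sparsity))               # entries to keep
    thresh = S.abs().flatten().kthvalue(S.numel() - k).values
    sparse = torch.where(S.abs() > thresh, S, torch.zeros_like(S))
    return W + B @ A + sparse                           # confined parameter change
```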
- Regularizing Subspace Redundancy of Low-Rank Adaptation [54.473090597164834]
We propose ReSoRA, a method that explicitly models redundancy between mapping subspaces and adaptively regularizes the subspace redundancy of low-rank adaptation. Our proposed method consistently improves existing state-of-the-art PETL methods across various backbones and datasets in vision-language retrieval and standard visual classification benchmarks. As a training supervision, ReSoRA can be seamlessly integrated into existing approaches in a plug-and-play manner, with no additional inference costs.
arXiv Detail & Related papers (2025-07-28T11:52:56Z)
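One way to picture a subspace-redundancy regularizer is as a penalty on overlap between the row spaces of different low-rank adapters. The specific overlap measure below (squared cosines of principal angles between orthonormalized bases) is an assumption for illustration, not ReSoRA's exact loss.

```python
import torch

# Illustrative redundancy penalty between two LoRA down-projection subspaces.

def subspace_redundancy(A1: torch.Tensor, A2: torch.Tensor) -> torch.Tensor:
    """A1, A2: (r, in) LoRA down-projections. Returns 0 when their row spaces
    are orthogonal, and grows with overlap between the two subspaces."""
    Q1, _ = torch.linalg.qr(A1.T)    # (in, r) orthonormal basis of A1's row space
    Q2, _ = torch.linalg.qr(A2.T)
    # Singular values of Q1^T Q2 are cosines of the principal angles.
    return (Q1.T @ Q2).pow(2).sum()
```

Added to the training loss as an auxiliary term, such a penalty pushes adapters toward complementary subspaces without touching the inference path.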
- Constraint-Guided Prediction Refinement via Deterministic Diffusion Trajectories [7.279433512595361]
We propose a general-purpose framework for constraint-aware guided denoising built on deterministic diffusion (DDIM) trajectories. Starting from an initial prediction, our method iteratively refines it along a diffusion trajectory guided by a learned prior and augmented by constraint corrections.
arXiv Detail & Related papers (2025-06-15T17:02:07Z)
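The refine-then-correct loop can be sketched as alternating a deterministic denoising step with a correction toward the constraint set. The denoiser interface and projection operator below are assumed interfaces, not the paper's API.

```python
import torch

# Illustrative constraint-corrected, DDIM-style refinement loop.

def refine(x: torch.Tensor, denoise_step, project_constraints, num_steps: int = 50):
    """x: initial prediction tensor. Each iteration moves x along the learned
    deterministic trajectory, then snaps it back toward the feasible set."""
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t)        # deterministic (DDIM-like) prior update
        x = project_constraints(x)    # constraint correction
    return x
```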
- Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models [14.321060805197874]
Large Language Models (LLMs) deployed in real-world settings increasingly face the need to unlearn sensitive, outdated, or proprietary information. Existing unlearning methods formulate forgetting and retention as a regularized trade-off, combining both objectives into a single scalarized loss. We propose a new formulation of LLM unlearning as a constrained optimization problem: forgetting is enforced via a novel logit-margin flattening loss.
arXiv Detail & Related papers (2025-06-05T17:55:23Z)
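A margin-flattening forget loss can be sketched as penalizing the gap between the top logit and the mean logit, pushing the model toward an uninformative distribution on forget samples. The exact form below is an assumption, not the paper's objective.

```python
import torch

# Illustrative logit-margin flattening loss on forget samples.

def margin_flattening_loss(logits: torch.Tensor) -> torch.Tensor:
    """logits: (batch, vocab). Shrinking the margin between the max logit and
    the mean logit flattens the predictive distribution on forget data."""
    top = logits.max(dim=-1).values
    mean = logits.mean(dim=-1)
    return (top - mean).mean()
```

In the constrained, primal-dual framing the title describes, such a forget loss would be minimized subject to a bound on the retain loss, e.g. via a Lagrangian of the form forget_loss + lambda * (retain_loss - epsilon) with dual ascent on lambda.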
- Continuous Knowledge-Preserving Decomposition with Adaptive Layer Selection for Few-Shot Class-Incremental Learning [73.59672160329296]
CKPD-FSCIL is a unified framework that unlocks the underutilized capacity of pretrained weights. Our method consistently outperforms state-of-the-art approaches in both adaptability and knowledge retention.
arXiv Detail & Related papers (2025-01-09T07:18:48Z)
- OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning [57.43911113915546]
Few-Shot Class-Incremental Learning (FSCIL) introduces a paradigm in which the problem space expands with limited data.
FSCIL methods inherently face the challenge of catastrophic forgetting as data arrives incrementally.
We propose the OrCo framework, built on two core principles: feature orthogonality in the representation space, and contrastive learning.
arXiv Detail & Related papers (2024-03-27T13:30:48Z)
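Feature orthogonality across classes can be sketched as a penalty on pairwise cosine similarity between class-mean features; the specific loss below is an illustrative assumption, not OrCo's exact formulation.

```python
import torch
import torch.nn.functional as F

# Illustrative class-orthogonality penalty over class-mean features.

def orthogonality_loss(class_means: torch.Tensor) -> torch.Tensor:
    """class_means: (num_classes, dim). Penalizes off-diagonal cosine
    similarities so class directions become mutually orthogonal."""
    M = F.normalize(class_means, dim=-1)
    G = M @ M.T                                 # Gram matrix of cosines
    off_diag = G - torch.eye(G.size(0))
    return off_diag.pow(2).sum() / (G.numel() - G.size(0))
```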
- Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy while using zero exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z)
- GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy.
Recent studies find that an attacker can invert the shared gradients and recover sensitive data from an FL system by leveraging pre-trained generative adversarial networks (GANs) as prior knowledge.
We propose Gradient Inversion over Feature Domains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
arXiv Detail & Related papers (2023-08-09T04:34:21Z)
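Gradient inversion with a generative prior can be sketched as optimizing the generator's intermediate features so that the gradients induced by the reconstructed image match the shared ones. The interfaces below (`generator_tail`, `compute_grads`) are assumed for illustration, not GIFD's code.

```python
import torch

# Illustrative feature-domain gradient-matching loop. `compute_grads` is
# assumed to return gradients built with create_graph=True, so the matching
# loss is differentiable with respect to the features.

def invert(shared_grads, generator_tail, compute_grads, feat,
           steps: int = 200, lr: float = 0.01):
    """feat: intermediate GAN features to optimize. Returns a reconstruction
    whose induced gradients approximate the shared gradients."""
    feat = feat.clone().requires_grad_(True)
    opt = torch.optim.Adam([feat], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = generator_tail(feat)              # decode features to an image
        dummy = compute_grads(x)              # gradients the server would see
        loss = sum((g - s).pow(2).sum() for g, s in zip(dummy, shared_grads))
        loss.backward()
        opt.step()
    return generator_tail(feat).detach()
```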
- Beyond the Edge of Stability via Two-step Gradient Updates [49.03389279816152]
Gradient Descent (GD) is a powerful workhorse of modern machine learning.
GD's ability to find local minimisers is only guaranteed for losses with Lipschitz gradients.
This work focuses on simple, yet representative, learning problems via analysis of two-step gradient updates.
arXiv Detail & Related papers (2022-06-08T21:32:50Z)
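The two-step viewpoint can be made concrete on a toy loss: one-step fixed-point analysis predicts instability once the local curvature exceeds 2/lr, yet the iterates settle into a period-2 orbit that only the two-step map sees as stable. A small numerical check, where the quartic loss is an illustrative choice rather than the paper's exact setting:

```python
# Toy check on f(x) = (x^2 - 1)^2 / 4, so f'(x) = x^3 - x. The minimum x = 1
# has curvature f''(1) = 2, so classical one-step stability needs lr < 1.
# With lr = 1.1 the minimum is unstable, yet GD converges to a stable
# period-2 orbit straddling it.

def gd_step(x, lr=1.1):
    return x - lr * (x**3 - x)

x = 1.2
for _ in range(200):
    x = gd_step(x)
odd, even = x, gd_step(x)
print(odd, even)   # approx. 1.116 and 0.816, alternating around x = 1
```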
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.