Model Editing with Graph-Based External Memory
- URL: http://arxiv.org/abs/2505.18343v1
- Date: Fri, 23 May 2025 19:57:51 GMT
- Title: Model Editing with Graph-Based External Memory
- Authors: Yash Kumar Atri, Ahmed Alaa, Thomas Hartvigsen
- Abstract summary: We propose a novel framework that leverages hyperbolic geometry and graph neural networks for precise and stable model edits. Experiments on CounterFact, CounterFact+, and MQuAKE with GPT-J and GPT2-XL demonstrate that HYPE significantly enhances edit stability, factual accuracy, and multi-hop reasoning.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large language models (LLMs) have revolutionized natural language processing, yet their practical utility is often limited by persistent issues of hallucinations and outdated parametric knowledge. Although post-training model editing offers a pathway for dynamic updates, existing methods frequently suffer from overfitting and catastrophic forgetting. To tackle these challenges, we propose a novel framework that leverages hyperbolic geometry and graph neural networks for precise and stable model edits. We introduce HYPE (HYperbolic Parameter Editing), which comprises three key components: (i) Hyperbolic Graph Construction, which uses Poincaré embeddings to represent knowledge triples in hyperbolic space, preserving hierarchical relationships and preventing unintended side effects by ensuring that edits to parent concepts do not inadvertently affect child concepts; (ii) Möbius-Transformed Updates, which apply hyperbolic addition to propagate edits while maintaining structural consistency within the hyperbolic manifold, unlike conventional Euclidean updates that distort relational distances; and (iii) Dual Stabilization, which combines gradient masking and periodic GNN parameter resetting to prevent catastrophic forgetting by focusing updates on critical parameters and preserving long-term knowledge. Experiments on CounterFact, CounterFact+, and MQuAKE with GPT-J and GPT2-XL demonstrate that HYPE significantly enhances edit stability, factual accuracy, and multi-hop reasoning.
Related papers
- Words & Weights: Streamlining Multi-Turn Interactions via Co-Adaptation [55.938648534942665]
Test-time policy adaptation for multi-turn interactions (T2PAM) is essential for aligning Large Language Models (LLMs) with dynamic user needs during inference time. We propose ROSA2, a framework that reformulates interaction as a joint optimization problem over the heterogeneous space of Words and Weights.
arXiv Detail & Related papers (2026-03-02T02:16:20Z) - Spectral Characterization and Mitigation of Sequential Knowledge Editing Collapse [44.49646322759214]
We show that a model's general abilities are closely associated with dominant singular directions of pretrained weight matrices. We propose REVIVE, a plug-and-play framework that stabilizes sequential editing by explicitly preserving the dominant singular subspace.
arXiv Detail & Related papers (2026-01-16T07:18:14Z) - Forget Many, Forget Right: Scalable and Precise Concept Unlearning in Diffusion Models [17.91843469884079]
ScaPre is a unified framework tailored for large-scale unlearning. It integrates spectral trace regularization and geometry alignment to stabilize optimization, suppress conflicts, and preserve global structure. It forgets up to 5x more concepts than the best baseline within acceptable quality limits.
arXiv Detail & Related papers (2026-01-06T23:59:17Z) - Energy-Regularized Sequential Model Editing on Hyperspheres [59.47007547581175]
Large language models (LLMs) require constant updates to remain aligned with evolving real-world knowledge. Sequential editing often destabilizes representations and induces catastrophic forgetting. We propose SPHERE (Sparse Projection for Hyperspherical Energy-Regularized Editing), a hyperspherical-energy-driven regularization strategy that stabilizes neuron weight distributions.
arXiv Detail & Related papers (2025-10-01T17:55:43Z) - EntroPE: Entropy-Guided Dynamic Patch Encoder for Time Series Forecasting [50.794700596484894]
We propose EntroPE (Entropy-Guided Dynamic Patch Encoder), a novel, temporally informed framework that dynamically detects transition points via conditional entropy. This preserves temporal structure while retaining the computational benefits of patching. Experiments across long-term forecasting benchmarks demonstrate that EntroPE improves both accuracy and efficiency.
arXiv Detail & Related papers (2025-09-30T12:09:56Z) - MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs [82.34547399693966]
Existing methods for lifelong model editing compromise generalization, interfere with past edits, or fail to scale to long editing sequences. We propose MEMOIR, a novel scalable framework that injects knowledge through a residual memory. MEMOIR confines each edit to a distinct subset of the memory parameters, minimizing interference among edits.
arXiv Detail & Related papers (2025-06-09T16:16:42Z) - EmbodiedOcc++: Boosting Embodied 3D Occupancy Prediction with Plane Regularization and Uncertainty Sampler [43.277357306520145]
This paper introduces EmbodiedOcc++, enhancing the original framework with two key innovations: a Geometry-guided Refinement Module (GRM) that constrains Gaussian updates through plane regularization, and a Semantic-aware Uncertainty Sampler (SUS). Experiments on the EmbodiedOcc-ScanNet benchmark demonstrate that EmbodiedOcc++ achieves state-of-the-art performance across different settings.
arXiv Detail & Related papers (2025-04-13T12:10:49Z) - Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU [50.9588132578029]
This paper investigates machine unlearning in hyperbolic contrastive learning. We adapt Alignment Calibration to MERU, a model that embeds images and text in hyperbolic space to better capture semantic hierarchies. Our approach introduces hyperbolic-specific components including entailment calibration and norm regularization that leverage the unique properties of hyperbolic space.
arXiv Detail & Related papers (2025-03-19T12:47:37Z) - Better Call SAUL: Fluent and Consistent Language Model Editing with Generation Regularization [48.07144492109635]
Large language models need to be updated regularly.
Model editing is challenging as it might also affect knowledge that is unrelated to the new data.
We propose SAUL, a streamlined model editing method that uses sentence concatenation with augmented random facts for generation regularization.
arXiv Detail & Related papers (2024-10-03T12:28:13Z) - From Semantics to Hierarchy: A Hybrid Euclidean-Tangent-Hyperbolic Space Model for Temporal Knowledge Graph Reasoning [1.1372536310854844]
Temporal knowledge graph (TKG) reasoning predicts future events based on historical data.
Existing Euclidean models excel at capturing semantics but struggle with hierarchy.
We propose a novel hybrid geometric space approach that leverages the strengths of both Euclidean and hyperbolic models.
arXiv Detail & Related papers (2024-08-30T10:33:08Z) - Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for multi-round denoising. We introduce a novel quantization framework that includes three strategies. This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z) - Robust Hyperbolic Learning with Curvature-Aware Optimization [7.89323764547292]
Current hyperbolic learning approaches are prone to overfitting and instability, and are computationally expensive. We introduce a novel fine-tunable hyperbolic scaling approach to constrain hyperbolic embeddings and reduce approximation errors. Our approach demonstrates consistent improvements across Computer Vision, EEG classification, and hierarchical metric learning tasks.
arXiv Detail & Related papers (2024-05-22T20:30:14Z) - Towards Continual Learning Desiderata via HSIC-Bottleneck
Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy, despite using zero exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z) - Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action Recognition through Redefined Skeletal Topology Awareness [24.83836008577395]
Graph Convolutional Networks (GCNs) have long defined the state-of-the-art in skeleton-based action recognition.
They tend to optimize the adjacency matrix jointly with the model weights.
This process causes a gradual decay of bone connectivity data, culminating in a model indifferent to the very topology it sought to map.
We propose an innovative pathway that encodes bone connectivity by harnessing the power of graph distances.
arXiv Detail & Related papers (2023-05-19T06:40:12Z) - Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs [65.18780403244178]
We propose a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE).
Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures.
Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing.
arXiv Detail & Related papers (2022-02-17T02:17:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.