TraceHiding: Scalable Machine Unlearning for Mobility Data
- URL: http://arxiv.org/abs/2509.17241v1
- Date: Sun, 21 Sep 2025 21:29:24 GMT
- Title: TraceHiding: Scalable Machine Unlearning for Mobility Data
- Authors: Ali Faraji, Manos Papagelis,
- Abstract summary: TraceHiding is a scalable, importance-aware machine unlearning framework for mobility trajectory data.<n>It removes specified user trajectories from trained deep models without full retraining.<n>It achieves superior unlearning accuracy, competitive membership inference attack (MIA) resilience, and up to 40times speedup over retraining.
- Score: 1.6042394978941517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work introduces TraceHiding, a scalable, importance-aware machine unlearning framework for mobility trajectory data. Motivated by privacy regulations such as GDPR and CCPA granting users "the right to be forgotten," TraceHiding removes specified user trajectories from trained deep models without full retraining. It combines a hierarchical data-driven importance scoring scheme with teacher-student distillation. Importance scores--computed at token, trajectory, and user levels from statistical properties (coverage diversity, entropy, length)--quantify each training sample's impact, enabling targeted forgetting of high-impact data while preserving common patterns. The student model retains knowledge on remaining data and unlearns targeted trajectories through an importance-weighted loss that amplifies forgetting signals for unique samples and attenuates them for frequent ones. We validate on Trajectory--User Linking (TUL) tasks across three real-world higher-order mobility datasets (HO-Rome, HO-Geolife, HO-NYC) and multiple architectures (GRU, LSTM, BERT, ModernBERT, GCN-TULHOR), against strong unlearning baselines including SCRUB, NegGrad, NegGrad+, Bad-T, and Finetuning. Experiments under uniform and targeted user deletion show TraceHiding, especially its entropy-based variant, achieves superior unlearning accuracy, competitive membership inference attack (MIA) resilience, and up to 40\times speedup over retraining with minimal test accuracy loss. Results highlight robustness to adversarial deletion of high-information users and consistent performance across models. To our knowledge, this is the first systematic study of machine unlearning for trajectory data, providing a reproducible pipeline with public code and preprocessing tools.
Related papers
- REMIND: Input Loss Landscapes Reveal Residual Memorization in Post-Unlearning LLMs [0.1784233255402269]
Machine unlearning aims to remove the influence of specific training data from a model without requiring full retraining.<n>We propose REMIND, a novel evaluation method aiming to detect the subtle remaining influence of unlearned data.<n>We show that unlearned data yield flatter, less steep loss landscapes, while retained or unrelated data exhibit sharper, more volatile patterns.
arXiv Detail & Related papers (2025-11-06T09:58:19Z) - ReTrack: Data Unlearning in Diffusion Models through Redirecting the Denoising Trajectory [17.016094185289372]
We propose ReTrack, a fast and effective data unlearning method for diffusion models.<n>ReTrack employs importance sampling to construct a more efficient fine-tuning loss.<n>Experiments on MNIST T-Shirt, CelebA-HQ, CIFAR-10, and Stable Diffusion show that ReTrack achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-09-16T12:20:15Z) - AdvKT: An Adversarial Multi-Step Training Framework for Knowledge Tracing [64.79967583649407]
Knowledge Tracing (KT) monitors students' knowledge states and simulates their responses to question sequences.<n>Existing KT models typically follow a single-step training paradigm, which leads to significant error accumulation.<n>We propose a novel Adversarial Multi-Step Training Framework for Knowledge Tracing (AdvKT) which focuses on the multi-step KT task.
arXiv Detail & Related papers (2025-04-07T03:31:57Z) - Erase then Rectify: A Training-Free Parameter Editing Approach for Cost-Effective Graph Unlearning [17.85404473268992]
Graph unlearning aims to eliminate the influence of nodes, edges, or attributes from a trained Graph Neural Network (GNN)<n>Existing graph unlearning techniques often necessitate additional training on the remaining data, leading to significant computational costs.<n>We propose a two-stage training-free approach, Erase then Rectify (ETR), designed for efficient and scalable graph unlearning.
arXiv Detail & Related papers (2024-09-25T07:20:59Z) - Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs [25.91643745340183]
Large Language Models (LLMs) have demonstrated strong reasoning and memorization capabilities via pretraining on massive textual corpora.<n>This poses risk of privacy and copyright violations, highlighting the need for efficient machine unlearning methods.<n>We propose Low-rank Knowledge Unlearning (LoKU), a novel framework that enables robust and efficient unlearning for LLMs.
arXiv Detail & Related papers (2024-08-13T04:18:32Z) - Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z) - Partially Blinded Unlearning: Class Unlearning for Deep Networks a Bayesian Perspective [4.31734012105466]
Machine Unlearning is the process of selectively discarding information designated to specific sets or classes of data from a pre-trained model.
We propose a methodology tailored for the purposeful elimination of information linked to a specific class of data from a pre-trained classification network.
Our novel approach, termed textbfPartially-Blinded Unlearning (PBU), surpasses existing state-of-the-art class unlearning methods, demonstrating superior effectiveness.
arXiv Detail & Related papers (2024-03-24T17:33:22Z) - GraphGuard: Detecting and Counteracting Training Data Misuse in Graph
Neural Networks [69.97213941893351]
The emergence of Graph Neural Networks (GNNs) in graph data analysis has raised critical concerns about data misuse during model training.
Existing methodologies address either data misuse detection or mitigation, and are primarily designed for local GNN models.
This paper introduces a pioneering approach called GraphGuard, to tackle these challenges.
arXiv Detail & Related papers (2023-12-13T02:59:37Z) - Fast Machine Unlearning Without Retraining Through Selective Synaptic
Dampening [51.34904967046097]
Selective Synaptic Dampening (SSD) is a fast, performant, and does not require long-term storage of the training data.
We present a novel two-step, post hoc, retrain-free approach to machine unlearning which is fast, performant, and does not require long-term storage of the training data.
arXiv Detail & Related papers (2023-08-15T11:30:45Z) - Unsupervised Noisy Tracklet Person Re-identification [100.85530419892333]
We present a novel selective tracklet learning (STL) approach that can train discriminative person re-id models from unlabelled tracklet data.
This avoids the tedious and costly process of exhaustively labelling person image/tracklet true matching pairs across camera views.
Our method is particularly more robust against arbitrary noisy data of raw tracklets therefore scalable to learning discriminative models from unconstrained tracking data.
arXiv Detail & Related papers (2021-01-16T07:31:00Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.