Related papers: Privacy Beyond Pixels: Latent Anonymization for Privacy-Preserving Video Understanding

URL: http://arxiv.org/abs/2511.08666v1
Date: Thu, 13 Nov 2025 01:01:45 GMT
Title: Privacy Beyond Pixels: Latent Anonymization for Privacy-Preserving Video Understanding
Authors: Joseph Fioresi, Ishan Rajendrakumar Dave, Mubarak Shah
Abstract summary: We introduce a novel formulation of visual privacy preservation for video foundation models that operates entirely in the latent space. Current privacy preservation methods rely on input-pixel-level anonymization, which requires retraining the entire utility video model. A lightweight Anonymizing Adapter Module (AAM) removes private information from video features while retaining general task utility.
Score: 56.369026347458835
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce a novel formulation of visual privacy preservation for video foundation models that operates entirely in the latent space. While spatio-temporal features learned by foundation models have deepened general understanding of video content, sharing or storing these extracted visual features for downstream tasks inadvertently reveals sensitive personal information like skin color, gender, or clothing. Current privacy preservation methods focus on input-pixel-level anonymization, which requires retraining the entire utility video model and results in task-specific anonymization, making them unsuitable for recent video foundational models. To address these challenges, we introduce a lightweight Anonymizing Adapter Module (AAM) that removes private information from video features while retaining general task utility. AAM can be applied in a plug-and-play fashion to frozen video encoders, minimizing the computational burden of finetuning and re-extracting features. Our framework employs three newly designed training objectives: (1) a clip-level self-supervised privacy objective to reduce mutual information between static clips, (2) a co-training objective to retain utility across seen tasks, and (3) a latent consistency loss for generalization on unseen tasks. Our extensive evaluations demonstrate a significant 35% reduction in privacy leakage while maintaining near-baseline utility performance across various downstream tasks: Action Recognition (Kinetics400, UCF101, HMDB51), Temporal Action Detection (THUMOS14), and Anomaly Detection (UCF-Crime). We also provide an analysis on anonymization for sensitive temporal attribute recognition. Additionally, we propose new protocols for assessing gender bias in action recognition models, showing that our method effectively mitigates such biases and promotes more equitable video understanding.
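As a rough illustration of the plug-and-play idea described in the abstract, the sketch below applies a small residual adapter to a feature vector that stands in for the output of a frozen video encoder, then computes a consistency term between the original and adapted features. Everything in it is an assumption made for illustration: the names `AnonymizingAdapter` and `latent_consistency`, the residual bottleneck design, the mean-squared form of the loss, and the random untrained weights are hypothetical and are not taken from the paper.

```python
import random


class AnonymizingAdapter:
    """Hypothetical residual bottleneck adapter: z' = z + W_up @ relu(W_down @ z)."""

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        # Small random weights; a real adapter would learn these during training.
        self.w_down = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                       for _ in range(bottleneck)]
        self.w_up = [[rng.uniform(-0.1, 0.1) for _ in range(bottleneck)]
                     for _ in range(dim)]

    def __call__(self, z):
        # Project down to the bottleneck and apply ReLU.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, z)))
                  for row in self.w_down]
        # Project back up and add the result to the input (residual connection).
        delta = [sum(w * h for w, h in zip(row, hidden)) for row in self.w_up]
        return [x + d for x, d in zip(z, delta)]


def latent_consistency(z, z_anon):
    """Mean squared difference between original and adapted features (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(z, z_anon)) / len(z)


# Stand-in for a feature vector produced by a frozen video encoder.
frozen_features = [0.5, -1.0, 2.0, 0.25]
adapter = AnonymizingAdapter(dim=4, bottleneck=2)
anonymized = adapter(frozen_features)
consistency = latent_consistency(frozen_features, anonymized)
```

In this sketch only the adapter holds parameters, so the encoder that produced `frozen_features` would never need to be retrained. A consistency term like this one would presumably be weighted against the privacy and utility objectives during training.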

Related papers

Evaluation of Vision-LLMs in Surveillance Video [8.750453732584491]
This paper investigates the spatial reasoning of vision-language models (VLMs). It addresses the embodied perception challenge of interpreting dynamic 3D scenes from sparse 2D video. We evaluate four open models on UCF-Crime and RWF-2000 under prompting and privacy-preserving conditions.
arXiv Detail & Related papers (2025-10-27T10:27:02Z)
Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking [90.81846867441993]
This paper presents the first investigation on preventing personal video data from unauthorized exploitation by deep trackers. We propose a novel generative framework for generating Temporal Unlearnable Examples (TUEs). Our approach achieves state-of-the-art performance in video data-privacy protection, with strong transferability across VOT models, datasets, and temporal matching tasks.
arXiv Detail & Related papers (2025-07-10T07:11:33Z)
PV-VTT: A Privacy-Centric Dataset for Mission-Specific Anomaly Detection and Natural Language Interpretation [5.0923114224599555]
We present PV-VTT (Privacy Violation Video To Text), a unique multimodal dataset aimed at identifying privacy violations. PV-VTT provides detailed annotations for both video and text in scenarios. This privacy-focused approach allows researchers to use the dataset while protecting participant confidentiality.
arXiv Detail & Related papers (2024-10-30T01:02:20Z)
Low-Latency Video Anonymization for Crowd Anomaly Detection: Privacy vs. Performance [5.78828936452823]
This study revisits conventional anonymization solutions for privacy protection and real-time video anomaly detection applications. We propose a novel lightweight adaptive anonymization for VAD (LA3D) that employs dynamic adjustment to enhance privacy protection. Our experiment demonstrates that LA3D enables substantial improvement in the privacy anonymization capability without majorly degrading VAD efficacy.
arXiv Detail & Related papers (2024-10-24T13:22:33Z)
Diff-Privacy: Diffusion-based Face Privacy Protection [58.1021066224765]
In this paper, we propose a novel face privacy protection method based on diffusion models, dubbed Diff-Privacy. Specifically, we train our proposed multi-scale image inversion module (MSI) to obtain a set of SDM format conditional embeddings of the original image. Based on the conditional embeddings, we design corresponding embedding scheduling strategies and construct different energy functions during the denoising process to achieve anonymization and visual identity information hiding.
arXiv Detail & Related papers (2023-09-11T09:26:07Z)
TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection [59.04634695294402]
Video anomaly detection (VAD) without human monitoring is a complex computer vision task. Privacy leakage in VAD allows models to pick up and amplify unnecessary biases related to people's personal information. We propose TeD-SPAD, a privacy-aware video anomaly detection framework that destroys visual private information in a self-supervised manner.
arXiv Detail & Related papers (2023-08-21T22:42:55Z)
Privacy-Preserving Action Recognition via Motion Difference Quantization [22.31448780032675]
This paper proposes a simple, yet robust privacy-preserving encoder called BDQ. It is composed of three modules: Blur, Difference, and Quantization. Experiments on three benchmark datasets show that the proposed encoder design can achieve state-of-the-art trade-off.
arXiv Detail & Related papers (2022-08-04T05:03:27Z)
OPOM: Customized Invisible Cloak towards Face Privacy Protection [58.07786010689529]
We investigate the face privacy protection from a technology standpoint based on a new type of customized cloak. We propose a new method, named one person one mask (OPOM), to generate person-specific (class-wise) universal masks. The effectiveness of the proposed method is evaluated on both common and celebrity datasets.
arXiv Detail & Related papers (2022-05-24T11:29:37Z)
SPAct: Self-supervised Privacy Preservation for Action Recognition [73.79886509500409]
Existing approaches for mitigating privacy leakage in action recognition require privacy labels along with the action labels from the video dataset. Recent developments of self-supervised learning (SSL) have unleashed the untapped potential of the unlabeled data. We present a novel training framework which removes privacy information from input video in a self-supervised manner without requiring privacy labels.
arXiv Detail & Related papers (2022-03-29T02:56:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.