Temporally coherent video anonymization through GAN inpainting
- URL: http://arxiv.org/abs/2106.02328v1
- Date: Fri, 4 Jun 2021 08:19:44 GMT
- Title: Temporally coherent video anonymization through GAN inpainting
- Authors: Thangapavithraa Balaji, Patrick Blies, Georg G\"ori, Raphael Mitsch,
Marcel Wasserer, Torsten Sch\"on
- Abstract summary: This work tackles the problem of temporally coherent face anonymization in natural video streams.
We propose JaGAN, a two-stage system starting with detecting and masking out faces with black image patches in all individual frames of the video.
Our initial experiments reveal that image based generative models are not capable of inpainting patches showing temporal coherent appearance across neighboring video frames.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work tackles the problem of temporally coherent face anonymization in
natural video streams.We propose JaGAN, a two-stage system starting with
detecting and masking out faces with black image patches in all individual
frames of the video. The second stage leverages a privacy-preserving Video
Generative Adversarial Network designed to inpaint the missing image patches
with artificially generated faces. Our initial experiments reveal that image
based generative models are not capable of inpainting patches showing temporal
coherent appearance across neighboring video frames. To address this issue we
introduce a newly curated video collection, which is made publicly available
for the research community along with this paper. We also introduce the
Identity Invariance Score IdI as a means to quantify temporal coherency between
neighboring frames.
Related papers
- DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with
Diffusion Auto-encoder [21.405442790474268]
We propose DiffDub: Diffusion-based dubbing.
We first craft the Diffusion auto-encoder by an inpainting incorporating a mask to delineate editable zones and unaltered regions.
To tackle these issues, we employ versatile strategies, including data augmentation and supplementary eye guidance.
arXiv Detail & Related papers (2023-11-03T09:41:51Z) - RIGID: Recurrent GAN Inversion and Editing of Real Face Videos [73.97520691413006]
GAN inversion is indispensable for applying the powerful editability of GAN to real images.
Existing methods invert video frames individually often leading to undesired inconsistent results over time.
We propose a unified recurrent framework, named textbfRecurrent vtextbfIdeo textbfGAN textbfInversion and etextbfDiting (RIGID)
Our framework learns the inherent coherence between input frames in an end-to-end manner.
arXiv Detail & Related papers (2023-08-11T12:17:24Z) - Siamese Masked Autoencoders [76.35448665609998]
We present Siamese Masked Autoencoders (SiamMAE) for learning visual correspondence from videos.
SiamMAE operates on pairs of randomly sampled video frames and asymmetrically masks them.
It outperforms state-of-the-art self-supervised methods on video object segmentation, pose keypoint propagation, and semantic part propagation tasks.
arXiv Detail & Related papers (2023-05-23T17:59:46Z) - Towards Smooth Video Composition [59.134911550142455]
Video generation requires consistent and persistent frames with dynamic content over time.
This work investigates modeling the temporal relations for composing video with arbitrary length, from a few frames to even infinite, using generative adversarial networks (GANs)
We show that the alias-free operation for single image generation, together with adequately pre-learned knowledge, brings a smooth frame transition without compromising the per-frame quality.
arXiv Detail & Related papers (2022-12-14T18:54:13Z) - Facial Expression Video Generation Based-On Spatio-temporal
Convolutional GAN: FEV-GAN [1.279257604152629]
We present a novel approach for generating videos of the six basic facial expressions.
Our approach is based on Spatio-temporal Conal GANs, that are known to model both content and motion in the same network.
The code and the pre-trained model will soon be made publicly available.
arXiv Detail & Related papers (2022-10-20T11:54:32Z) - Video2StyleGAN: Encoding Video in Latent Space for Manipulation [63.03250800510085]
We propose a novel network to encode face videos into the latent space of StyleGAN for semantic face video manipulation.
Our approach can significantly outperform existing single image methods, while achieving real-time (66 fps) speed.
arXiv Detail & Related papers (2022-06-27T06:48:15Z) - UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video
Editing [78.26925404508994]
We propose a unified temporally consistent facial video editing framework termed UniFaceGAN.
Our framework is designed to handle face swapping and face reenactment simultaneously.
Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.
arXiv Detail & Related papers (2021-08-12T10:35:22Z) - Learning Joint Spatial-Temporal Transformations for Video Inpainting [58.939131620135235]
We propose to learn a joint Spatial-Temporal Transformer Network (STTN) for video inpainting.
We simultaneously fill missing regions in all input frames by self-attention, and propose to optimize STTN by a spatial-temporal adversarial loss.
arXiv Detail & Related papers (2020-07-20T16:35:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.