Related papers: LDFaceNet: Latent Diffusion-based Network for High-Fidelity Deepfake Generation

URL: http://arxiv.org/abs/2408.02078v1
Date: Sun, 4 Aug 2024 16:09:04 GMT
Title: LDFaceNet: Latent Diffusion-based Network for High-Fidelity Deepfake Generation
Authors: Dwij Mehta, Aditya Mehta, Pratik Narang
Abstract summary: This paper proposes a novel facial swapping module, termed LDFaceNet (Latent Diffusion based Face Swapping Network). It is based on a guided latent diffusion model that utilizes facial segmentation and facial recognition modules for a conditioned denoising process. The results of this study demonstrate that the proposed method can generate extremely realistic and coherent images.
Score: 6.866014367868788
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Over the past decade, there has been tremendous progress in the domain of synthetic media generation. This is mainly due to the powerful methods based on generative adversarial networks (GANs). Very recently, diffusion probabilistic models, which are inspired by non-equilibrium thermodynamics, have taken the spotlight. In the realm of image generation, diffusion models (DMs) have exhibited remarkable proficiency in producing both realistic and heterogeneous imagery through their stochastic sampling procedure. This paper proposes a novel facial swapping module, termed as LDFaceNet (Latent Diffusion based Face Swapping Network), which is based on a guided latent diffusion model that utilizes facial segmentation and facial recognition modules for a conditioned denoising process. The model employs a unique loss function to offer directional guidance to the diffusion process. Notably, LDFaceNet can incorporate supplementary facial guidance for desired outcomes without any retraining. To the best of our knowledge, this represents the first application of the latent diffusion model in the face-swapping task without prior training. The results of this study demonstrate that the proposed method can generate extremely realistic and coherent images by leveraging the potential of the diffusion model for facial swapping, thereby yielding superior visual outcomes and greater diversity.

Related papers

Unleashing the Potential of the Semantic Latent Space in Diffusion Models for Image Dehazing [25.138589492384654]
We propose a Diffusion Latent Inspired network for Image Dehazing, dubbed DiffLI$^2$D. We first reveal that the semantic latent space of pre-trained diffusion models can represent the content and haze characteristics of images. We integrate the diffusion latent representations at different time-steps into a delicately designed dehazing network to provide instructions for image dehazing.
arXiv Detail & Related papers (2025-09-24T13:11:37Z)
Emergence and Evolution of Interpretable Concepts in Diffusion Models [24.5360032541275]
We use Sparse Autoencoders (SAEs) to probe the inner workings of a popular text-to-image diffusion model. We find that even before the first reverse diffusion step is completed, the final composition of the scene can be predicted surprisingly well. We show that the discovered concepts have a causal effect on the model output and can be leveraged to steer the generative process.
arXiv Detail & Related papers (2025-04-21T22:48:37Z)
INDIGO+: A Unified INN-Guided Probabilistic Diffusion Algorithm for Blind and Non-Blind Image Restoration [22.19661915697775]
We propose a novel INN-guided probabilistic diffusion algorithm for non-blind and blind image restoration. INDIGO and BlindINDIGO combine the merits of the perfect reconstruction property of invertible neural networks (INN) with the strong generative capabilities of pre-trained diffusion models.
arXiv Detail & Related papers (2025-01-23T18:51:52Z)
OSDFace: One-Step Diffusion Model for Face Restoration [72.5045389847792]
Diffusion models have demonstrated impressive performance in face restoration. We propose OSDFace, a novel one-step diffusion model for face restoration. Results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics.
arXiv Detail & Related papers (2024-11-26T07:07:48Z)
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI). In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion). Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z)
JoReS-Diff: Joint Retinex and Semantic Priors in Diffusion Model for Low-light Image Enhancement [69.6035373784027]
Low-light image enhancement (LLIE) has achieved promising performance by employing conditional diffusion models. Previous methods may neglect the importance of a sufficient formulation of task-specific condition strategy. We propose JoReS-Diff, a novel approach that incorporates Retinex- and semantic-based priors as the additional pre-processing condition.
arXiv Detail & Related papers (2023-12-20T08:05:57Z)
Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space. Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings. The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z)
FitDiff: Robust monocular 3D facial shape and reflectance estimation using Diffusion Models [79.65289816077629]
We present FitDiff, a diffusion-based 3D facial avatar generative model. Our model accurately generates relightable facial avatars, utilizing an identity embedding extracted from an "in-the-wild" 2D facial image. Being the first 3D LDM conditioned on face recognition embeddings, FitDiff reconstructs relightable human avatars that can be used as-is in common rendering engines.
arXiv Detail & Related papers (2023-12-07T17:35:49Z)
Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning [42.009856923352864]
Diffusion models have been adopted for behavioral cloning in a sequence modeling fashion. We propose Crossway Diffusion, a simple yet effective method to enhance diffusion-based visuomotor policy learning. Our experiments demonstrate the effectiveness of Crossway Diffusion in various simulated and real-world robot tasks.
arXiv Detail & Related papers (2023-07-04T17:59:29Z)
SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture internal distribution of patches from a single natural image. It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales. Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.