Text-image guided Diffusion Model for generating Deepfake celebrity
interactions
- URL: http://arxiv.org/abs/2309.14751v1
- Date: Tue, 26 Sep 2023 08:24:37 GMT
- Title: Text-image guided Diffusion Model for generating Deepfake celebrity
interactions
- Authors: Yunzhuo Chen, Nur Al Hasan Haldar, Naveed Akhtar, Ajmal Mian
- Abstract summary: Diffusion models have recently demonstrated highly realistic visual content generation.
This paper devises and explores a novel method in that regard.
Our results show that with the devised scheme, it is possible to create fake visual content with alarming realism.
- Score: 50.37578424163951
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deepfake images are fast becoming a serious concern due to their realism.
Diffusion models have recently demonstrated highly realistic visual content
generation, which makes them an excellent potential tool for Deepfake
generation. To curb their exploitation for Deepfakes, it is imperative to first
explore the extent to which diffusion models can be used to generate realistic
content that is controllable with convenient prompts. This paper devises and
explores a novel method in that regard. Our technique alters the popular Stable
Diffusion model to generate controllable, high-quality Deepfake images from
text and image prompts. In addition, the original Stable Diffusion model performs
poorly when generating images that contain multiple persons. The modified
diffusion model addresses this problem by starting inference from the latent of
an input anchor image rather than from a random Gaussian latent.
Hence, we focus on generating forged content for celebrity interactions,
which may be used to spread rumors. We also apply Dreambooth to enhance the
realism of our fake images. Dreambooth trains the pairing of center words and
specific features to produce more refined and personalized output images. Our
results show that with the devised scheme, it is possible to create fake visual
content with alarming realism, such that the content can serve as believable
evidence of meetings between powerful political figures.
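The abstract states only that inference starts from the anchor image's latent instead of a random Gaussian latent. It does not say whether noise is added to that latent or at which step denoising begins. The sketch below shows one common way such an initialization is done in image-to-image diffusion, offered purely as an assumption: the anchor latent is noised to an intermediate step of a linear beta schedule. The names `linear_alpha_bar`, `init_latent`, and `strength` are hypothetical, and a plain list of floats stands in for a real latent tensor.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def init_latent(anchor_latent, alpha_bar, strength, rng):
    """Noise an anchor latent to an intermediate step (hypothetical helper).

    strength in (0, 1] is the fraction of the schedule that would be denoised:
    small values stay close to the anchor, 1.0 is nearly pure Gaussian noise.
    Returns (start_step, noised_latent).
    """
    num_steps = len(alpha_bar)
    start_step = min(num_steps - 1, max(0, int(strength * num_steps) - 1))
    a = alpha_bar[start_step]
    # Standard forward-diffusion sample: sqrt(a) * z0 + sqrt(1 - a) * eps
    noised = [
        math.sqrt(a) * z + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for z in anchor_latent
    ]
    return start_step, noised


if __name__ == "__main__":
    schedule = linear_alpha_bar(1000)
    anchor = [1.0] * 16  # stand-in for an encoded anchor image
    step, z_start = init_latent(anchor, schedule, strength=0.5, rng=random.Random(0))
    print(step, len(z_start))
```

Under these assumptions, a lower `strength` retains more of the anchor's layout, which is one plausible reason an anchor latent would help with multi-person composition. The paper itself should be consulted for the actual procedure.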
Related papers
- Improving face generation quality and prompt following with synthetic captions [57.47448046728439]
We introduce a training-free pipeline designed to generate accurate appearance descriptions from images of people.
We then use these synthetic captions to fine-tune a text-to-image diffusion model.
Our results demonstrate that this approach significantly improves the model's ability to generate high-quality, realistic human faces.
arXiv Detail & Related papers (2024-05-17T15:50:53Z)
- Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images [34.02058539403381]
We investigate whether human semantic knowledge can be incorporated into frameworks for fake image detection.
A preliminary statistical analysis is conducted to explore the distinctive patterns in how humans perceive genuine and altered images.
arXiv Detail & Related papers (2024-03-13T19:56:30Z)
- Improving Diffusion Models for Authentic Virtual Try-on in the Wild [53.96244595495942]
This paper considers image-based virtual try-on, which renders an image of a person wearing a curated garment.
We propose a novel diffusion model that improves garment fidelity and generates authentic virtual try-on images.
We present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
arXiv Detail & Related papers (2024-03-08T08:12:18Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z)
- SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models [56.88192537044364]
We propose a simple-yet-effective parameter-efficient fine-tuning approach called the Semantic Understanding and Reasoning adapter (SUR-adapter) for pre-trained diffusion models.
Our approach can make text-to-image diffusion models easier to use with better user experience.
arXiv Detail & Related papers (2023-05-09T05:48:38Z)
- SINE: SINgle Image Editing with Text-to-Image Diffusion Models [10.67527134198167]
This work aims to address the problem of single-image editing.
We propose a novel model-based guidance built upon the classifier-free guidance.
We show promising editing capabilities, including changing style, content addition, and object manipulation.
arXiv Detail & Related papers (2022-12-08T18:57:13Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- On the detection of synthetic images generated by diffusion models [18.12766911229293]
Methods based on diffusion models (DM) have been gaining the spotlight.
DM enables the creation of text-based visual content.
Malicious users can generate and distribute fake media perfectly adapted to their attacks.
arXiv Detail & Related papers (2022-11-01T18:10:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.