HyperDet: Generalizable Detection of Synthesized Images by Generating and Merging A Mixture of Hyper LoRAs
- URL: http://arxiv.org/abs/2410.06044v1
- Date: Tue, 8 Oct 2024 13:43:01 GMT
- Title: HyperDet: Generalizable Detection of Synthesized Images by Generating and Merging A Mixture of Hyper LoRAs
- Authors: Huangsen Cao, Yongwei Wang, Yinfeng Liu, Sixian Zheng, Kangtao Lv, Zhimeng Zhang, Bo Zhang, Xin Ding, Fei Wu
- Abstract summary: We introduce a novel and generalizable detection framework termed HyperDet.
In this work, we propose a novel objective function that effectively balances pixel-level and semantic artifacts.
Our work paves a new way to establish generalizable domain-specific fake image detectors based on pretrained large vision models.
- Score: 17.88153857572688
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The emergence of diverse generative vision models has recently enabled the synthesis of visually realistic images, underscoring the critical need to effectively distinguish these generated images from real photos. Despite advances in this field, existing detection approaches often struggle to accurately identify images synthesized by different generative models. In this work, we introduce a novel and generalizable detection framework termed HyperDet, which innovatively captures and integrates shared knowledge from a collection of functionally distinct and lightweight expert detectors. HyperDet leverages a large pretrained vision model to extract general detection features while simultaneously capturing and enhancing task-specific features. To achieve this, HyperDet first groups SRM filters into five distinct groups to efficiently capture varying levels of pixel artifacts based on their functionality and complexity. Then, HyperDet utilizes a hypernetwork to generate LoRA model weights with distinct embedding parameters. Finally, we merge the LoRA networks to form an efficient model ensemble. We also propose a novel objective function that effectively balances pixel-level and semantic artifacts. Extensive experiments on the UnivFD and Fake2M datasets demonstrate the effectiveness of our approach, achieving state-of-the-art performance. Moreover, our work paves a new way to establish generalizable domain-specific fake image detectors based on pretrained large vision models.
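As a reading aid, here is a minimal PyTorch sketch (not the authors' released code) of how the described pipeline could be wired together: a small hypernetwork maps a per-group embedding to LoRA weight deltas for a frozen pretrained backbone, the five SRM filter groups are stood in for by placeholder high-pass kernels, the per-group LoRA updates are merged by simple averaging, and the objective is an illustrative weighted sum of a semantic and a pixel-artifact classification term. All layer sizes, filter kernels, the merging rule, and the loss weighting are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_GROUPS = 5      # the abstract groups SRM filters into five groups
FEAT_DIM = 768      # placeholder width of the pretrained backbone's features
RANK = 4            # illustrative LoRA rank

class HyperLoRA(nn.Module):
    """Hypernetwork: learned group embedding -> low-rank (LoRA) weight delta."""
    def __init__(self, emb_dim=64, feat_dim=FEAT_DIM, rank=RANK):
        super().__init__()
        self.group_emb = nn.Embedding(NUM_GROUPS, emb_dim)
        self.to_A = nn.Linear(emb_dim, feat_dim * rank)
        self.to_B = nn.Linear(emb_dim, rank * feat_dim)
        self.feat_dim, self.rank = feat_dim, rank

    def forward(self, group_idx: torch.Tensor) -> torch.Tensor:
        e = self.group_emb(group_idx)                      # (emb_dim,)
        A = self.to_A(e).view(self.feat_dim, self.rank)    # down projection
        B = self.to_B(e).view(self.rank, self.feat_dim)    # up projection
        return A @ B                                       # (feat_dim, feat_dim) delta

def merged_delta(hyper: HyperLoRA) -> torch.Tensor:
    """Merge the per-group LoRA deltas into one update (plain averaging here)."""
    deltas = [hyper(torch.tensor(g)) for g in range(NUM_GROUPS)]
    return torch.stack(deltas).mean(dim=0)

class HyperDetSketch(nn.Module):
    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone   # frozen pretrained vision model, assumed to return (B, FEAT_DIM)
        self.hyper = HyperLoRA()
        # One 3x3 high-pass kernel per group stands in for the real SRM filter banks.
        hp = torch.tensor([[0., -1., 0.], [-1., 4., -1.], [0., -1., 0.]])
        self.register_buffer("srm", hp.repeat(NUM_GROUPS, 1, 1, 1))   # (5, 1, 3, 3)
        self.sem_head = nn.Linear(FEAT_DIM, 1)       # head on semantic features
        self.pix_head = nn.Linear(NUM_GROUPS, 1)     # head on pooled pixel residuals

    def forward(self, x: torch.Tensor):
        feats = self.backbone(x)                             # (B, FEAT_DIM)
        feats = feats + feats @ merged_delta(self.hyper)     # apply merged LoRA update
        residual = F.conv2d(x.mean(dim=1, keepdim=True), self.srm, padding=1)
        pooled = residual.abs().mean(dim=(2, 3))             # (B, NUM_GROUPS)
        return self.sem_head(feats), self.pix_head(pooled)

def combined_loss(sem_logits, pix_logits, labels, lam=0.5):
    """Illustrative stand-in for an objective balancing semantic and pixel artifacts."""
    bce = F.binary_cross_entropy_with_logits
    y = labels.float()
    return bce(sem_logits.squeeze(1), y) + lam * bce(pix_logits.squeeze(1), y)

# Smoke test with a dummy backbone; a real setup would use a frozen CLIP/ViT encoder.
model = HyperDetSketch(nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(3, FEAT_DIM)))
sem, pix = model(torch.rand(2, 3, 224, 224))
loss = combined_loss(sem, pix, torch.tensor([0, 1]))
```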
Related papers
- Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending [54.26862913139299]
We introduce a novel framework, Towards Effective user Attribution for latent diffusion models via Watermark-Informed Blending (TEAWIB).
TEAWIB incorporates a unique ready-to-use configuration approach that allows seamless integration of user-specific watermarks into generative models.
Experiments validate the effectiveness of TEAWIB, showcasing the state-of-the-art performance in perceptual quality and attribution accuracy.
arXiv Detail & Related papers (2024-09-17T07:52:09Z) - Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities [88.398085358514]
Contrastive Deepfake Embeddings (CoDE) is a novel embedding space specifically designed for deepfake detection.
CoDE is trained via contrastive learning by additionally enforcing global-local similarities.
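One plausible reading of "contrastive learning with additional global-local similarities" is an InfoNCE term over whole-image (global) embeddings plus a term pulling each image's patch (local) embeddings toward its own global embedding. The sketch below illustrates only that reading; the pairing scheme, temperature, and weighting are assumptions, not the CoDE authors' recipe.

```python
import torch
import torch.nn.functional as F

def global_local_contrastive(global_a, global_b, local_b, tau=0.07):
    """Illustrative loss: (i) InfoNCE between two global views of the same batch,
    (ii) a global-local term pulling each image's patch embeddings toward its
    own global embedding. Shapes: global_* (B, D), local_b (B, P, D)."""
    ga = F.normalize(global_a, dim=-1)
    gb = F.normalize(global_b, dim=-1)
    lb = F.normalize(local_b, dim=-1)

    # (i) global-global InfoNCE: matching batch indices are positives.
    logits = ga @ gb.t() / tau                        # (B, B)
    targets = torch.arange(ga.size(0), device=ga.device)
    loss_global = F.cross_entropy(logits, targets)

    # (ii) global-local similarity: patches of image i should align with g_i.
    sim = torch.einsum("bpd,bd->bp", lb, ga)          # (B, P) cosine similarities
    loss_local = (1.0 - sim).mean()

    return loss_global + loss_local

# Usage with random tensors standing in for encoder outputs:
B, P, D = 8, 16, 512
loss = global_local_contrastive(torch.randn(B, D), torch.randn(B, D), torch.randn(B, P, D))
```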
arXiv Detail & Related papers (2024-07-29T18:00:10Z) - Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images [13.089550724738436]
Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications spanning various fields.
Their ability to create hyper-realistic images poses significant challenges in distinguishing between real and synthetic content.
This work introduces a robust detection framework that integrates image and text features extracted by CLIP model with a Multilayer Perceptron (MLP) classifier.
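A hedged sketch of the fusion this summary describes: image and text features from a frozen CLIP model are concatenated and passed to a small MLP that outputs a real-vs-generated logit. To stay library-agnostic, the code takes precomputed CLIP features as input; the feature widths and hidden size are placeholders.

```python
import torch
import torch.nn as nn

class ClipFeatureMLP(nn.Module):
    """MLP head over concatenated CLIP image and text features (illustrative)."""
    def __init__(self, img_dim=768, txt_dim=768, hidden=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(img_dim + txt_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden, 1),          # logit: real (0) vs. generated (1)
        )

    def forward(self, img_feat, txt_feat):
        return self.mlp(torch.cat([img_feat, txt_feat], dim=-1))

# img_feat / txt_feat would come from a frozen CLIP encoder; random stand-ins here.
head = ClipFeatureMLP()
logit = head(torch.randn(4, 768), torch.randn(4, 768))
```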
arXiv Detail & Related papers (2024-04-19T14:30:41Z) - GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable.
Although a number of face forgery datasets are publicly available, the forged faces in them are mostly generated using GAN-based synthesis technology.
We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z) - A Dual Attentive Generative Adversarial Network for Remote Sensing Image Change Detection [6.906936669510404]
We propose a dual attentive generative adversarial network (DAGAN) for very high-resolution remote sensing image change detection.
The DAGAN framework outperforms advanced methods on the LEVIR dataset, achieving 85.01% mean IoU and a 91.48% mean F1 score.
arXiv Detail & Related papers (2023-10-03T08:26:27Z) - HyperINR: A Fast and Predictive Hypernetwork for Implicit Neural Representations via Knowledge Distillation [31.44962361819199]
Implicit Neural Representations (INRs) have recently exhibited immense potential in the field of scientific visualization.
In this paper, we introduce HyperINR, a novel hypernetwork architecture capable of directly predicting the weights for a compact INR.
By harnessing an ensemble of multiresolution hash encoding units in unison, the resulting INR attains state-of-the-art inference performance.
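The core idea of a hypernetwork that directly predicts an INR's weights can be illustrated roughly as below: a conditioning vector is mapped to the flattened weights of a tiny coordinate MLP, which is then evaluated functionally on query points. The multiresolution hash encodings and the knowledge-distillation training mentioned in the summary are omitted; all sizes and shapes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny coordinate MLP: 3 -> 32 -> 1 (e.g., a scalar field value at a 3D point).
LAYER_SHAPES = [(32, 3), (1, 32)]
NUM_PARAMS = sum(o * i + o for o, i in LAYER_SHAPES)

class WeightHypernet(nn.Module):
    """Maps a conditioning vector to the flattened weights of the coordinate MLP."""
    def __init__(self, cond_dim=8, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(cond_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, NUM_PARAMS))

    def forward(self, cond):
        return self.net(cond)

def eval_inr(flat_weights, coords):
    """Run the predicted INR on (N, 3) coordinates using functional linear layers."""
    x, offset = coords, 0
    for idx, (out_dim, in_dim) in enumerate(LAYER_SHAPES):
        w = flat_weights[offset:offset + out_dim * in_dim].view(out_dim, in_dim)
        offset += out_dim * in_dim
        b = flat_weights[offset:offset + out_dim]
        offset += out_dim
        x = F.linear(x, w, b)
        if idx < len(LAYER_SHAPES) - 1:
            x = torch.relu(x)
    return x

hyper = WeightHypernet()
weights = hyper(torch.randn(8))                    # one conditioning vector -> one INR
values = eval_inr(weights, torch.rand(1024, 3))    # (1024, 1) predicted field values
```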
arXiv Detail & Related papers (2023-04-09T08:10:10Z) - GH-Feat: Learning Versatile Generative Hierarchical Features from GANs [61.208757845344074]
We show that a generative feature learned from image synthesis exhibits great potential in solving a wide range of computer vision tasks.
We first train an encoder by considering the pretrained StyleGAN generator as a learned loss function.
The visual features produced by our encoder, termed Generative Hierarchical Features (GH-Feat), align closely with the layer-wise GAN representations.
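The recipe of using a frozen pretrained generator as a "learned loss function" for an encoder can be sketched roughly as follows; both networks here are toy placeholders (the paper uses StyleGAN and layer-wise style codes), so treat this only as an illustration of the training loop.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM = 512

# Placeholder frozen generator standing in for a pretrained StyleGAN (assumption).
generator = nn.Sequential(nn.Linear(LATENT_DIM, 3 * 32 * 32), nn.Tanh(),
                          nn.Unflatten(1, (3, 32, 32)))
for p in generator.parameters():
    p.requires_grad_(False)

# Encoder to be trained: image -> latent code.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, LATENT_DIM))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)

def train_step(images):
    """The frozen generator acts as the 'learned loss': the encoder is updated so
    that re-generating from its codes reproduces the input images."""
    codes = encoder(images)
    recon = generator(codes)
    loss = F.mse_loss(recon, images)   # a perceptual loss could be used instead
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

loss = train_step(torch.rand(4, 3, 32, 32) * 2 - 1)   # toy batch in [-1, 1]
```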
arXiv Detail & Related papers (2023-01-12T21:59:46Z) - Fusing Global and Local Features for Generalized AI-Synthesized Image Detection [31.35052580048599]
We design a two-branch model to combine global spatial information from the whole image and local informative features from patches selected by a novel patch selection module.
We collect a highly diverse dataset synthesized by 19 models with various objects and resolutions to evaluate our model.
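A hedged sketch of a two-branch design of the kind summarized above: one branch encodes the whole image for global spatial information, the other encodes a few selected patches, and the fused features feed a classifier. The patch selector below (top-k high-frequency energy) is a placeholder for the paper's patch selection module, and all sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchDetector(nn.Module):
    def __init__(self, feat_dim=128, patch=64, top_k=4):
        super().__init__()
        self.patch, self.top_k = patch, top_k
        def branch():   # tiny CNN encoder used for both branches (illustrative)
            return nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        self.global_branch = branch()
        self.local_branch = branch()
        self.classifier = nn.Linear(2 * feat_dim, 1)

    def select_patches(self, x):
        # Placeholder selector: pick the top-k patches with most high-frequency energy.
        p, B = self.patch, x.size(0)
        patches = x.unfold(2, p, p).unfold(3, p, p)               # (B, 3, H/p, W/p, p, p)
        patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, 3, p, p)
        energy = (patches - patches.mean(dim=(-1, -2), keepdim=True)).abs().mean(dim=(-1, -2, -3))
        idx = energy.topk(self.top_k, dim=1).indices              # (B, k)
        return torch.stack([patches[b, idx[b]] for b in range(B)])  # (B, k, 3, p, p)

    def forward(self, x):
        glob = self.global_branch(F.interpolate(x, size=224, mode="bilinear", align_corners=False))
        sel = self.select_patches(x)                              # (B, k, 3, p, p)
        B, k = sel.shape[:2]
        loc = self.local_branch(sel.flatten(0, 1)).view(B, k, -1).mean(dim=1)
        return self.classifier(torch.cat([glob, loc], dim=-1))

logit = TwoBranchDetector()(torch.rand(2, 3, 256, 256))
```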
arXiv Detail & Related papers (2022-03-26T01:55:37Z) - Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are often indistinguishable from real images to the human eye.
We propose a novel fake-image detection method that re-synthesizes test images and extracts visual cues for detection.
We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
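The re-synthesis idea can be pictured as follows: a fixed reconstruction network re-generates the input, and detection cues are extracted from the residual between the input and its re-synthesis. The autoencoder-style network and the residual statistics below are assumption-level placeholders, not the paper's pretrained re-synthesis models.

```python
import torch
import torch.nn as nn

# Placeholder re-synthesis network; the paper relies on pretrained reconstruction models.
resynth = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1))
for p in resynth.parameters():
    p.requires_grad_(False)

# Small classifier over residual statistics (illustrative visual-cue extractor).
classifier = nn.Sequential(nn.Linear(6, 32), nn.ReLU(), nn.Linear(32, 1))

def detect(images):
    """Score images as real vs. fake from re-synthesis residual cues."""
    with torch.no_grad():
        recon = resynth(images)
    residual = images - recon
    # Per-channel mean and std of the residual as simple hand-crafted cues.
    cues = torch.cat([residual.mean(dim=(2, 3)), residual.std(dim=(2, 3))], dim=1)
    return classifier(cues)           # (B, 1) logits

logits = detect(torch.rand(4, 3, 64, 64))
```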
arXiv Detail & Related papers (2021-05-29T21:22:24Z) - Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications.
The visual feature produced by our encoder, termed Generative Hierarchical Feature (GH-Feat), has strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z) - Generating Annotated High-Fidelity Images Containing Multiple Coherent Objects [10.783993190686132]
We propose a multi-object generation framework that can synthesize images with multiple objects without explicitly requiring contextual information.
We demonstrate how coherency and fidelity are preserved with our method through experiments on the Multi-MNIST and CLEVR datasets.
arXiv Detail & Related papers (2020-06-22T11:33:55Z)