Fréchet Denoised Distance: Enhancing Plausibility Evaluation for Generated Designs with Denoising Autoencoder
- URL: http://arxiv.org/abs/2403.05352v3
- Date: Sat, 02 Nov 2024 16:39:35 GMT
- Title: Fréchet Denoised Distance: Enhancing Plausibility Evaluation for Generated Designs with Denoising Autoencoder
- Authors: Jiajie Fan, Amal Trigui, Thomas Bäck, Hao Wang
- Abstract summary: We propose to encode to-be-evaluated images with a Denoising Autoencoder (DAE) and measure the distribution distance in the resulting latent space.
Hereby, we design a novel metric, Fréchet Denoised Distance (FDD).
Our FDD can effectively detect implausible structures and is more consistent with structural inspections by human experts.
- Score: 4.619979201312323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A great interest has arisen in using Deep Generative Models (DGM) for generative design. When assessing the quality of the generated designs, human designers focus more on structural plausibility, e.g., no missing component, rather than visual artifacts, e.g., noises or blurriness. Meanwhile, commonly used metrics such as Fréchet Inception Distance (FID) may not evaluate accurately because they are sensitive to visual artifacts and tolerant to semantic errors. As such, FID might not be suitable to assess the performance of DGMs for a generative design task. In this work, we propose to encode the to-be-evaluated images with a Denoising Autoencoder (DAE) and measure the distribution distance in the resulting latent space. Hereby, we design a novel metric Fréchet Denoised Distance (FDD). We experimentally test our FDD, FID and other state-of-the-art metrics on multiple datasets, e.g., BIKED, Seeing3DChairs, FFHQ and ImageNet. Our FDD can effectively detect implausible structures and is more consistent with structural inspections by human experts. Our source code is publicly available at https://github.com/jiajie96/FDD_pytorch.
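The abstract describes measuring a distribution distance in a latent space. A minimal sketch of that step follows, assuming the standard Fréchet distance between two Gaussians fitted to feature vectors, as used by FID. It is not taken from the authors' repository: the function name `frechet_distance` is hypothetical, and the random arrays merely stand in for latent codes that a trained DAE encoder would produce.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two (n, d) feature arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))
    # tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b; clip tiny negatives caused by round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


# Placeholder features (assumption): in practice these would be the latent
# codes of real and generated images from the DAE encoder.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 8))
shifted_feats = real_feats + 3.0  # same covariance, mean shifted by 3 per dim

same = frechet_distance(real_feats, real_feats)        # close to 0
shifted = frechet_distance(real_feats, shifted_feats)  # close to 8 * 3**2 = 72
print(same, shifted)
```

Only the feature extractor distinguishes metrics of this family: FID takes Inception-v3 features, whereas the abstract above proposes DAE latent codes.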
Related papers
- A Tilted Seesaw: Revisiting Autoencoder Trade-off for Controllable Diffusion [12.638580946105643]
In latent diffusion models, the autoencoder is typically expected to balance two capabilities: faithful reconstruction and a generation-friendly latent space. In recent ImageNet-scale AE studies, we observe a systematic bias toward generative metrics in handling this trade-off. We analyze why this gFID-dominant preference can appear unproblematic for ImageNet generation, yet becomes risky when scaling to controllable diffusion.
arXiv Detail & Related papers (2026-01-29T12:32:47Z) - Semantic Visual Anomaly Detection and Reasoning in AI-Generated Images [96.43608872116347]
AnomReason is a large-scale benchmark with structured annotations as quadruples. AnomReason and AnomAgent serve as a foundation for measuring and improving the semantic plausibility of AI-generated images.
arXiv Detail & Related papers (2025-10-11T14:09:24Z) - Zero-Shot Detection of AI-Generated Images [54.01282123570917]
We propose a zero-shot entropy-based detector (ZED) to detect AI-generated images.
Inspired by recent works on machine-generated text detection, our idea is to measure how surprising the image under analysis is compared to a model of real images.
ZED achieves an average improvement of more than 3% over the SoTA in terms of accuracy.
arXiv Detail & Related papers (2024-09-24T08:46:13Z) - Rethinking FID: Towards a Better Evaluation Metric for Image Generation [43.66036053597747]
Fréchet Inception Distance (FID) estimates the distance between the distribution of Inception-v3 features of real images and that of images generated by the algorithm.
We highlight important drawbacks of FID: Inception's poor representation of the rich and varied content generated by modern text-to-image models, incorrect normality assumptions, and poor sample complexity.
We propose an alternative new metric, CMMD, based on richer CLIP embeddings and the maximum mean discrepancy distance with the Gaussian RBF kernel.
arXiv Detail & Related papers (2023-11-30T19:11:01Z) - Latent Space is Feature Space: Regularization Term for GANs Training on Limited Dataset [1.8634083978855898]
I proposed an additional structure and loss function for GANs called LFM, trained to maximize the feature diversity between the different dimensions of the latent space.
In experiments, this system was built upon DCGAN and shown to improve the Fréchet Inception Distance (FID) when training from scratch on the CelebA dataset.
arXiv Detail & Related papers (2022-10-28T16:34:48Z) - GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation [70.75100533512021]
In this paper, we formulate the label uncertainty problem as the diversity of potentially plausible bounding boxes of objects.
We propose GLENet, a generative framework adapted from conditional variational autoencoders, to model the one-to-many relationship between a typical 3D object and its potential ground-truth bounding boxes with latent variables.
The label uncertainty generated by GLENet is a plug-and-play module and can be conveniently integrated into existing deep 3D detectors.
arXiv Detail & Related papers (2022-07-06T06:26:17Z) - On the Robustness of Quality Measures for GANs [136.18799984346248]
This work evaluates the robustness of quality measures of generative models such as Inception Score (IS) and Fréchet Inception Distance (FID).
We show that such metrics can also be manipulated by additive pixel perturbations.
arXiv Detail & Related papers (2022-01-31T06:43:09Z) - Compound Frechet Inception Distance for Quality Assessment of GAN Created Images [7.628527132779575]
One notable application of GANs is developing fake human faces, also known as "deep fakes".
Measuring the quality of the generated images is inherently subjective, but attempts have been made to objectify quality using standardized metrics.
We propose to improve the robustness of the evaluation process by integrating lower-level features to cover a wider array of visual defects.
arXiv Detail & Related papers (2021-06-16T06:53:27Z) - Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real to human eyes.
We propose a novel fake detection that is designed to re-synthesize testing images and extract visual cues for detection.
We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z) - Deep Continuous Fusion for Multi-Sensor 3D Object Detection [103.5060007382646]
We propose a novel 3D object detector that can exploit both LIDAR as well as cameras to perform very accurate localization.
We design an end-to-end learnable architecture that exploits continuous convolutions to fuse image and LIDAR feature maps at different levels of resolution.
arXiv Detail & Related papers (2020-12-20T18:43:41Z) - NADS: Neural Architecture Distribution Search for Uncertainty Awareness [79.18710225716791]
Machine learning (ML) systems often encounter Out-of-Distribution (OoD) errors when dealing with testing data coming from a distribution different from training data.
Existing OoD detection approaches are prone to errors and even sometimes assign higher likelihoods to OoD samples.
We propose Neural Architecture Distribution Search (NADS) to identify common building blocks among all uncertainty-aware architectures.
arXiv Detail & Related papers (2020-06-11T17:39:07Z) - Reliable Fidelity and Diversity Metrics for Generative Models [30.941563781926202]
The most widely used metric for measuring the similarity between real and generated images has been the Fréchet Inception Distance (FID) score.
We show that even the latest versions of the precision and recall metrics are not reliable yet.
We propose density and coverage metrics that solve the above issues.
arXiv Detail & Related papers (2020-02-23T00:50:01Z)