Diffusion Illusions: Hiding Images in Plain Sight
- URL: http://arxiv.org/abs/2312.03817v1
- Date: Wed, 6 Dec 2023 18:59:18 GMT
- Title: Diffusion Illusions: Hiding Images in Plain Sight
- Authors: Ryan Burgert, Xiang Li, Abe Leite, Kanchana Ranasinghe, Michael S. Ryoo
- Abstract summary: Diffusion Illusions is the first comprehensive pipeline designed to automatically generate a wide range of illusions.
We study three types of illusions, in each of which the prime images are arranged in a different way.
We conduct comprehensive experiments on these illusions and verify the effectiveness of our proposed method.
- Score: 37.87050866208039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We explore the problem of computationally generating special `prime' images
that produce optical illusions when physically arranged and viewed in a certain
way. First, we propose a formal definition for this problem. Next, we introduce
Diffusion Illusions, the first comprehensive pipeline designed to automatically
generate a wide range of these illusions. Specifically, we both adapt the
existing `score distillation loss' and propose a new `dream target loss' to
optimize a group of differentially parametrized prime images, using a frozen
text-to-image diffusion model. We study three types of illusions, in each of
which the prime images are arranged in a different way and optimized using the
aforementioned losses so that images derived from them align with user-chosen
text prompts or images. We conduct comprehensive experiments on these illusions
and verify the effectiveness of our proposed method qualitatively and
quantitatively. Additionally, we showcase the successful physical fabrication
of our illusions -- as they are all designed to work in the real world. Our
code and examples are publicly available at our interactive project website:
https://diffusionillusions.com
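The optimization the abstract describes can be made concrete. Below is a minimal PyTorch sketch of one score-distillation step on the prime images, in the spirit of DreamFusion-style SDS; `arrange`, `diffusion.add_noise`, and `diffusion.pred_noise` are hypothetical placeholders for the arrangement function and the frozen text-to-image model, and the paper's adapted loss and `dream target loss' are not reproduced here.

```python
import torch

def sds_step(prime_images, arrange, diffusion, text_emb):
    """One score-distillation step on an image derived from the prime images.

    `arrange` composes the differentiably parametrized prime images into the
    derived image that should match the prompt; `diffusion` is a frozen
    text-to-image model exposing `add_noise` and `pred_noise` (hypothetical
    names standing in for any diffusion backbone).
    """
    x = arrange(prime_images)                 # derived image, requires grad
    t = torch.randint(20, 980, (x.shape[0],), device=x.device)  # random timestep
    eps = torch.randn_like(x)                 # fresh Gaussian noise
    x_t = diffusion.add_noise(x, eps, t)      # forward-noise the derived image
    with torch.no_grad():                     # the diffusion model stays frozen
        eps_hat = diffusion.pred_noise(x_t, t, text_emb)
    # Surrogate objective whose gradient w.r.t. x is (eps_hat - eps),
    # the standard SDS update direction.
    return ((eps_hat - eps).detach() * x).sum()
```

In this view, each of the three illusion types changes only `arrange` (how the primes are overlaid or transformed), while the same losses pull every derived image toward its prompt.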
Related papers
- FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior [50.0535198082903]
We offer a novel approach to image composition, which integrates multiple input images into a single, coherent image.
We showcase the potential of utilizing the powerful generative prior inherent in large-scale pre-trained diffusion models to accomplish generic image composition.
arXiv Detail & Related papers (2024-07-06T03:35:43Z) - Toward a Diffusion-Based Generalist for Dense Vision Tasks [141.03236279493686]
Recent works have shown that the image itself can be used as a natural interface for general-purpose visual perception.
We propose to perform diffusion in pixel space and provide a recipe for finetuning pre-trained text-to-image diffusion models for dense vision tasks.
In experiments, we evaluate our method on four different types of tasks and show performance competitive with other vision generalists.
arXiv Detail & Related papers (2024-06-29T17:57:22Z) - BRI3L: A Brightness Illusion Image Dataset for Identification and
Localization of Regions of Illusory Perception [4.685953126232505]
We develop a dataset of visual illusions and benchmark data-driven approaches for illusion classification and localization.
We consider five types of brightness illusions: 1) Hermann grid, 2) Simultaneous Contrast, 3) White illusion, 4) Grid illusion, and 5) Induced Grating illusion.
The deep learning models are also shown to generalize to unseen brightness illusions, such as brightness assimilation to contrast transitions.
arXiv Detail & Related papers (2024-02-07T02:57:40Z) - Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis [60.260724486834164]
This paper introduces innovative solutions to enhance spatial controllability in diffusion models reliant on text queries.
We present two key innovations: Vision Guidance and the Layered Rendering Diffusion framework.
We apply our method to three practical applications: bounding box-to-image, semantic mask-to-image and image editing.
arXiv Detail & Related papers (2023-11-30T10:36:19Z) - Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models [15.977340635967018]
Multi-view optical illusions are images that change appearance upon a transformation, such as a flip or rotation.
We propose a zero-shot method for obtaining these illusions from off-the-shelf text-to-image diffusion models (a minimal sketch follows this entry).
We provide both qualitative and quantitative results demonstrating the effectiveness and flexibility of our method.
arXiv Detail & Related papers (2023-11-29T18:59:59Z) - Diffusion Posterior Illumination for Ambiguity-aware Inverse Rendering [63.24476194987721]
- Diffusion Posterior Illumination for Ambiguity-aware Inverse Rendering [63.24476194987721]
Inverse rendering, the process of inferring scene properties from images, is a challenging inverse problem.
Most existing solutions incorporate priors into the inverse-rendering pipeline to encourage plausible solutions.
We propose a novel scheme that integrates a denoising probabilistic diffusion model pre-trained on natural illumination maps into an optimization framework.
arXiv Detail & Related papers (2023-09-30T12:39:28Z) - Photo2Relief: Let Human in the Photograph Stand Out [26.102307166656157]
We introduce a sigmoid variant function to manipulate gradients tactfully and train our neural networks with a loss function defined in the gradient domain (a generic sketch follows this entry).
To make a clear division of labor in network modules, a two-scale architecture is proposed to create high-quality relief from a single photograph.
arXiv Detail & Related papers (2023-07-21T05:33:57Z) - Learned Spatial Representations for Few-shot Talking-Head Synthesis [68.3787368024951]
- Learned Spatial Representations for Few-shot Talking-Head Synthesis [68.3787368024951]
We propose a novel approach for few-shot talking-head synthesis.
We show that the learned disentangled spatial representation leads to a significant improvement over previous methods.
arXiv Detail & Related papers (2021-04-29T17:59:42Z)