The Art of Deception: Color Visual Illusions and Diffusion Models
- URL: http://arxiv.org/abs/2412.10122v1
- Date: Fri, 13 Dec 2024 13:07:08 GMT
- Title: The Art of Deception: Color Visual Illusions and Diffusion Models
- Authors: Alex Gomez-Villa, Kai Wang, Alejandro C. Parraga, Bartlomiej Twardowski, Jesus Malo, Javier Vazquez-Corral, Joost van de Weijer
- Abstract summary: Recent studies have shown that artificial neural networks (ANNs) can also be deceived by visual illusions.
We show how visual illusions are encoded in diffusion models.
We also show how to generate new unseen visual illusions in realistic images using text-to-image diffusion models.
- Score: 55.830105086695
- Abstract: Visual illusions in humans arise when interpreting out-of-distribution stimuli: if the observer is adapted to certain statistics, perception of outliers deviates from reality. Recent studies have shown that artificial neural networks (ANNs) can also be deceived by visual illusions. This revelation raises profound questions about the nature of visual information. Why are two independent systems, both human brains and ANNs, susceptible to the same illusions? Should any ANN be capable of perceiving visual illusions? Are these perceptions a feature or a flaw? In this work, we study how visual illusions are encoded in diffusion models. Remarkably, we show that they present human-like brightness/color shifts in their latent space. We use this fact to demonstrate that diffusion models can predict visual illusions. Furthermore, we also show how to generate new unseen visual illusions in realistic images using text-to-image diffusion models. We validate this ability through psychophysical experiments that show how our model-generated illusions also fool humans.
Related papers
- IllusionBench: A Large-scale and Comprehensive Benchmark for Visual Illusion Understanding in Vision-Language Models [56.34742191010987]
Current Visual Language Models (VLMs) show impressive image understanding but struggle with visual illusions.
We introduce IllusionBench, a comprehensive visual illusion dataset that encompasses classic cognitive illusions and real-world scene illusions.
We design trap illusions that resemble classical patterns but differ in reality, highlighting issues in SOTA models.
arXiv Detail & Related papers (2025-01-01T14:10:25Z) - Evaluating Model Perception of Color Illusions in Photorealistic Scenes [16.421832484760987]
We study the perception of color illusions by vision-language models.
We propose an automated framework for generating color illusion images.
Experiments show that all studied VLMs exhibit perceptual biases similar to human vision.
arXiv Detail & Related papers (2024-12-09T03:49:10Z) - The Illusion-Illusion: Vision Language Models See Illusions Where There are None [0.0]
I show that many current vision language systems mistakenly see illusory-illusions as illusions.
I suggest that such failures are part of broader failures already discussed in the literature.
arXiv Detail & Related papers (2024-12-07T03:30:51Z) - Toward a Diffusion-Based Generalist for Dense Vision Tasks [141.03236279493686]
Recent works have shown that the image itself can be used as a natural interface for general-purpose visual perception.
We propose to perform diffusion in pixel space and provide a recipe for finetuning pre-trained text-to-image diffusion models for dense vision tasks.
In experiments, we evaluate our method on four different types of tasks and show competitive performance to the other vision generalists.
arXiv Detail & Related papers (2024-06-29T17:57:22Z) - Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans? [28.654771227396807]
Vision-Language Models (VLMs) are trained on vast amounts of data captured by humans emulating our understanding of the world.
Do VLMs have the same kind of illusions as humans do, or do they faithfully learn to represent reality?
We build a dataset containing five types of visual illusions and formulate four tasks to examine visual illusions in state-of-the-art VLMs.
arXiv Detail & Related papers (2023-10-31T18:01:11Z) - A domain adaptive deep learning solution for scanpath prediction of paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, which impacts several cognitive functions for humans.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z) - Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for their functional analogue in the brain, the ventral stream in visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z) - Evolutionary Generation of Visual Motion Illusions [0.0]
We present a generative model, the Evolutionary Illusion GENerator (EIGen), that creates new visual motion illusions.
The structure of EIGen supports the hypothesis that illusory motion might be the result of perceiving the brain's own predictions.
The scientific motivation of this paper is to demonstrate that the perception of illusory motion could be a side effect of the predictive abilities of the brain.
arXiv Detail & Related papers (2021-12-25T14:53:50Z) - Visual Chirality [51.685596116645776]
We investigate how statistics of visual data are changed by reflection.
Our work has implications for data augmentation, self-supervised learning, and image forensics.
arXiv Detail & Related papers (2020-06-16T20:48:23Z)