Related papers: BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception

URL: http://arxiv.org/abs/2402.04541v1
Date: Wed, 7 Feb 2024 02:57:40 GMT
Title: BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception
Authors: Aniket Roy, Anirban Roy, Soma Mitra, Kuntal Ghosh
Abstract summary: We develop a dataset of visual illusions and benchmark it using a data-driven approach to illusion classification and localization. We consider five types of brightness illusions: 1) Hermann grid, 2) Simultaneous Brightness Contrast, 3) White illusion, 4) Grid illusion, and 5) Induced Grating illusion. The deep learning model is also shown to generalize to unseen brightness illusions, such as brightness assimilation to contrast transitions.
Score: 4.685953126232505
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Visual illusions play a significant role in understanding visual perception. Current methods for understanding and evaluating visual illusions are mostly deterministic, filtering-based approaches that are evaluated on only a handful of visual illusions, so their conclusions are not generic. To this end, we generate a large-scale dataset of 22,366 images (BRI3L: BRightness Illusion Image dataset for Identification and Localization of illusory perception) covering five types of brightness illusions and benchmark the dataset using data-driven, neural-network-based approaches. The dataset contains two kinds of label information: (1) whether a particular image is illusory or non-illusory, and (2) the segmentation mask of the illusory region of the image. Hence, both classification and segmentation tasks can be evaluated using this dataset. We follow standard psychophysical experiments involving human subjects to validate the dataset. To the best of our knowledge, this is the first attempt to develop a dataset of visual illusions and benchmark it using a data-driven approach to illusion classification and localization. We consider five well-studied types of brightness illusions: 1) Hermann grid, 2) Simultaneous Brightness Contrast, 3) White illusion, 4) Grid illusion, and 5) Induced Grating illusion. Benchmarking on the dataset achieves 99.56% accuracy in illusion identification and 84.37% pixel accuracy in illusion localization. The deep learning model is also shown to generalize to unseen brightness illusions, such as brightness assimilation to contrast transitions. We also test the ability of state-of-the-art diffusion models to generate brightness illusions. All code, the dataset, and instructions are provided in the GitHub repository: https://github.com/aniket004/BRI3L
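The two figures quoted in the abstract (identification accuracy and pixel accuracy) can be illustrated with a minimal sketch. It assumes the usual definitions: identification accuracy is the fraction of images whose illusory/non-illusory label is predicted correctly, and pixel accuracy is the fraction of pixels whose predicted mask value matches the ground-truth segmentation mask. The function names and the toy data are hypothetical and are not taken from the BRI3L repository; the paper's exact evaluation protocol may differ.

```python
from typing import Sequence

# Assumption: masks are binary grids, 1 = illusory region, 0 = background.
Mask = Sequence[Sequence[int]]


def classification_accuracy(predicted: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of images whose illusory (1) / non-illusory (0) label is correct."""
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(int(p == t) for p, t in zip(predicted, target))
    return correct / len(target)


def pixel_accuracy(predicted: Mask, target: Mask) -> float:
    """Fraction of pixels whose predicted mask value equals the ground truth."""
    if len(predicted) != len(target):
        raise ValueError("masks must have the same number of rows")
    total = 0
    correct = 0
    for pred_row, true_row in zip(predicted, target):
        if len(pred_row) != len(true_row):
            raise ValueError("mask rows must have the same length")
        total += len(true_row)
        correct += sum(int(p == t) for p, t in zip(pred_row, true_row))
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return correct / total


if __name__ == "__main__":
    # Toy labels for four images: three of the four predictions are correct.
    print(classification_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75

    # Toy 2x2 masks: one of the four pixels is mislabelled.
    predicted_mask = [[1, 0], [0, 0]]
    true_mask = [[1, 1], [0, 0]]
    print(pixel_accuracy(predicted_mask, true_mask))  # 0.75
```

Pixel accuracy counts background pixels as well as illusory ones, so a mask dominated by background can score high even when the illusory region is poorly localized; overlap measures such as intersection-over-union are a common complement.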

Related papers

Do Large Vision-Language Models Distinguish between the Actual and Apparent Features of Illusions? [12.157632635072435]
Humans are susceptible to optical illusions, which serve as valuable tools for investigating sensory and cognitive processes. Research has begun exploring whether machines, such as large vision language models (LVLMs), exhibit similar susceptibilities to visual illusions.
arXiv Detail & Related papers (2025-06-06T05:47:50Z)
Do you see what I see? An Ambiguous Optical Illusion Dataset exposing limitations of Explainable AI [4.58733012283457]
We introduce a novel dataset of optical illusions featuring intermingled animal pairs designed to evoke perceptual ambiguity. We identify generalizable visual concepts, particularly gaze direction and eye cues, as subtle yet impactful features that significantly influence model accuracy. Our findings underscore the importance of concepts in visual learning and provide a foundation for studying bias and alignment between human and machine vision.
arXiv Detail & Related papers (2025-05-27T12:22:59Z)
IllusionBench: A Large-scale and Comprehensive Benchmark for Visual Illusion Understanding in Vision-Language Models [56.34742191010987]
Current Visual Language Models (VLMs) show impressive image understanding but struggle with visual illusions. We introduce IllusionBench, a comprehensive visual illusion dataset that encompasses classic cognitive illusions and real-world scene illusions. We design trap illusions that resemble classical patterns but differ in reality, highlighting issues in SOTA models.
arXiv Detail & Related papers (2025-01-01T14:10:25Z)
The Art of Deception: Color Visual Illusions and Diffusion Models [55.830105086695]
Recent studies have shown that artificial neural networks (ANNs) can also be deceived by visual illusions. We show how visual illusions are encoded in diffusion models. We also show how to generate new unseen visual illusions in realistic images using text-to-image diffusion models.
arXiv Detail & Related papers (2024-12-13T13:07:08Z)
Diffusion Illusions: Hiding Images in Plain Sight [37.87050866208039]
Diffusion Illusions is the first comprehensive pipeline designed to automatically generate a wide range of illusions. We study three types of illusions, each arranging the prime images in a different way. We conduct comprehensive experiments on these illusions and verify the effectiveness of our proposed method.
arXiv Detail & Related papers (2023-12-06T18:59:18Z)
Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans? [28.654771227396807]
Vision-Language Models (VLMs) are trained on vast amounts of data captured by humans emulating our understanding of the world. Do VLMs experience the same kinds of illusions as humans do, or do they faithfully learn to represent reality? We build a dataset containing five types of visual illusions and formulate four tasks to examine visual illusions in state-of-the-art VLMs.
arXiv Detail & Related papers (2023-10-31T18:01:11Z)
SIDAR: Synthetic Image Dataset for Alignment & Restoration [2.9649783577150837]
There is a lack of datasets that provide enough data to train and evaluate end-to-end deep learning models. Our proposed data augmentation helps to overcome the issue of data scarcity by using 3D rendering. The resulting dataset can serve as a training and evaluation set for a multitude of tasks involving image alignment and artifact removal.
arXiv Detail & Related papers (2023-05-19T23:32:06Z)
Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner. Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z)
FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation [54.666329929930455]
We present FFB6D, a Bidirectional fusion network designed for 6D pose estimation from a single RGBD image. We learn to combine appearance and geometry information for representation learning as well as output representation selection. Our method outperforms the state-of-the-art by large margins on several benchmarks.
arXiv Detail & Related papers (2021-03-03T08:07:29Z)
Predictive coding feedback results in perceived illusory contours in a recurrent neural network [0.0]
We equip a deep feedforward convolutional network with brain-inspired recurrent dynamics. We show that the perception of illusory contours could involve feedback connections.
arXiv Detail & Related papers (2021-02-03T09:07:09Z)
Gravitational Models Explain Shifts on Human Visual Attention [80.76475913429357]
Visual attention refers to the human brain's ability to select relevant sensory information for preferential processing. Various methods to estimate saliency have been proposed in the last three decades. We propose a gravitational model (GRAV) to describe the attentional shifts.
arXiv Detail & Related papers (2020-09-15T10:12:41Z)
Visual Chirality [51.685596116645776]
We investigate how statistics of visual data are changed by reflection. Our work has implications for data augmentation, self-supervised learning, and image forensics.
arXiv Detail & Related papers (2020-06-16T20:48:23Z)
Color Visual Illusions: A Statistics-based Computational Model [20.204147875108976]
We introduce a tool that computes the likelihood of patches, given a large dataset to learn from. We present a model that explains lightness and color visual illusions in a unified manner. Our model generates visual illusions in natural images by applying the same tool in reverse.
arXiv Detail & Related papers (2020-05-18T14:39:48Z)
Learning Depth With Very Sparse Supervision [57.911425589947314]
This paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment. We train a specialized global-local network architecture with what would be available to a robot interacting with the environment. Experiments on several datasets show that, when ground truth is available even for just one of the image pixels, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-02T10:44:13Z)
Self-Supervised Linear Motion Deblurring [112.75317069916579]
Deep convolutional neural networks are state-of-the-art for image deblurring. We present a differentiable reblur model for self-supervised motion deblurring. Our experiments demonstrate that self-supervised single image deblurring is really feasible.
arXiv Detail & Related papers (2020-02-10T20:15:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.