Leveraging Geometric Visual Illusions as Perceptual Inductive Biases for Vision Models
- URL: http://arxiv.org/abs/2509.15156v1
- Date: Thu, 18 Sep 2025 17:00:42 GMT
- Title: Leveraging Geometric Visual Illusions as Perceptual Inductive Biases for Vision Models
- Authors: Haobo Yang, Minghao Guo, Dequan Yang, Wenyu Wang
- Abstract summary: We introduce a synthetic, parametric geometric-illusion dataset and evaluate three multi-source learning strategies that combine illusion recognition tasks with ImageNet classification objectives. Our experiments reveal two key conceptual insights: (i) incorporating geometric illusions as auxiliary supervision systematically improves generalization, especially in visually challenging cases involving intricate contours and fine textures. These results demonstrate a novel integration of perceptual science and machine learning and suggest new directions for embedding perceptual priors into vision model design.
- Score: 15.629707528331672
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Contemporary deep learning models have achieved impressive performance in image classification by primarily leveraging statistical regularities within large datasets, but they rarely incorporate structured insights drawn directly from perceptual psychology. To explore the potential of perceptually motivated inductive biases, we propose integrating classic geometric visual illusions, well-studied phenomena from human perception, into standard image-classification training pipelines. Specifically, we introduce a synthetic, parametric geometric-illusion dataset and evaluate three multi-source learning strategies that combine illusion recognition tasks with ImageNet classification objectives. Our experiments reveal two key conceptual insights: (i) incorporating geometric illusions as auxiliary supervision systematically improves generalization, especially in visually challenging cases involving intricate contours and fine textures; and (ii) perceptually driven inductive biases, even when derived from synthetic stimuli traditionally considered unrelated to natural image recognition, can enhance the structural sensitivity of both CNN and transformer-based architectures. These results demonstrate a novel integration of perceptual science and machine learning and suggest new directions for embedding perceptual priors into vision model design.
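To make "parametric illusion stimuli" and "auxiliary supervision" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' method: the abstract names neither specific illusions nor the three learning strategies, so the Müller-Lyer figure, the function names, and the `aux_weight` hyperparameter are all hypothetical choices.

```python
import math


def muller_lyer_segments(shaft_len, fin_len, fin_angle_deg, fins_out):
    """Return line segments for a Mueller-Lyer-style figure.

    Hypothetical parameterization (not taken from the paper): a horizontal
    shaft with two fins at each end. fins_out=True points the fins away
    from the shaft centre; fins_out=False folds them back over the shaft.
    """
    theta = math.radians(fin_angle_deg)
    dx = fin_len * math.cos(theta)
    dy = fin_len * math.sin(theta)
    sign = 1.0 if fins_out else -1.0
    segments = [((0.0, 0.0), (shaft_len, 0.0))]  # the shaft itself
    # 'away' is the x-direction pointing away from the shaft centre.
    for end_x, away in ((0.0, -1.0), (shaft_len, 1.0)):
        for side in (1.0, -1.0):  # one fin above, one below the shaft
            tip = (end_x + sign * away * dx, side * dy)
            segments.append(((end_x, 0.0), tip))
    return segments


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, computed stably from raw logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def combined_loss(cls_logits, cls_label, ill_logits, ill_label, aux_weight=0.5):
    """Main classification loss plus a weighted auxiliary illusion loss.

    aux_weight is an assumed hyperparameter, not a value from the paper.
    """
    main = softmax_cross_entropy(cls_logits, cls_label)
    aux = softmax_cross_entropy(ill_logits, ill_label)
    return main + aux_weight * aux
```

A weighted sum of two losses is only one common way to combine a main task with an auxiliary one; the paper's three multi-source strategies may differ from it.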
Related papers
- Human-level 3D shape perception emerges from multi-view learning [63.048728487674815]
We develop a modeling framework that predicts human 3D shape inferences for arbitrary objects. We achieve this with a novel class of neural networks trained using a visual-spatial objective over naturalistic sensory data. We find that human-level 3D perception can emerge from a simple, scalable learning objective over naturalistic visual-spatial data.
arXiv Detail & Related papers (2026-02-19T18:56:05Z)
- Bridging Cognitive Gap: Hierarchical Description Learning for Artistic Image Aesthetics Assessment [51.40989269202702]
The aesthetic quality assessment task is crucial for developing a human-aligned quantitative evaluation system for AIGC. We propose ArtQuant, an aesthetics assessment framework for artistic images which couples isolated aesthetic dimensions through description generation. Our approach achieves state-of-the-art performance on several datasets while requiring only 33% of conventional training epochs.
arXiv Detail & Related papers (2025-12-29T12:18:26Z)
- From Images to Perception: Emergence of Perceptual Properties by Reconstructing Images [1.77513002450736]
A bio-inspired architecture that can accommodate several known facts in the retina-V1 cortex, the PerceptNet, has been end-to-end optimized for different tasks related to image reconstruction. Our results show that the encoder stage consistently exhibits the highest correlation with human perceptual judgments on image distortion.
arXiv Detail & Related papers (2025-08-14T08:37:30Z)
- Align and Surpass Human Camouflaged Perception: Visual Refocus Reinforcement Fine-Tuning [18.13538667261998]
Current multi-modal models exhibit a notable misalignment with the human visual system when identifying objects that are visually assimilated into the background. We build a visual system that mimics human visual camouflaged perception to progressively and iteratively 'refocus' on visually concealed content.
arXiv Detail & Related papers (2025-05-26T07:27:18Z)
- Advances in Radiance Field for Dynamic Scene: From Neural Field to Gaussian Field [85.12359852781216]
This survey presents a systematic analysis of over 200 papers focused on dynamic scene representation using radiance fields. We organize diverse methodological approaches under a unified representational framework, concluding with a critical examination of persistent challenges and promising research directions.
arXiv Detail & Related papers (2025-05-15T07:51:08Z)
- Visual Image Reconstruction from Brain Activity via Latent Representation [0.0]
This review traces the field's evolution from early classification approaches to sophisticated reconstructions. We discuss the need for diverse datasets and refined evaluation metrics aligned with human perceptual judgments. Visual image reconstruction offers promising insights into neural coding and enables new psychological measurements of visual experiences.
arXiv Detail & Related papers (2025-05-13T10:46:52Z)
- Convolution goes higher-order: a biologically inspired mechanism empowers image classification [0.8999666725996975]
We propose a novel approach to image classification inspired by complex nonlinear biological visual processing. Our model incorporates a Volterra-like expansion of the convolution operator, capturing multiplicative interactions. Our work bridges neuroscience and deep learning, offering a path towards more effective, biologically inspired computer vision models.
arXiv Detail & Related papers (2024-12-09T18:33:09Z)
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) seeks to recover high-quality images from degraded observations caused by a wide range of factors, including noise, blur, compression, and adverse weather. Traditional IR methods have made notable progress by targeting individual degradation types, but their specialization often comes at the cost of generalization. The all-in-one image restoration paradigm has recently emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z)
- When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability.
We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks.
Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
We introduce an effective framework, Generalizable Model-based Neural Radiance Fields (GM-NeRF), to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
arXiv Detail & Related papers (2023-03-24T03:32:02Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
A lack of interpretability, robustness, and out-of-distribution generalization is becoming a challenge for existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.