Foraging with the Eyes: Dynamics in Human Visual Gaze and Deep Predictive Modeling
- URL: http://arxiv.org/abs/2510.09299v1
- Date: Fri, 10 Oct 2025 11:45:51 GMT
- Title: Foraging with the Eyes: Dynamics in Human Visual Gaze and Deep Predictive Modeling
- Authors: Tejaswi V. Panchagnula
- Abstract summary: Animals often forage via Lévy walks with heavy-tailed step lengths optimized for sparse resource environments. We show that human visual gaze follows similar dynamics when scanning images. Our findings present new evidence that human visual exploration obeys statistical laws analogous to natural foraging and open avenues for modeling gaze through generative and predictive frameworks.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Animals often forage via Lévy walks, stochastic trajectories with heavy-tailed step lengths optimized for sparse resource environments. We show that human visual gaze follows similar dynamics when scanning images. While traditional models emphasize image-based saliency, the underlying spatiotemporal statistics of eye movements remain underexplored. Understanding these dynamics has broad applications in attention modeling and vision-based interfaces. In this study, we conducted a large-scale human subject experiment involving 40 participants viewing 50 diverse images under unconstrained conditions, recording over 4 million gaze points using a high-speed eye tracker. Analysis of these data shows that the gaze trajectory of the human eye also follows a Lévy walk akin to animal foraging. This suggests that the human eye forages for visual information in an optimally efficient manner. Further, we trained a convolutional neural network (CNN) to predict fixation heatmaps from image input alone. The model accurately reproduced salient fixation regions across novel images, demonstrating that key components of gaze behavior are learnable from visual structure alone. Our findings present new evidence that human visual exploration obeys statistical laws analogous to natural foraging and open avenues for modeling gaze through generative and predictive frameworks.
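The abstract makes two technical claims that can be made concrete with short sketches. First, the Lévy-walk claim rests on the distribution of gaze step lengths being heavy-tailed; the abstract does not state how the tail exponent was estimated, so the sketch below assumes a standard maximum-likelihood (Clauset-style) power-law fit above a cutoff `x_min`. Function names, the cutoff value, and the synthetic data are illustrative assumptions, not the authors' code.

```python
import numpy as np

def step_lengths(gaze_xy):
    """Euclidean distances between consecutive gaze samples.
    gaze_xy: array of shape (N, 2) with x, y positions in pixels."""
    diffs = np.diff(gaze_xy, axis=0)
    return np.linalg.norm(diffs, axis=1)

def fit_powerlaw_exponent(steps, x_min):
    """Maximum-likelihood estimate of the tail exponent mu for
    P(l) ~ l**(-mu), l >= x_min (Clauset-style MLE).
    A Levy walk corresponds roughly to 1 < mu <= 3."""
    tail = steps[steps >= x_min]
    n = tail.size
    mu_hat = 1.0 + n / np.sum(np.log(tail / x_min))
    return mu_hat, n

# Usage sketch with synthetic heavy-tailed data (classical Pareto, mu = 2.5).
# With real recordings, `steps` would come from step_lengths(gaze_points).
rng = np.random.default_rng(0)
fake_steps = (rng.pareto(1.5, size=10_000) + 1.0) * 5.0  # scale x_min = 5 px
mu, n_tail = fit_powerlaw_exponent(fake_steps, x_min=5.0)
print(f"estimated tail exponent mu ~ {mu:.2f} from {n_tail} steps")
```

Second, the fixation-heatmap CNN is described only at a high level. A minimal image-to-heatmap encoder-decoder of the kind commonly used for saliency prediction might look like the sketch below; the architecture, input size, and output normalization are assumptions rather than the paper's model.

```python
import torch
import torch.nn as nn

class FixationHeatmapNet(nn.Module):
    """Minimal encoder-decoder mapping an RGB image to a single-channel
    fixation-density map. Illustrative only."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        h = self.decoder(self.encoder(x))
        # Normalise the output to a probability map so it can be trained
        # against a ground-truth fixation heatmap with KL divergence or MSE.
        return torch.softmax(h.flatten(1), dim=1).view_as(h)

# Usage sketch: images (B, 3, 256, 256) -> heatmaps (B, 1, 256, 256)
model = FixationHeatmapNet()
pred = model(torch.randn(2, 3, 256, 256))
```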
Related papers
- Human-level 3D shape perception emerges from multi-view learning [63.048728487674815]
We develop a modeling framework that predicts human 3D shape inferences for arbitrary objects. We achieve this with a novel class of neural networks trained using a visual-spatial objective over naturalistic sensory data. We find that human-level 3D perception can emerge from a simple, scalable learning objective over naturalistic visual-spatial data.
arXiv Detail & Related papers (2026-02-19T18:56:05Z) - DiffEye: Diffusion-Based Continuous Eye-Tracking Data Generation Conditioned on Natural Images [24.810828226931605]
DiffEye is a diffusion-based training framework designed to model continuous and diverse eye movement trajectories during free viewing of natural images. By leveraging raw eye-tracking trajectories rather than relying on scanpaths, DiffEye captures the inherent variability in human gaze behavior. The generated trajectories can also be converted into scanpaths and saliency maps, resulting in outputs that more accurately reflect the distribution of human visual attention.
arXiv Detail & Related papers (2025-09-20T18:20:51Z) - Human Gaze Boosts Object-Centered Representation Learning [7.473473243713322]
Recent self-supervised learning models trained on human-like egocentric visual inputs substantially underperform on image recognition tasks compared to humans. Here, we investigate whether focusing on central visual information boosts egocentric visual object learning. Our experiments demonstrate that focusing on central vision leads to better object-centered representations.
arXiv Detail & Related papers (2025-01-06T12:21:40Z) - GazeFusion: Saliency-Guided Image Generation [50.37783903347613]
Diffusion models offer unprecedented image generation power given just a text prompt. We present a saliency-guided framework to incorporate the data priors of human visual attention mechanisms into the generation process.
arXiv Detail & Related papers (2024-03-16T21:01:35Z) - Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images [34.02058539403381]
We leverage human semantic knowledge to investigate whether it can be incorporated into frameworks for fake image detection.
A preliminary statistical analysis is conducted to explore the distinctive patterns in how humans perceive genuine and altered images.
arXiv Detail & Related papers (2024-03-13T19:56:30Z) - A domain adaptive deep learning solution for scanpath prediction of paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, which impacts several cognitive functions for humans.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z) - Guiding Visual Attention in Deep Convolutional Neural Networks Based on Human Eye Movements [0.0]
Deep Convolutional Neural Networks (DCNNs) were originally inspired by principles of biological vision.
Recent advances in deep learning seem to decrease this similarity.
We investigate a purely data-driven approach to obtain useful models.
arXiv Detail & Related papers (2022-06-21T17:59:23Z) - Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises [7.689542442882423]
We designed a dual-stream vision model inspired by the human brain.
This model features retina-like input layers and includes two streams: one determining the next point of focus (the fixation), while the other interprets the visuals surrounding the fixation.
We evaluated this model against various benchmarks in terms of object recognition, gaze behavior and adversarial robustness.
arXiv Detail & Related papers (2022-06-15T03:44:42Z) - GIMO: Gaze-Informed Human Motion Prediction in Context [75.52839760700833]
We propose a large-scale human motion dataset that delivers high-quality body pose sequences, scene scans, and ego-centric views with eye gaze.
Our data collection is not tied to specific scenes, which further boosts the motion dynamics observed from our subjects.
To realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches.
arXiv Detail & Related papers (2022-04-20T13:17:39Z) - Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z) - What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions [50.435861435121915]
We use human interaction and attention cues to investigate whether we can learn better representations compared to visual-only representations.
Our experiments show that our "muscly-supervised" representation outperforms a visual-only state-of-the-art method MoCo.
arXiv Detail & Related papers (2020-10-16T17:46:53Z)