Few-shot Personalized Scanpath Prediction
- URL: http://arxiv.org/abs/2504.05499v1
- Date: Mon, 07 Apr 2025 20:48:41 GMT
- Title: Few-shot Personalized Scanpath Prediction
- Authors: Ruoyu Xue, Jingyi Xu, Sounak Mondal, Hieu Le, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
- Abstract summary: A personalized model for scanpath prediction provides insights into the visual preferences and attention patterns of individual subjects. Existing methods for training scanpath prediction models are data-intensive and cannot be effectively personalized to new individuals. We propose the few-shot personalized scanpath prediction (FS-PSP) task and a novel method to address it.
- Score: 49.20881410145068
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A personalized model for scanpath prediction provides insights into the visual preferences and attention patterns of individual subjects. However, existing methods for training scanpath prediction models are data-intensive and cannot be effectively personalized to new individuals with only a few available examples. In this paper, we propose the few-shot personalized scanpath prediction (FS-PSP) task, which aims to predict scanpaths for an unseen subject using minimal support data of that subject's scanpath behavior, and a novel method to address it. The key to our method's adaptability is the Subject-Embedding Network (SE-Net), specifically designed to capture unique, individualized representations for each subject's scanpaths. SE-Net generates subject embeddings that effectively distinguish between subjects while minimizing variability among scanpaths from the same individual. The personalized scanpath prediction model is then conditioned on these subject embeddings to produce accurate, personalized results. Experiments on multiple eye-tracking datasets demonstrate that our method excels in FS-PSP settings and does not require any fine-tuning steps at test time. Code is available at: https://github.com/cvlab-stonybrook/few-shot-scanpath
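The abstract does not spell out architectural details, but the conditioning idea can be illustrated with a minimal PyTorch sketch: a hypothetical SubjectEncoder averages the encodings of a few support scanpaths into one subject embedding, and a predictor concatenates that embedding with image and history features. All module names, dimensions, and the concatenation-based fusion are assumptions, not the released implementation (see the linked repository for the actual code).

```python
# Minimal sketch (not the released code): a subject encoder that turns a few
# support scanpaths into one subject embedding, and a predictor conditioned on it.
# Module names, shapes, and the concatenation-based fusion are assumptions.
import torch
import torch.nn as nn


class SubjectEncoder(nn.Module):
    """Encodes K support scanpaths (sequences of fixations) into one subject embedding."""

    def __init__(self, fixation_dim=3, hidden_dim=128, embed_dim=64):
        super().__init__()
        self.gru = nn.GRU(fixation_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, embed_dim)

    def forward(self, support):  # support: (K, T, fixation_dim), each fixation = (x, y, duration)
        _, h = self.gru(support)          # h: (1, K, hidden_dim)
        per_scanpath = self.proj(h[-1])   # (K, embed_dim)
        return per_scanpath.mean(dim=0)   # average over the K support scanpaths -> (embed_dim,)


class ConditionedScanpathPredictor(nn.Module):
    """Predicts the next fixation from image features, fixation history, and the subject embedding."""

    def __init__(self, img_feat_dim=256, fixation_dim=3, embed_dim=64, hidden_dim=256):
        super().__init__()
        self.history_gru = nn.GRU(fixation_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(img_feat_dim + hidden_dim + embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, fixation_dim),  # next (x, y, duration)
        )

    def forward(self, img_feat, history, subject_emb):
        _, h = self.history_gru(history)                       # history: (1, t, fixation_dim)
        fused = torch.cat([img_feat, h[-1].squeeze(0), subject_emb], dim=-1)
        return self.head(fused)


# Usage: embed a new subject from 5 support scanpaths, then predict without any fine-tuning.
enc, pred = SubjectEncoder(), ConditionedScanpathPredictor()
subject_emb = enc(torch.randn(5, 12, 3))                        # 5 support scanpaths, 12 fixations each
next_fix = pred(torch.randn(256), torch.randn(1, 4, 3), subject_emb)
```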
Related papers
- GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths [20.384132849805003]
We introduce GazeXplain, a novel study of visual scanpath prediction and explanation.
This involves annotating natural-language explanations for fixations across eye-tracking datasets.
Experiments on diverse eye-tracking datasets demonstrate the effectiveness of GazeXplain in both scanpath prediction and explanation.
arXiv Detail & Related papers (2024-08-05T19:11:46Z)
- EyeFormer: Predicting Personalized Scanpaths with Transformer-Guided Reinforcement Learning [31.583764158565916]
We present EyeFormer, a machine learning model for predicting scanpaths in a visual user interface.
Our model has the unique capability of producing personalized predictions when given a few user scanpath samples.
It can predict full scanpath information, including fixation positions and duration, across individuals and various stimulus types.
arXiv Detail & Related papers (2024-04-15T22:26:27Z)
- A Fixed-Point Approach to Unified Prompt-Based Counting [51.20608895374113]
This paper aims to establish a comprehensive prompt-based counting framework capable of generating density maps for objects indicated by various prompt types, such as box, point, and text.
Our model excels in prominent class-agnostic datasets and exhibits superior performance in cross-dataset adaptation tasks.
arXiv Detail & Related papers (2024-03-15T12:05:44Z)
- Explicit Visual Prompting for Universal Foreground Segmentations [55.51869354956533]
We present a unified framework for a number of foreground segmentation tasks without any task-specific designs.
We take inspiration from the widely-used pre-training and then prompt tuning protocols in NLP.
Our method freezes a pre-trained model and then learns task-specific knowledge using a few extra parameters.
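The freeze-then-tune recipe itself is standard; the sketch below shows the generic pattern (a frozen backbone, a small learnable input prompt, and a lightweight head), not the paper's explicit visual prompting design. The ResNet backbone, prompt shape, and task head are illustrative assumptions.

```python
# Generic visual prompt tuning sketch, NOT the paper's implementation:
# freeze a pre-trained backbone and learn only a small additive input prompt
# plus a lightweight task head. Backbone choice and prompt shape are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

backbone = resnet18(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()                      # expose 512-d features
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False                      # frozen: no task-specific backbone updates

prompt = nn.Parameter(torch.zeros(1, 3, 224, 224))   # learnable pixel-space prompt
head = nn.Linear(512, 1)                              # tiny task-specific head (e.g., foreground score)

optimizer = torch.optim.Adam([prompt, *head.parameters()], lr=1e-3)

def forward(images):                              # images: (B, 3, 224, 224)
    return head(backbone(images + prompt))        # only `prompt` and `head` receive gradients

# One toy training step on random data.
images, targets = torch.randn(4, 3, 224, 224), torch.rand(4, 1)
loss = nn.functional.binary_cross_entropy_with_logits(forward(images), targets)
loss.backward()
optimizer.step()
```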
arXiv Detail & Related papers (2023-05-29T11:05:01Z)
- Scanpath Prediction in Panoramic Videos via Expected Code Length Minimization [27.06179638588126]
We present a new criterion for scanpath prediction based on principles from lossy data compression.
This criterion suggests minimizing the expected code length of quantized scanpaths in a training set.
We also introduce a proportional-integral-derivative (PID) controller-based sampler to generate realistic human-like scanpaths.
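A discrete PID controller is a standard control-loop mechanism; the sketch below shows its generic update rule applied to one gaze coordinate, not the paper's actual sampler. The gains and set-point are illustrative.

```python
# Generic discrete PID controller sketch (not the paper's sampler): the control
# signal nudges a sampled quantity (e.g., one gaze coordinate) toward a set-point.
# Gains kp, ki, kd and the toy set-point are illustrative assumptions.

class PID:
    def __init__(self, kp, ki, kd, dt=1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Toy usage: steer a horizontal gaze coordinate from 0.2 toward the 0.7 set-point.
pid, x = PID(kp=0.6, ki=0.05, kd=0.1), 0.2
for _ in range(10):
    x += pid.step(setpoint=0.7, measurement=x)
print(round(x, 3))
```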
arXiv Detail & Related papers (2023-05-04T04:10:47Z)
- Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features [119.22672589020394]
We propose a lightweight, sample-efficient approach that learns a diverse set of features and adapts to a target distribution by interpolating these features.
Our experiments on four datasets, with multiple distribution shift settings for each, show that Pro$^2$ improves performance by 5-15% when given limited target data.
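As a rough illustration of the project-then-probe recipe (not the official Pro$^2$ code), the sketch below projects pre-trained features onto a small orthogonal basis and fits a linear probe on a handful of labeled target examples; using PCA directions of the source features as the basis is a stand-in for the paper's learned orthogonal features.

```python
# Rough "project then probe" sketch (not the official Pro^2 code). Dimensions
# and random data are placeholders; PCA directions stand in for learned features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
source_feats = rng.normal(size=(1000, 512))       # pre-trained features from source data
target_feats = rng.normal(size=(32, 512))         # only 32 labeled target examples
target_labels = rng.integers(0, 2, size=32)

# Step 1 (project): top-k principal directions of the source features form an orthonormal basis.
k = 16
_, _, vt = np.linalg.svd(source_feats - source_feats.mean(0), full_matrices=False)
basis = vt[:k]                                     # (k, 512), orthonormal rows

# Step 2 (probe): fit a lightweight linear classifier on the projected target features.
probe = LogisticRegression(max_iter=1000).fit(target_feats @ basis.T, target_labels)
print(probe.score(target_feats @ basis.T, target_labels))
```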
arXiv Detail & Related papers (2023-02-10T18:58:03Z)
- An Inter-observer consistent deep adversarial training for visual scanpath prediction [66.46953851227454]
We propose an inter-observer consistent adversarial training approach for scanpath prediction through a lightweight deep neural network.
We show the competitiveness of our approach in regard to state-of-the-art methods.
arXiv Detail & Related papers (2022-11-14T13:22:29Z)
- A Probabilistic Time-Evolving Approach to Scanpath Prediction [8.669748138523758]
We present a probabilistic time-evolving approach to scanpath prediction, based on Bayesian deep learning.
Our model yields results that outperform those of current state-of-the-art approaches, and are almost on par with the human baseline.
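The abstract does not say which Bayesian approximation is used; the sketch below shows one common option, Monte Carlo dropout, drawing several plausible next fixations instead of a single point estimate. The tiny network and its input encoding are placeholders, not the paper's model.

```python
# Hedged sketch of one common Bayesian-deep-learning approximation (MC dropout):
# keep dropout active at inference and sample several plausible next fixations.
# The tiny network and the history encoding are placeholders.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Dropout(p=0.3), nn.Linear(64, 2))
net.train()  # dropout stays active, so each forward pass samples a different sub-network

history_feat = torch.randn(1, 6)                 # placeholder encoding of the scanpath so far
with torch.no_grad():
    samples = torch.stack([net(history_feat) for _ in range(50)])   # (50, 1, 2)
mean_fix, std_fix = samples.mean(0), samples.std(0)                  # predictive mean and spread
print(mean_fix, std_fix)
```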
arXiv Detail & Related papers (2022-04-20T11:50:29Z)
- ScanGAN360: A Generative Model of Realistic Scanpaths for 360$^{\circ}$ Images [92.8211658773467]
We present ScanGAN360, a new generative adversarial approach to generate scanpaths for 360$^{\circ}$ images.
We accomplish this by using a spherical adaptation of dynamic time warping as a loss function.
The quality of our scanpaths outperforms competing approaches by a large margin and is almost on par with the human baseline.
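Standard dynamic time warping with a great-circle local cost conveys the idea of a spherical scanpath distance; the sketch below is that generic formulation, not ScanGAN360's differentiable loss.

```python
# Sketch of dynamic time warping between two scanpaths on the sphere, with the
# great-circle distance as the local cost -- the general idea behind a
# "spherical DTW" comparison, not ScanGAN360's exact loss formulation.
import numpy as np

def great_circle(p, q):
    """Angle between two unit vectors on the sphere (radians)."""
    return np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))

def spherical_dtw(path_a, path_b):
    """path_a: (N, 3) unit vectors, path_b: (M, 3) unit vectors."""
    n, m = len(path_a), len(path_b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = great_circle(path_a[i - 1], path_b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

# Toy usage with two random scanpaths of fixations on the unit sphere.
rng = np.random.default_rng(0)
a = rng.normal(size=(10, 3)); a /= np.linalg.norm(a, axis=1, keepdims=True)
b = rng.normal(size=(12, 3)); b /= np.linalg.norm(b, axis=1, keepdims=True)
print(spherical_dtw(a, b))
```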
arXiv Detail & Related papers (2021-03-25T15:34:18Z)
- State-of-the-Art in Human Scanpath Prediction [22.030889583780514]
We evaluate models based on how well they predict each fixation in a scanpath given the previous scanpath history.
This makes model evaluation closely aligned with the biological processes thought to underlie scanpath generation.
We evaluate many existing models of scanpath prediction on the MIT1003, MIT300, CAT2000 train, and CAT2000 test datasets.
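This evaluation protocol can be pictured as scoring each fixation under the model's predicted density given the preceding fixations; in the sketch below, `predict_density` is a hypothetical interface and the uniform baseline exists only to make the example runnable.

```python
# Sketch of history-conditioned evaluation: score each fixation under the model's
# predicted distribution given all previous fixations, then average the log-likelihoods.
# `predict_density` is a hypothetical interface, not a real library API.
import numpy as np

def scanpath_log_likelihood(model, image, scanpath):
    """scanpath: list of (row, col) fixations; returns mean log-likelihood per fixation."""
    lls = []
    for t, (r, c) in enumerate(scanpath):
        density = model.predict_density(image, history=scanpath[:t])  # (H, W), sums to 1
        lls.append(np.log(density[r, c] + 1e-12))
    return float(np.mean(lls))

class UniformBaseline:
    """Trivial stand-in model: a uniform density regardless of history."""
    def predict_density(self, image, history):
        h, w = image.shape[:2]
        return np.full((h, w), 1.0 / (h * w))

print(scanpath_log_likelihood(UniformBaseline(), np.zeros((24, 32)), [(3, 5), (10, 20)]))
```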
arXiv Detail & Related papers (2021-02-24T12:01:28Z)
- Self-Supervised Tuning for Few-Shot Segmentation [82.32143982269892]
Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples.
Existing meta-learning methods tend to fail to generate category-specific discriminative descriptors when the visual features extracted from support images are marginalized in the embedding space.
This paper presents an adaptive framework tuning, in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme.
arXiv Detail & Related papers (2020-04-12T03:53:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.