An Inter-observer consistent deep adversarial training for visual scanpath prediction
- URL: http://arxiv.org/abs/2211.07336v2
- Date: Tue, 11 Jul 2023 09:01:00 GMT
- Title: An Inter-observer consistent deep adversarial training for visual scanpath prediction
- Authors: Mohamed Amine Kerkouri, Marouane Tliba, Aladine Chetouani, Alessandro Bruno
- Abstract summary: We propose an inter-observer consistent adversarial training approach for scanpath prediction through a lightweight deep neural network.
We show that our approach is competitive with state-of-the-art methods.
- Score: 66.46953851227454
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A visual scanpath is the sequence of points through which the human gaze moves while exploring a scene. It is one of the fundamental concepts upon which visual attention research is based, and the ability to predict scanpaths has consequently emerged as an important task in recent years. In this paper, we propose an inter-observer consistent adversarial training approach for scanpath prediction through a lightweight deep neural network. The adversarial method employs a discriminative neural network as a dynamic loss that is better suited to modeling the naturally stochastic phenomenon, while maintaining consistency between the distributions of scanpaths traversed by different observers, which are subjective by nature. Through extensive testing, we show that our approach is competitive with state-of-the-art methods.
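As a rough illustration of the training scheme the abstract describes, the sketch below pairs a lightweight scanpath generator with a discriminator that serves as the learned, dynamic loss. The module shapes, the fixed scanpath length, and the use of pre-extracted image features are assumptions made for illustration, not the authors' architecture.

```python
# Minimal sketch of adversarial scanpath training (PyTorch), assuming:
# - images are pre-encoded into fixed-size feature vectors (hypothetical),
# - a scanpath is a fixed-length sequence of normalized (x, y) fixations,
# - the discriminator plays the role of the "dynamic loss" in the abstract.
import torch
import torch.nn as nn

SEQ_LEN, FEAT_DIM = 10, 128  # assumed scanpath length and feature size

class Generator(nn.Module):
    """Maps an image feature vector to a scanpath of (x, y) fixations."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM, 256), nn.ReLU(),
            nn.Linear(256, SEQ_LEN * 2), nn.Sigmoid(),  # coords in [0, 1]
        )
    def forward(self, feats):
        return self.net(feats).view(-1, SEQ_LEN, 2)

class Discriminator(nn.Module):
    """Scores how plausible a scanpath is for the given image features."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM + SEQ_LEN * 2, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )
    def forward(self, feats, paths):
        return self.net(torch.cat([feats, paths.flatten(1)], dim=1))

gen, disc = Generator(), Discriminator()
g_opt = torch.optim.Adam(gen.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(disc.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):                      # toy loop on random stand-in data
    feats = torch.randn(32, FEAT_DIM)        # stand-in image features
    real = torch.rand(32, SEQ_LEN, 2)        # stand-in human scanpaths

    # Discriminator update: real human scanpaths vs. generated ones.
    fake = gen(feats).detach()
    d_loss = bce(disc(feats, real), torch.ones(32, 1)) + \
             bce(disc(feats, fake), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: the discriminator acts as a learned, dynamic loss.
    g_loss = bce(disc(feats, gen(feats)), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

Because the discriminator only judges whether a scanpath is plausible for the image, rather than matching any single observer's trace, it can tolerate the natural variability between observers that a fixed regression loss would penalize.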
Related papers
- Identifying Sub-networks in Neural Networks via Functionally Similar Representations [41.028797971427124]
We take a step toward automating the understanding of the network by investigating the existence of distinct sub-networks.
Our approach offers meaningful insights into the behavior of neural networks with minimal human and computational cost.
arXiv Detail & Related papers (2024-10-21T20:19:00Z)
- GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths [20.384132849805003]
We introduce GazeXplain, a novel study of visual scanpath prediction and explanation.
This involves annotating natural-language explanations for fixations across eye-tracking datasets.
Experiments on diverse eye-tracking datasets demonstrate the effectiveness of GazeXplain in both scanpath prediction and explanation.
arXiv Detail & Related papers (2024-08-05T19:11:46Z)
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- Simulating Human Gaze with Neural Visual Attention [44.65733084492857]
We propose the Neural Visual Attention (NeVA) algorithm to integrate guidance of any downstream visual task into attention modeling.
We observe that biologically constrained neural networks generate human-like scanpaths without being trained for this objective.
arXiv Detail & Related papers (2022-11-22T09:02:09Z)
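A hedged sketch of the NeVA idea from the entry above: each fixation is chosen so that a foveated view of the image best serves a frozen downstream task network. The Gaussian foveation, the candidate grid, and the toy classifier below are illustrative assumptions, not the paper's implementation.

```python
# Greedy scanpath generation driven by a downstream task (illustrative only).
import torch
import torch.nn.functional as F

def reveal(perceived, img, cx, cy, sigma=0.15):
    """Blend the sharp image into the perceived (blurred) one around (cx, cy)."""
    _, _, h, w = img.shape
    ys = torch.linspace(0, 1, h).view(1, 1, h, 1)
    xs = torch.linspace(0, 1, w).view(1, 1, 1, w)
    mask = torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return mask * img + (1 - mask) * perceived

def greedy_scanpath(img, task_net, label, steps=5, grid=8):
    """Each fixation is the grid point whose revealed view minimizes task loss."""
    perceived = F.avg_pool2d(img, 9, stride=1, padding=4)  # fully blurred start
    path = []
    for _ in range(steps):
        best = (float("inf"), None, None)
        for i in range(grid):
            for j in range(grid):
                cx, cy = (i + 0.5) / grid, (j + 0.5) / grid
                view = reveal(perceived, img, cx, cy)
                with torch.no_grad():
                    loss = F.cross_entropy(task_net(view), label).item()
                if loss < best[0]:
                    best = (loss, (cx, cy), view)
        _, fix, perceived = best                # commit the best fixation
        path.append(fix)
    return path

# Toy usage: a random linear "classifier" stands in for the frozen task network.
task_net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
img, label = torch.randn(1, 3, 32, 32), torch.tensor([3])
print(greedy_scanpath(img, task_net, label))
```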
- A Probabilistic Time-Evolving Approach to Scanpath Prediction [8.669748138523758]
We present a probabilistic time-evolving approach to scanpath prediction, based on Bayesian deep learning.
Our model yields results that outperform those of current state-of-the-art approaches, and are almost on par with the human baseline.
arXiv Detail & Related papers (2022-04-20T11:50:29Z)
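The entry above does not spell out its Bayesian machinery, so the sketch below uses Monte Carlo dropout, a common Bayesian deep learning device, purely to illustrate how a probabilistic model yields a distribution over scanpaths rather than a single trace; the network shape and scanpath length are assumptions.

```python
# Sampling a distribution of scanpaths via Monte Carlo dropout (illustrative).
import torch
import torch.nn as nn

SEQ_LEN, FEAT_DIM = 10, 128  # assumed scanpath length and feature size

model = nn.Sequential(
    nn.Linear(FEAT_DIM, 256), nn.ReLU(),
    nn.Dropout(p=0.3),                       # source of predictive stochasticity
    nn.Linear(256, SEQ_LEN * 2), nn.Sigmoid(),
)

def sample_scanpaths(feats, n_samples=20):
    """Draw several scanpaths for one image by resampling dropout masks."""
    model.train()                            # keep dropout ON at inference time
    with torch.no_grad():
        return torch.stack(
            [model(feats).view(-1, SEQ_LEN, 2) for _ in range(n_samples)]
        )

feats = torch.randn(1, FEAT_DIM)             # stand-in image features
paths = sample_scanpaths(feats)              # (20, 1, SEQ_LEN, 2)
mean, std = paths.mean(0), paths.std(0)      # per-fixation mean and uncertainty
print(mean.shape, std.shape)
```

Sampling many scanpaths per image is what makes a comparison against the spread of human observers, rather than a single ground-truth trace, possible.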
- Behind the Machine's Gaze: Biologically Constrained Neural Networks Exhibit Human-like Visual Attention [40.878963450471026]
We propose the Neural Visual Attention (NeVA) algorithm to generate visual scanpaths in a top-down manner.
We show that the proposed method outperforms state-of-the-art unsupervised human attention models in terms of similarity to human scanpaths.
arXiv Detail & Related papers (2022-04-19T18:57:47Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z)
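To make the GNN idea in the entry above concrete, here is a minimal message-passing sketch in which joints are nodes, skeleton bones are edges, and per-joint motion is predicted from both coordinates and visual features; the toy skeleton, feature sizes, and single-layer design are assumptions, not the paper's model.

```python
# One message-passing layer predicting next-frame joint positions (sketch).
import torch
import torch.nn as nn

N_JOINTS, D = 5, 32                       # toy skeleton size and feature width
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]  # toy bone list (parent, child) pairs

class PoseDynamicsGNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(2 + D, D)  # (x, y) plus visual feature -> node
        self.msg = nn.Linear(D, D)        # message sent along each bone
        self.update = nn.Linear(2 * D, D) # combine node state with messages
        self.head = nn.Linear(D, 2)       # predicted (dx, dy) per joint

    def forward(self, xy, vis):
        h = torch.relu(self.embed(torch.cat([xy, vis], dim=-1)))
        agg = torch.zeros_like(h)
        for a, b in EDGES:                # symmetric message passing over bones
            agg[:, b] += self.msg(h[:, a])
            agg[:, a] += self.msg(h[:, b])
        h = torch.relu(self.update(torch.cat([h, agg], dim=-1)))
        return xy + self.head(h)          # next-frame joint positions

model = PoseDynamicsGNN()
xy = torch.rand(8, N_JOINTS, 2)           # current-frame joint coordinates
vis = torch.randn(8, N_JOINTS, D)         # stand-in per-joint visual features
next_xy = model(xy, vis)                  # (8, N_JOINTS, 2)
```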
- Towards Interaction Detection Using Topological Analysis on Neural Networks [55.74562391439507]
In neural networks, any interacting features must follow a strongly weighted connection to common hidden units.
We propose a new measure for quantifying interaction strength, based upon the well-received theory of persistent homology.
A Persistence Interaction Detection (PID) algorithm is developed to efficiently detect interactions.
arXiv Detail & Related papers (2020-10-25T02:15:24Z)
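The premise quoted in the entry above, that interacting features must share strongly weighted connections to common hidden units, can be illustrated with a simple weight-based score for a one-hidden-layer network; the paper's persistent-homology machinery builds on this kind of signal but is not reproduced here.

```python
# Weight-based interaction score for a one-hidden-layer network (sketch).
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 6, 16
W = rng.normal(size=(n_hidden, n_in))    # input -> hidden weights
v = rng.normal(size=n_hidden)            # hidden -> output weights

def interaction_strength(i, j):
    """Score the strongest hidden unit both features reach with heavy weights."""
    shared = np.minimum(np.abs(W[:, i]), np.abs(W[:, j]))  # per-unit co-strength
    return float(np.max(shared * np.abs(v)))

scores = {(i, j): interaction_strength(i, j)
          for i in range(n_in) for j in range(i + 1, n_in)}
top = sorted(scores.items(), key=lambda kv: -kv[1])[:3]
print(top)                               # candidate interacting feature pairs
```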
- Continuous Emotion Recognition via Deep Convolutional Autoencoder and Support Vector Regressor [70.2226417364135]
It is crucial that the machine be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
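A minimal sketch of the pipeline named in the title above: a convolutional autoencoder learns compact face features without labels, then a support vector regressor maps them to a continuous emotion value. The grayscale input size, bottleneck width, and the choice of valence as the target are assumptions for illustration.

```python
# Convolutional autoencoder features + SVR for continuous emotion (sketch).
import torch
import torch.nn as nn
from sklearn.svm import SVR

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 48 -> 24
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),  # 24 -> 12
            nn.Flatten(), nn.Linear(16 * 12 * 12, 64),            # bottleneck
        )
        self.dec = nn.Sequential(
            nn.Linear(64, 16 * 12 * 12), nn.Unflatten(1, (16, 12, 12)),
            nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),    # 12 -> 24
            nn.ConvTranspose2d(8, 1, 2, stride=2), nn.Sigmoid(),  # 24 -> 48
        )
    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

ae = ConvAE()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
faces = torch.rand(64, 1, 48, 48)            # stand-in grayscale face crops
valence = torch.rand(64).numpy()             # stand-in continuous labels

for _ in range(20):                          # unsupervised reconstruction phase
    recon, _ = ae(faces)
    loss = nn.functional.mse_loss(recon, faces)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                        # supervised regression phase
    _, feats = ae(faces)
svr = SVR(kernel="rbf").fit(feats.numpy(), valence)
print(svr.predict(feats[:4].numpy()))        # continuous emotion estimates
```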
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.