A domain adaptive deep learning solution for scanpath prediction of
paintings
- URL: http://arxiv.org/abs/2209.11338v1
- Date: Thu, 22 Sep 2022 22:27:08 GMT
- Title: A domain adaptive deep learning solution for scanpath prediction of
paintings
- Authors: Mohamed Amine Kerkouri, Marouane Tliba, Aladine Chetouani, Alessandro
Bruno
- Abstract summary: This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, which underpins several human cognitive functions.
The proposed architecture ingests images and returns scanpaths: sequences of points with a high likelihood of catching viewers' attention.
- Score: 66.46953851227454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cultural heritage understanding and preservation is an important issue for
society as it represents a fundamental aspect of its identity. Paintings
represent a significant part of cultural heritage and are the subject of
continuous study. However, the way viewers perceive paintings is closely tied
to the behaviour of the so-called HVS (Human Visual System). This paper
focuses on the
eye-movement analysis of viewers during the visual experience of a certain
number of paintings. More specifically, we introduce a new approach to
predicting human visual attention, which underpins several cognitive functions
in humans, including the fundamental understanding of a scene, and then extend
it to painting images. The proposed architecture ingests images and returns
scanpaths: sequences of points with a high likelihood of catching viewers'
attention. We use an FCNN (Fully Convolutional Neural Network), in which we
exploit differentiable channel-wise selection and Soft-Argmax
modules. We also incorporate learnable Gaussian distributions at the network
bottleneck to simulate the bias of the visual attention process in natural
scene images.
Furthermore, to reduce the effect of shifts between different domains (i.e.,
natural images and paintings), we encourage the model to learn general
features from other domains in an unsupervised fashion using a gradient
reversal classifier. The results
obtained by our model outperform existing state-of-the-art ones in terms of
accuracy and efficiency.
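As a concrete illustration of the decoding step the abstract describes, below is a minimal sketch, assuming PyTorch, of a differentiable 2D Soft-Argmax that converts an activation map into expected fixation coordinates. The function name, the normalised [0, 1] coordinate convention, and the temperature parameter are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of a differentiable Soft-Argmax, not the paper's code.
import torch
import torch.nn.functional as F

def soft_argmax_2d(heatmap: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Map a (B, H, W) activation map to expected (x, y) coordinates in [0, 1]."""
    b, h, w = heatmap.shape
    # A softmax over all spatial positions keeps the operation differentiable.
    probs = F.softmax(heatmap.view(b, -1) / temperature, dim=-1).view(b, h, w)
    ys = torch.linspace(0.0, 1.0, h, device=heatmap.device)
    xs = torch.linspace(0.0, 1.0, w, device=heatmap.device)
    # Expected coordinate = sum over positions of (position * probability).
    y = (probs.sum(dim=2) * ys).sum(dim=1)
    x = (probs.sum(dim=1) * xs).sum(dim=1)
    return torch.stack([x, y], dim=1)  # (B, 2)
```

Because the output is an expectation under a softmax distribution rather than a hard argmax, gradients flow through the coordinate prediction, which is what allows a scanpath decoder of this kind to be trained end to end.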
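The learnable Gaussian at the bottleneck can be sketched in the same spirit: a 2D Gaussian whose mean and spread are trained jointly with the network and used to modulate the feature maps. The module name, the use of a single Gaussian, and the multiplicative gating are assumptions for illustration; the paper's exact parameterisation may differ.

```python
# Hypothetical sketch of a learnable Gaussian attention bias, not the paper's code.
import torch

class GaussianBias(torch.nn.Module):
    def __init__(self, h: int, w: int):
        super().__init__()
        # Mean and log-std are learned jointly with the rest of the network.
        self.mu = torch.nn.Parameter(torch.tensor([0.5, 0.5]))
        self.log_sigma = torch.nn.Parameter(torch.zeros(2))
        ys, xs = torch.meshgrid(
            torch.linspace(0.0, 1.0, h), torch.linspace(0.0, 1.0, w), indexing="ij"
        )
        self.register_buffer("grid", torch.stack([xs, ys]))  # (2, H, W), (x, y) order

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W); modulate every channel by the learned Gaussian map.
        d = (self.grid - self.mu.view(2, 1, 1)) / self.log_sigma.exp().view(2, 1, 1)
        gauss = torch.exp(-0.5 * (d ** 2).sum(dim=0))  # (H, W)
        return feat * gauss
```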
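Finally, the gradient reversal classifier follows the well-known Ganin and Lempitsky construction: an identity map on the forward pass whose gradient is negated on the backward pass, feeding a small domain classifier. The layer sizes and the fixed lambda below are placeholders, not the authors' configuration.

```python
# Hypothetical sketch of a gradient reversal layer with a domain classifier.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x: torch.Tensor, lambd: float) -> torch.Tensor:
        ctx.lambd = lambd
        return x.view_as(x)  # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output: torch.Tensor):
        # Negate (and scale) the gradient flowing back into the backbone.
        return -ctx.lambd * grad_output, None

class DomainClassifier(torch.nn.Module):
    def __init__(self, feat_dim: int = 256, lambd: float = 1.0):
        super().__init__()
        self.lambd = lambd
        self.net = torch.nn.Sequential(
            torch.nn.Linear(feat_dim, 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, 2),  # two domains: natural images vs. paintings
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(GradReverse.apply(features, self.lambd))
```

Training this classifier to tell natural images from paintings while reversing its gradient into the backbone pushes the backbone toward domain-invariant features, which is the unsupervised adaptation mechanism the abstract describes.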
Related papers
- When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability.
We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks.
Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z) - Connectivity-Inspired Network for Context-Aware Recognition [1.049712834719005]
We focus on the effect of incorporating circuit motifs found in biological brains into models for visual recognition.
Our convolutional architecture is inspired by the connectivity of human cortical and subcortical streams.
We present a new plug-and-play module to model context awareness.
arXiv Detail & Related papers (2024-09-06T15:42:10Z) - CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition [73.51329037954866]
We propose a robust global representation method with cross-image correlation awareness for visual place recognition.
Our method uses the attention mechanism to correlate multiple images within a batch.
Our method outperforms state-of-the-art methods by a large margin with significantly less training time.
arXiv Detail & Related papers (2024-02-29T15:05:11Z) - StyleEDL: Style-Guided High-order Attention Network for Image Emotion
Distribution Learning [69.06749934902464]
We propose a style-guided high-order attention network for image emotion distribution learning termed StyleEDL.
StyleEDL interactively learns stylistic-aware representations of images by exploring the hierarchical stylistic information of visual contents.
In addition, we introduce a stylistic graph convolutional network to dynamically generate the content-dependent emotion representations.
arXiv Detail & Related papers (2023-08-06T03:22:46Z) - Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises [7.689542442882423]
We designed a dual-stream vision model inspired by the human brain.
This model features retina-like input layers and includes two streams: one that determines the next point of focus (the fixation), and another that interprets the visual content surrounding the fixation.
We evaluated this model against various benchmarks in terms of object recognition, gaze behavior and adversarial robustness.
arXiv Detail & Related papers (2022-06-15T03:44:42Z) - Prune and distill: similar reformatting of image information along rat
visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models of their functional analogue in the brain, the ventral stream of the visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z) - Enhancing Social Relation Inference with Concise Interaction Graph and
Discriminative Scene Representation [56.25878966006678]
We propose an approach for PRactical Inference in Social rElation (PRISE).
It concisely learns interactive features of persons and discriminative features of holistic scenes.
PRISE achieves a 6.8% improvement for domain classification on the PIPA dataset.
arXiv Detail & Related papers (2021-07-30T04:20:13Z) - Learning domain-agnostic visual representation for computational
pathology using medically-irrelevant style transfer augmentation [4.538771844947821]
STRAP (Style TRansfer Augmentation for histoPathology) is a form of data augmentation based on random style transfer from artistic paintings.
Style transfer replaces the low-level texture content of images with the uninformative style of randomly selected artistic paintings.
We demonstrate that STRAP leads to state-of-the-art performance, particularly in the presence of domain shifts.
arXiv Detail & Related papers (2021-02-02T18:50:16Z) - Insights From A Large-Scale Database of Material Depictions In Paintings [18.2193253052961]
We examine the give-and-take relationship between visual recognition systems and the rich information available in the fine arts.
We find that visual recognition systems designed for natural images can work surprisingly well on paintings.
We show that learning from paintings can be beneficial for neural networks that are intended to be used on natural images.
arXiv Detail & Related papers (2020-11-24T18:42:58Z) - Graph Neural Networks for Unsupervised Domain Adaptation of
Histopathological Image Analytics [22.04114134677181]
We present a novel method for unsupervised domain adaptation in histological image analysis.
It is based on a backbone for embedding images into a feature space, and a graph neural layer for propagating the supervision signals of labelled images.
In experiments, our method achieves state-of-the-art performance on four public datasets.
arXiv Detail & Related papers (2020-08-21T04:53:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.