Relevance Prediction from Eye-movements Using Semi-interpretable
Convolutional Neural Networks
- URL: http://arxiv.org/abs/2001.05152v1
- Date: Wed, 15 Jan 2020 07:02:14 GMT
- Title: Relevance Prediction from Eye-movements Using Semi-interpretable
Convolutional Neural Networks
- Authors: Nilavra Bhattacharya, Somnath Rakshit, Jacek Gwizdka, Paul Kogut
- Abstract summary: We propose an image-classification method to predict the perceived-relevance of text documents from eye-movements.
An eye-tracking study was conducted where participants read short news articles, and rated them as relevant or irrelevant for answering a trigger question.
We encode participants' eye-movement scanpaths as images, and then train a convolutional neural network classifier using these scanpath images.
- Score: 9.007191808968242
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an image-classification method to predict the perceived-relevance
of text documents from eye-movements. An eye-tracking study was conducted where
participants read short news articles, and rated them as relevant or irrelevant
for answering a trigger question. We encode participants' eye-movement
scanpaths as images, and then train a convolutional neural network classifier
using these scanpath images. The trained classifier is used to predict
participants' perceived-relevance of news articles from the corresponding
scanpath images. This method is content-independent, as the classifier does not
require knowledge of the screen-content, or the user's information-task. Even
with little data, the image classifier can predict perceived-relevance with up
to 80% accuracy. When compared to similar eye-tracking studies from the
literature, this scanpath image classification method outperforms previously
reported metrics by appreciable margins. We also attempt to interpret how the
image classifier differentiates between scanpaths on relevant and irrelevant
documents.
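As a point of reference, the pipeline the abstract describes (rasterize a scanpath into an image, then classify it with a CNN) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the (x, y, duration) fixation format, the 64x64 image size, and the network shape are all assumptions.

```python
# Minimal sketch (not the paper's released code): render a scanpath as a
# single-channel image, then classify it with a small CNN.
import numpy as np
import torch
import torch.nn as nn

def scanpath_to_image(fixations, size=64):
    """Rasterize (x, y, duration) fixations into a size x size image.

    Assumption: x and y are normalized to [0, 1]; duration weights intensity.
    """
    img = np.zeros((size, size), dtype=np.float32)
    for x, y, dur in fixations:
        col = min(int(x * (size - 1)), size - 1)
        row = min(int(y * (size - 1)), size - 1)
        img[row, col] += dur
    if img.max() > 0:
        img /= img.max()  # normalize intensities to [0, 1]
    return img

class ScanpathCNN(nn.Module):
    """Small CNN for binary relevant/irrelevant classification."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, 2)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# Example: one synthetic scanpath -> image -> logits over {irrelevant, relevant}.
fixations = [(0.1, 0.1, 0.2), (0.4, 0.15, 0.5), (0.7, 0.2, 0.3)]
img = torch.from_numpy(scanpath_to_image(fixations)).unsqueeze(0).unsqueeze(0)
logits = ScanpathCNN()(img)  # shape: (1, 2)
```

Because only the rendered scanpath enters the network, the classifier stays content-independent, consistent with the abstract's claim that no knowledge of the screen content or the information task is required.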
Related papers
- Guided Interpretable Facial Expression Recognition via Spatial Action Unit Cues [55.97779732051921]
A new learning strategy is proposed to explicitly incorporate AU (action unit) cues into classifier training.
We show that our strategy can improve layer-wise interpretability without degrading classification performance.
arXiv Detail & Related papers (2024-02-01T02:13:49Z)
- iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition [33.2800417526215]
Image classification, which classifies images by pre-defined categories, has been the dominant approach to visual representation learning over the last decade.
Visual learning through image-text alignment, however, has recently emerged and shows promising performance, especially for zero-shot recognition.
We propose a deep fusion method with three adaptations that effectively bridge two learning tasks.
arXiv Detail & Related papers (2022-04-22T15:27:21Z)
- Knowledge Mining with Scene Text for Fine-Grained Recognition [53.74297368412834]
We propose an end-to-end trainable network that mines the implicit contextual knowledge behind scene text in an image.
We employ KnowBert to retrieve relevant knowledge for semantic representation and combine it with image features for fine-grained classification.
Our method outperforms the state of the art by 3.72% mAP and 5.39% mAP on two benchmark datasets, respectively.
arXiv Detail & Related papers (2022-03-27T05:54:00Z)
- Exploiting the relationship between visual and textual features in social networks for image classification with zero-shot deep learning [0.0]
In this work, we propose a classifier ensemble based on the transferable learning capabilities of the CLIP neural network architecture.
Our experiments, based on image classification tasks using the labels of the Places dataset, first consider only the visual part.
Taking the texts associated with the images into account can further improve accuracy, depending on the goal (a minimal zero-shot sketch follows this entry).
arXiv Detail & Related papers (2021-07-08T10:54:59Z)
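For context on the mechanism the entry above builds on, here is a minimal zero-shot classification sketch using OpenAI's open-source CLIP package. The label set is a placeholder, not the Places labels the paper uses, and the paper's actual contribution is an ensemble over such classifiers.

```python
# Minimal zero-shot image classification with CLIP (single model, no ensemble).
# Assumptions: the openai/CLIP package is installed; labels are placeholders.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

labels = ["beach", "forest", "office"]  # placeholder classes
text = clip.tokenize([f"a photo of a {c}" for c in labels]).to(device)
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)  # scaled image-text similarities
    probs = logits_per_image.softmax(dim=-1)

print(dict(zip(labels, probs[0].tolist())))
```

The ensemble the entry describes would aggregate predictions from several such classifiers; that aggregation step is omitted here.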
- Telling the What while Pointing the Where: Fine-grained Mouse Trace and Language Supervision for Improved Image Retrieval [60.24860627782486]
Fine-grained image retrieval often requires the ability to also express where in the image the content being sought is located.
In this paper, we describe an image retrieval setup where the user simultaneously describes an image using both spoken natural language (the "what") and mouse traces over an empty canvas (the "where").
Our model is capable of taking this spatial guidance into account, and provides more accurate retrieval results compared to text-only equivalent systems.
arXiv Detail & Related papers (2021-02-09T17:54:34Z)
- Intrinsic Image Captioning Evaluation [53.51379676690971]
We propose a learning-based metric for image captioning, which we call Intrinsic Image Captioning Evaluation (I2CE).
Experimental results show that our proposed method maintains robust performance and gives more flexible scores to candidate captions when encountering semantically similar expressions or less-aligned semantics.
arXiv Detail & Related papers (2020-12-14T08:36:05Z)
- Grafit: Learning fine-grained image representations with coarse labels [114.17782143848315]
This paper tackles the problem of learning a finer representation than the one provided by training labels.
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
arXiv Detail & Related papers (2020-11-25T19:06:26Z)
- Using Unlabeled Data for Increasing Low-Shot Classification Accuracy of Relevant and Open-Set Irrelevant Images [0.4110108749051655]
In search, exploration, and reconnaissance tasks performed with autonomous ground vehicles, an image classification capability is needed.
We present an open-set low-shot classifier that uses, during its training, a modest number of labeled images for each relevant class.
It can identify images from the relevant classes, determine when a candidate image is irrelevant, and further recognize categories of irrelevant images that were not included in the training.
arXiv Detail & Related papers (2020-10-01T23:11:07Z)
- Learning unbiased zero-shot semantic segmentation networks via transductive transfer [14.55508599873219]
We propose an easy-to-implement transductive approach to alleviate the prediction bias in zero-shot semantic segmentation.
Our method assumes both the source images with full pixel-level labels and unlabeled target images are available during training.
arXiv Detail & Related papers (2020-07-01T14:25:13Z)
- Deep semantic gaze embedding and scanpath comparison for expertise classification during OPT viewing [6.700983301090583]
We present a novel approach to gaze scanpath comparison that incorporates convolutional neural networks (CNNs).
Our approach was capable of distinguishing experts from novices with 93% accuracy while incorporating the image semantics.
arXiv Detail & Related papers (2020-03-31T07:00:59Z)
- Geometrically Mappable Image Features [85.81073893916414]
Vision-based localization of an agent in a map is an important problem in robotics and computer vision.
We propose a method that learns image features targeted for image-retrieval-based localization.
arXiv Detail & Related papers (2020-03-21T15:36:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.