Human Attention in Fine-grained Classification
- URL: http://arxiv.org/abs/2111.01628v1
- Date: Tue, 2 Nov 2021 14:41:11 GMT
- Title: Human Attention in Fine-grained Classification
- Authors: Yao Rong, Wenjia Xu, Zeynep Akata, Enkelejda Kasneci
- Abstract summary: We validate that human attention contains valuable information for decision-making processes such as fine-grained classification.
We propose Gaze Augmentation Training (GAT) and Knowledge Fusion Network (KFN) to integrate human gaze into classification models.
- Score: 38.71613202835921
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The way humans attend to, process and classify a given image has the
potential to vastly benefit the performance of deep learning models. Exploiting
where humans are focusing can rectify models when they are deviating from
essential features for correct decisions. To validate that human attention
contains valuable information for decision-making processes such as
fine-grained classification, we compare human attention and model explanations
in discovering important features. Towards this goal, we collect human gaze
data for the fine-grained classification dataset CUB and build a dataset named
CUB-GHA (Gaze-based Human Attention). Furthermore, we propose the Gaze
Augmentation Training (GAT) and Knowledge Fusion Network (KFN) to integrate
human gaze knowledge into classification models. We implement our proposals in
CUB-GHA and the recently released medical dataset CXR-Eye of chest X-ray
images, which includes gaze data collected from a radiologist. Our results
reveal that integrating human attention knowledge benefits classification
effectively, e.g. improving the baseline by 4.38% on CXR. Hence, our work
provides not only valuable insights into understanding human attention in
fine-grained classification, but also contributes to future research in
integrating human gaze with computer vision tasks. CUB-GHA and code are
available at https://github.com/yaorong0921/CUB-GHA.
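To make the fusion idea above concrete, here is a minimal sketch that combines features from the full image with features from a gaze-weighted view, assuming gaze fixations are available as a per-image heatmap. The two-branch design, the ResNet-18 backbones and the simple concatenation are illustrative assumptions; they are not the paper's exact KFN or GAT implementation.

```python
# Illustrative two-branch fusion of image features with gaze-derived attention.
# Assumption: each sample provides (image, gaze_heatmap, label); the paper's
# actual KFN/GAT may differ in backbone choice and fusion details.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class GazeFusionClassifier(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        # One branch sees the raw image, the other a gaze-weighted copy.
        self.image_branch = resnet18(weights=None)
        self.gaze_branch = resnet18(weights=None)
        feat_dim = self.image_branch.fc.in_features
        self.image_branch.fc = nn.Identity()
        self.gaze_branch.fc = nn.Identity()
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, image: torch.Tensor, gaze_heatmap: torch.Tensor) -> torch.Tensor:
        # gaze_heatmap: (B, 1, H, W) in [0, 1], highlighting fixated regions.
        gaze_view = image * gaze_heatmap
        fused = torch.cat([self.image_branch(image), self.gaze_branch(gaze_view)], dim=1)
        return self.classifier(fused)


if __name__ == "__main__":
    model = GazeFusionClassifier(num_classes=200)  # CUB has 200 bird classes
    img = torch.randn(2, 3, 224, 224)
    gaze = torch.rand(2, 1, 224, 224)
    print(model(img, gaze).shape)  # torch.Size([2, 200])
```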
Related papers
- Additional Look into GAN-based Augmentation for Deep Learning COVID-19 Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples.
We train StyleGAN2-ADA with both sets and then, after validating the quality of generated images, we use the trained GANs as one of the augmentation approaches in multi-class classification problems.
The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
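As a loose sketch of the augmentation pipeline summarized above (train a conditional StyleGAN2-ADA, then mix its samples into classifier training), the snippet below assumes a hypothetical `generate_synthetic(label)` wrapper around an already-trained generator; it illustrates the mixing step only, not the study's actual training and validation setup.

```python
# Sketch of mixing real and GAN-generated samples for classifier training.
# `generate_synthetic` is a hypothetical wrapper around a trained conditional
# StyleGAN2-ADA generator; it is assumed to return a (3, H, W) image tensor.
import random
from torch.utils.data import Dataset


class GanAugmentedDataset(Dataset):
    def __init__(self, real_images, real_labels, generate_synthetic, synth_ratio=0.5):
        self.real_images = real_images          # list/tensor of real images
        self.real_labels = real_labels          # matching class labels
        self.generate_synthetic = generate_synthetic
        self.synth_ratio = synth_ratio          # fraction of synthetic samples drawn

    def __len__(self):
        return len(self.real_images)

    def __getitem__(self, idx):
        label = self.real_labels[idx]
        if random.random() < self.synth_ratio:
            # Draw a synthetic image of the same class from the trained GAN.
            return self.generate_synthetic(label), label
        return self.real_images[idx], label
```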
- Multi-stages attention Breast cancer classification based on nonlinear spiking neural P neurons with autapses [10.16176106140093]
Downsampling in deep networks may lead to loss of information.
We propose a multi-stages attention architecture based on NSNP neurons with autapses.
It achieves a classification accuracy of 96.32% across all magnification cases, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2023-12-20T06:52:38Z)
- Language Knowledge-Assisted Representation Learning for Skeleton-Based Action Recognition [71.35205097460124]
How humans understand and recognize the actions of others is a complex neuroscientific problem.
The paper proposes LA-GCN, a graph convolutional network assisted by knowledge from large-scale language models (LLMs).
arXiv Detail & Related papers (2023-05-21T08:29:16Z)
- Privacy-Preserved Neural Graph Similarity Learning [99.78599103903777]
We propose a novel Privacy-Preserving neural Graph Matching network model, named PPGM, for graph similarity learning.
To prevent reconstruction attacks, the proposed model does not communicate node-level representations between devices.
To mitigate attacks on graph properties, obfuscated features that contain information from both vectors are communicated instead.
arXiv Detail & Related papers (2022-10-21T04:38:25Z)
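To illustrate only the communication constraint stated above (graph-level features leave a device in obfuscated form, node-level representations never do), the sketch below obfuscates a pooled graph embedding with a shared random orthogonal map before exchange. The function names and the orthogonal-map obfuscation are assumptions for this illustration, not PPGM's actual protocol.

```python
# Each device pools its node embeddings locally and only shares an obfuscated
# graph-level vector. The shared orthogonal map Q is an assumption made here
# (it preserves inner products), not PPGM's actual obfuscation scheme.
import torch
import torch.nn.functional as F


def make_shared_obfuscation(dim: int, seed: int = 0) -> torch.Tensor:
    g = torch.Generator().manual_seed(seed)       # both devices share the seed
    q, _ = torch.linalg.qr(torch.randn(dim, dim, generator=g))
    return q                                      # random orthogonal matrix


def obfuscated_graph_vector(node_embeddings: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    graph_vec = node_embeddings.mean(dim=0)       # node-level info stays on-device
    return graph_vec @ q                          # only this vector is communicated


if __name__ == "__main__":
    q = make_shared_obfuscation(dim=64)
    g1, g2 = torch.randn(10, 64), torch.randn(14, 64)   # node embeddings of two graphs
    sim = F.cosine_similarity(obfuscated_graph_vector(g1, q),
                              obfuscated_graph_vector(g2, q), dim=0)
    print(float(sim))  # matches the similarity of the un-obfuscated vectors
```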
- Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data [72.1187887376849]
The selective attention mechanism helps the human cognitive system focus on task-relevant visual clues while ignoring distractors.
We propose a framework to leverage gaze for medical image analysis tasks with small training data.
Our method is demonstrated to achieve superior performance on both 3D tumor segmentation and 2D chest X-ray classification tasks.
arXiv Detail & Related papers (2021-12-02T07:55:25Z)
- Goal-Oriented Gaze Estimation for Zero-Shot Learning [62.52340838817908]
We introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization.
We aim to predict actual human gaze locations to obtain the visual attention regions for recognizing a novel object guided by its attribute description.
This work suggests the promising benefits of collecting human gaze datasets and of automatic gaze estimation algorithms for high-level computer vision tasks.
arXiv Detail & Related papers (2021-03-05T02:14:57Z)
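In the spirit of the gaze-estimation idea above, the sketch below has a small head predict a gaze-like attention map from backbone features, uses that map to pool features for attribute prediction, and matches the predicted attributes against per-class attribute vectors for zero-shot recognition. The module names, shapes, and pooling scheme are assumptions, not the paper's actual GEM design.

```python
# Illustrative gaze-guided attribute predictor for zero-shot recognition.
# The gaze head, pooling scheme and attribute matching are assumptions made
# for this sketch; the paper's GEM module differs in its details.
import torch
import torch.nn as nn


class GazeGuidedAttributePredictor(nn.Module):
    def __init__(self, feat_channels: int, num_attributes: int):
        super().__init__()
        self.gaze_head = nn.Conv2d(feat_channels, 1, kernel_size=1)  # predicts a gaze map
        self.attr_head = nn.Linear(feat_channels, num_attributes)

    def forward(self, feat_map: torch.Tensor) -> torch.Tensor:
        # feat_map: (B, C, H, W) backbone features.
        gaze = torch.softmax(self.gaze_head(feat_map).flatten(2), dim=-1)  # (B, 1, H*W)
        feats = feat_map.flatten(2)                                        # (B, C, H*W)
        pooled = (feats * gaze).sum(dim=-1)                                # gaze-weighted pooling
        return self.attr_head(pooled)                                      # predicted attributes


def zero_shot_scores(pred_attrs: torch.Tensor, class_attrs: torch.Tensor) -> torch.Tensor:
    # Compare predicted attributes with per-class attribute vectors (num_classes, A).
    return pred_attrs @ class_attrs.t()
```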
- Human Activity Recognition Using Multichannel Convolutional Neural Network [0.0]
Human Activity Recognition (HAR) refers to the capacity of a machine to perceive human actions.
This paper describes a supervised learning method that can distinguish human actions based on data collected from practical human movements.
The model was tested on the UCI HAR dataset, which resulted in a 95.25% classification accuracy.
arXiv Detail & Related papers (2021-01-17T16:48:17Z)
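For the HAR entry above, a minimal multichannel 1D-CNN sketch on UCI-HAR-shaped input (9 inertial signal channels, 128-sample windows, 6 activity classes). The layer widths and kernel sizes are illustrative assumptions, not the architecture reported in the paper.

```python
# Minimal multichannel 1D CNN for UCI-HAR-style windows.
# Input shape (B, 9, 128): 9 inertial signal channels, 128 time steps, 6 classes.
# Layer widths are illustrative; the paper's exact architecture is not given here.
import torch
import torch.nn as nn


class MultichannelHARNet(nn.Module):
    def __init__(self, in_channels: int = 9, num_classes: int = 6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).squeeze(-1))


if __name__ == "__main__":
    windows = torch.randn(4, 9, 128)            # a batch of sensor windows
    print(MultichannelHARNet()(windows).shape)  # torch.Size([4, 6])
```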
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.