Implicit Saliency in Deep Neural Networks
- URL: http://arxiv.org/abs/2008.01874v1
- Date: Tue, 4 Aug 2020 23:14:24 GMT
- Title: Implicit Saliency in Deep Neural Networks
- Authors: Yutong Sun, Mohit Prabhushankar and Ghassan AlRegib
- Abstract summary: In this paper, we show that existing recognition and localization deep architectures are capable of predicting human visual saliency.
We calculate this implicit saliency using the expectancy-mismatch hypothesis in an unsupervised fashion.
Our experiments show that extracting saliency in this fashion provides performance comparable to state-of-the-art supervised algorithms.
- Score: 15.510581400494207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we show that existing recognition and localization deep
architectures, which have not been exposed to eye-tracking data or any saliency
datasets, are capable of predicting human visual saliency. We term this
implicit saliency in deep neural networks. We calculate this implicit saliency
using the expectancy-mismatch hypothesis in an unsupervised fashion. Our
experiments show that extracting saliency in this fashion provides performance
comparable to state-of-the-art supervised algorithms. Moreover, our method is
more robust than those algorithms when large noise is added to the input
images. We also show that semantic features contribute more than low-level
features to human visual saliency detection.
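As a rough illustration of the idea, the sketch below derives an unsupervised saliency map from a pretrained classifier by backpropagating an expectancy-mismatch signal to the input. The ResNet-50 backbone, the uniform "expected" distribution, and the KL mismatch loss are illustrative assumptions, not the authors' exact formulation.

```python
# Hypothetical sketch: gradient-based "implicit saliency" from a pretrained
# classifier, loosely following the expectancy-mismatch idea (no eye-tracking
# or saliency supervision). The choice of mismatch loss is an assumption.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("scene.jpg").convert("RGB")).unsqueeze(0)
img.requires_grad_(True)

logits = model(img)
# "Expected" distribution: uniform over classes, standing in for the
# observer's prior expectation. The mismatch is its divergence from the
# network's actual prediction.
expected = torch.full_like(logits, 1.0 / logits.shape[1])
mismatch = F.kl_div(F.log_softmax(logits, dim=1), expected,
                    reduction="batchmean")
mismatch.backward()

# Pixels whose perturbation most changes the mismatch are read out as salient.
saliency = img.grad.abs().max(dim=1)[0].squeeze(0)  # HxW map
saliency = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
```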
Related papers
- Exploring Geometry of Blind Spots in Vision Models [56.47644447201878]
We study the phenomenon of under-sensitivity in vision models such as CNNs and Transformers.
We propose a Level Set Traversal algorithm that iteratively explores regions of high confidence with respect to the input space.
We estimate the extent of these connected higher-dimensional regions over which the model maintains a high degree of confidence.
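A minimal sketch of one level-set traversal step, assuming the goal is to move an input toward a target image while holding the source-class confidence approximately constant; the projection rule and step size below are illustrative, not the paper's exact algorithm.

```python
# Hypothetical sketch of a level-set traversal step: move x toward x_target
# while removing the component of the step along the confidence gradient,
# so the source-class confidence stays (approximately) constant.
import torch
import torch.nn.functional as F

def level_set_step(model, x, x_target, source_class, lr=0.01):
    x = x.clone().detach().requires_grad_(True)
    conf = F.softmax(model(x), dim=1)[0, source_class]
    conf.backward()
    g = x.grad.flatten()
    d = (x_target - x).detach().flatten()            # desired direction
    d_tangent = d - (d @ g) / (g @ g + 1e-12) * g    # project onto level set
    return (x.detach().flatten() + lr * d_tangent).view_as(x)
```

Iterating such a step traces a high-confidence path between two images, probing the connected "blind spot" regions the paper describes.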
arXiv Detail & Related papers (2023-10-30T18:00:33Z)
- How deep convolutional neural networks lose spatial information with training [0.7328100870402177]
We show how stability to image diffeomorphisms is achieved by spatial pooling in the first half of the net, and by channel pooling in the second half.
We find that the increased sensitivity to noise is due to the perturbing noise piling up during pooling, after being rectified by ReLU units.
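A small diagnostic in this spirit, using a translation as a stand-in for a general diffeomorphism and matching the noise norm to the deformation norm; the specific metric is an assumption, not the paper's protocol.

```python
# Hypothetical diagnostic: compare how much a network's output moves under a
# small translation versus Gaussian noise of the same pixel-space norm.
import torch

def relative_sensitivity(model, x, shift=2):
    with torch.no_grad():
        y = model(x)
        x_shift = torch.roll(x, shifts=shift, dims=-1)  # small translation
        noise = torch.randn_like(x)
        noise *= (x_shift - x).norm() / noise.norm()    # match perturbation size
        d_diffeo = (model(x_shift) - y).norm()
        d_noise = (model(x + noise) - y).norm()
    return (d_diffeo / (d_noise + 1e-12)).item()
```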
arXiv Detail & Related papers (2022-10-04T10:21:03Z)
- Deep Semantic Statistics Matching (D2SM) Denoising Network [70.01091467628068]
We introduce the Deep Semantic Statistics Matching (D2SM) Denoising Network.
It exploits the semantic features of pretrained classification networks and implicitly matches the probabilistic distribution of clear images in the semantic feature space.
By learning to preserve the semantic distribution of denoised images, we empirically find our method significantly improves the denoising capabilities of networks.
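A simplified sketch of matching in a pretrained network's semantic feature space, assuming per-channel mean/std statistics as a stand-in for D2SM's full distribution matching; the VGG-16 feature extractor is an illustrative choice.

```python
# Hypothetical sketch: penalize the distance between feature statistics of a
# denoised image and a clean image in a pretrained network's semantic feature
# space. Mean/std matching is a simplification of full distribution matching.
import torch
from torchvision import models

features = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features[:16]
features.eval()
for p in features.parameters():
    p.requires_grad_(False)

def semantic_stats_loss(denoised, clean):
    f_d, f_c = features(denoised), features(clean)
    mu_d, mu_c = f_d.mean(dim=(2, 3)), f_c.mean(dim=(2, 3))
    sd_d, sd_c = f_d.std(dim=(2, 3)), f_c.std(dim=(2, 3))
    return (mu_d - mu_c).pow(2).mean() + (sd_d - sd_c).pow(2).mean()
```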
arXiv Detail & Related papers (2022-07-19T14:35:42Z)
- On the Sins of Image Synthesis Loss for Self-supervised Depth Estimation [60.780823530087446]
We show that improvements in image synthesis do not necessitate improvement in depth estimation.
We attribute this diverging phenomenon to aleatoric uncertainties, which originate from data.
This observed divergence has not been previously reported or studied in depth.
arXiv Detail & Related papers (2021-09-13T17:57:24Z)
- Understanding Character Recognition using Visual Explanations Derived from the Human Visual System and Deep Networks [6.734853055176694]
We examine the congruence, or lack thereof, between the information-gathering strategies of deep neural networks and the human visual system.
For correctly classified characters, the deep learning model attends to regions of the character similar to those on which humans fixate.
We propose to use the visual fixation maps obtained from the eye-tracking experiment as a supervisory input to align the model's focus on relevant character regions.
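One way such supervision might look, as a hedged sketch: align a gradient-based attention map with the human fixation map via a KL term. The attention proxy and the loss below are assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: auxiliary loss aligning a model's gradient-based
# attention with a human fixation map (assumed shape (1, 1, H, W)).
import torch
import torch.nn.functional as F

def fixation_alignment_loss(model, image, label, fixation_map):
    image = image.clone().requires_grad_(True)
    logit = model(image)[0, label]
    grad, = torch.autograd.grad(logit, image, create_graph=True)
    attn = grad.abs().mean(dim=1, keepdim=True)      # crude attention map
    attn = F.interpolate(attn, size=fixation_map.shape[-2:],
                         mode="bilinear", align_corners=False)
    attn = attn / (attn.sum() + 1e-8)                # normalize to probabilities
    fix = fixation_map / (fixation_map.sum() + 1e-8)
    return F.kl_div(attn.clamp_min(1e-8).log().flatten(1), fix.flatten(1),
                    reduction="batchmean")
```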
arXiv Detail & Related papers (2021-08-10T10:09:37Z)
- Deep Feature Tracker: A Novel Application for Deep Convolutional Neural Networks [0.0]
We propose a novel and unified deep learning-based approach that can learn how to track features reliably.
The proposed network, dubbed Deep-PT, consists of a tracker network that is a convolutional neural network performing cross-correlation.
The network is trained on multiple datasets due to the lack of a specialized dataset for feature tracking.
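The cross-correlation core of such a tracker can be sketched as follows, in the style of Siamese trackers: one backbone embeds both a template patch and a search window, and their correlation map peaks at the feature's new location. The tiny backbone here is a placeholder.

```python
# Hypothetical sketch of the cross-correlation core of a CNN feature tracker.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(                 # placeholder embedding network
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
)

def track(template, search):
    z = backbone(template)                # (1, C, hz, wz)
    x = backbone(search)                  # (1, C, hx, wx)
    score = F.conv2d(x, z)                # cross-correlation response map
    flat = score.flatten(2).argmax(dim=-1)
    w = score.shape[-1]
    return (flat // w).item(), (flat % w).item()  # peak (row, col)

template = torch.randn(1, 3, 16, 16)
search = torch.randn(1, 3, 64, 64)
row, col = track(template, search)
```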
arXiv Detail & Related papers (2021-07-30T23:24:29Z)
- Predicting Depth from Semantic Segmentation using Game Engine Dataset [0.0]
This thesis investigates the relation between object perception and depth estimation in convolutional neural networks.
We developed new network structures based on a simple depth estimation network that uses only a single image as input.
Results show that our novel structures can improve depth estimation performance by 52% in terms of relative distance error.
arXiv Detail & Related papers (2021-06-12T10:15:40Z)
- Leveraging Sparse Linear Layers for Debuggable Deep Networks [86.94586860037049]
We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks.
The resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks.
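In practice this amounts to fitting a regularized linear model on frozen deep features; the sketch below uses scikit-learn's L1-penalized logistic regression as a stand-in for the paper's sparse fit, with placeholder features.

```python
# Hypothetical sketch: fit a sparse (L1-regularized) linear classifier on
# frozen deep features; the few surviving weights per class point to the
# features driving each decision, which makes debugging tractable.
import numpy as np
from sklearn.linear_model import LogisticRegression

# deep_feats: (n_samples, n_features) penultimate-layer activations,
# labels: (n_samples,) class labels -- assumed precomputed elsewhere.
deep_feats = np.random.randn(1000, 512)   # placeholder features
labels = np.random.randint(0, 10, 1000)   # placeholder labels

clf = LogisticRegression(penalty="l1", solver="saga", C=0.05, max_iter=2000)
clf.fit(deep_feats, labels)

# Most coefficients are exactly zero; the nonzero ones name the deep
# features each class decision depends on.
nonzero_per_class = (clf.coef_ != 0).sum(axis=1)
print(nonzero_per_class)
```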
arXiv Detail & Related papers (2021-05-11T08:15:25Z)
- Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
We show that, by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, still using the self-supervised formulation and without relying on any additional sensors.
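One common form of such a prior is a known camera height above the ground plane, which fixes the global scale of an otherwise scale-ambiguous depth map; the recipe below illustrates the idea and is not necessarily this paper's method.

```python
# Hypothetical sketch: resolve the global scale of a self-supervised depth
# map using a known camera height. Ground pixels are assumed given (e.g.,
# from segmentation); fy and cy are camera intrinsics.
import numpy as np

def calibrate_scale(depth, ground_mask, fy, cy, camera_height_m):
    h, w = depth.shape
    v = np.broadcast_to(np.arange(h)[:, None], depth.shape)  # pixel row index
    # Back-project ground pixels to camera-frame heights (up to scale):
    # y = (v - cy) / fy * depth, with the camera y-axis pointing down.
    y = (v[ground_mask] - cy) / fy * depth[ground_mask]
    est_height = np.median(y)             # estimated camera height, unscaled
    return camera_height_m / est_height   # multiply depth by this factor
```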
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
- Interpretation of Deep Temporal Representations by Selective Visualization of Internally Activated Nodes [24.228613156037532]
We propose two new frameworks to visualize temporal representations learned from deep neural networks.
Our algorithm interprets the decisions of a temporal neural network by extracting its highly activated periods.
We characterize the corresponding sub-sequences by clustering and calculate the uncertainty between each suggested type and the actual data.
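A minimal sketch of the extraction-and-clustering step, assuming a simple activation threshold and k-means over fixed-length windows; both choices are illustrative.

```python
# Hypothetical sketch: extract the time spans where one hidden unit's
# activation exceeds a threshold, then cluster the corresponding input
# sub-sequences to characterize what the unit responds to.
import numpy as np
from sklearn.cluster import KMeans

def highly_activated_periods(activations, threshold):
    """Return (start, end) index pairs where activations > threshold."""
    above = activations > threshold
    edges = np.flatnonzero(np.diff(above.astype(int)))
    bounds = np.concatenate([[0], edges + 1, [len(above)]])
    return [(s, e) for s, e in zip(bounds[:-1], bounds[1:]) if above[s]]

acts = np.random.rand(500)                 # placeholder unit activations
series = np.random.rand(500)               # placeholder input sequence
periods = highly_activated_periods(acts, threshold=0.9)

# Fixed-length windows at each period's onset, then k-means over the windows.
L = 20
windows = [series[s:s + L] for s, _ in periods if s + L <= len(series)]
if len(windows) >= 3:
    clusters = KMeans(n_clusters=3, n_init=10).fit_predict(np.stack(windows))
```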
arXiv Detail & Related papers (2020-04-27T01:45:55Z)
- DeFeat-Net: General Monocular Depth via Simultaneous Unsupervised Representation Learning [65.94499390875046]
DeFeat-Net is an approach to simultaneously learn a cross-domain dense feature representation.
Our technique is able to outperform the current state-of-the-art with around 10% reduction in all error measures.
arXiv Detail & Related papers (2020-03-30T13:10:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.