Canonical Saliency Maps: Decoding Deep Face Models
- URL: http://arxiv.org/abs/2105.01386v1
- Date: Tue, 4 May 2021 09:42:56 GMT
- Authors: Thrupthi Ann John, Vineeth N Balasubramanian, C V Jawahar
- Abstract summary: We present 'Canonical Saliency Maps', a new method that highlights relevant facial areas by projecting saliency maps onto a canonical face model.
Our results show the usefulness of the proposed canonical saliency maps, which can be used on any deep face model regardless of the architecture.
- Score: 47.036036069156104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As Deep Neural Network models for face processing tasks approach human-like
performance, their deployment in critical applications such as law enforcement
and access control has seen an upswing, where any failure may have far-reaching
consequences. We need methods to build trust in deployed systems by making
their workings as transparent as possible. Existing visualization algorithms are
designed for object recognition and do not give insightful results when applied
to the face domain. In this work, we present 'Canonical Saliency Maps', a new
method that highlights relevant facial areas by projecting saliency maps onto a
canonical face model. We present two kinds of Canonical Saliency Maps:
image-level maps and model-level maps. Image-level maps highlight facial
features responsible for the decision made by a deep face model on a given
image, thus helping to understand how a DNN made a prediction on the image.
Model-level maps provide an understanding of what the entire DNN model focuses
on in each task and thus can be used to detect biases in the model. Our
qualitative and quantitative results show the usefulness of the proposed
canonical saliency maps, which can be used on any deep face model regardless of
the architecture.
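The abstract describes projecting per-image saliency maps onto a shared canonical face model, so that heatmaps from different images become directly comparable. As an illustration only (the paper's actual correspondence method is not detailed here), the sketch below warps a saliency map onto a canonical grid using a least-squares affine fit between detected facial landmarks and canonical landmarks; the function names, the affine assumption, and the nearest-neighbor sampling are all simplifications introduced for this example.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares affine map src -> dst; both arrays are N x 2 (x, y)."""
    n = src_pts.shape[0]
    A = np.hstack([src_pts, np.ones((n, 1))])      # N x 3 homogeneous coords
    M, *_ = np.linalg.lstsq(A, dst_pts, rcond=None)  # 3 x 2 affine matrix
    return M

def project_to_canonical(saliency, landmarks, canon_landmarks, canon_shape):
    """Warp an image-space saliency map onto the canonical face grid.

    Uses inverse mapping: for each canonical pixel, find its source pixel
    in the original image and copy the saliency value (nearest neighbor).
    """
    # Fit the canonical -> image transform so we can pull values back.
    M = estimate_affine(canon_landmarks, landmarks)
    H, W = canon_shape
    ys, xs = np.mgrid[0:H, 0:W]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)], axis=1)
    src = pts @ M                                  # (H*W) x 2 image coords
    sx = np.clip(np.round(src[:, 0]).astype(int), 0, saliency.shape[1] - 1)
    sy = np.clip(np.round(src[:, 1]).astype(int), 0, saliency.shape[0] - 1)
    return saliency[sy, sx].reshape(H, W)
```

Under this reading, a model-level map could then be obtained by aggregating (e.g., averaging) the projected image-level maps over a dataset, though the exact aggregation the authors use is an assumption here.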
Related papers
- SHIC: Shape-Image Correspondences with no Keypoint Supervision
Canonical surface mapping generalizes keypoint detection by assigning each pixel of an object to a corresponding point in a 3D template.
Popularised by DensePose for the analysis of humans, the concept has since been applied to more categories.
We introduce SHIC, a method to learn canonical maps without manual supervision that achieves better results than supervised methods for most categories.
arXiv Detail & Related papers (2024-07-26T17:58:59Z)
- DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model
We propose DiffMap, a novel approach specifically designed to model the structured priors of map segmentation masks.
By incorporating this technique, the performance of existing semantic segmentation methods can be significantly enhanced.
Our model demonstrates superior proficiency in generating results that more accurately reflect real-world map layouts.
arXiv Detail & Related papers (2024-05-03T11:16:27Z)
- Faceptor: A Generalist Model for Face Perception
Faceptor adopts a well-designed single-encoder dual-decoder architecture.
Introducing Layer-Attention into Faceptor enables the model to adaptively select features from the optimal layers to perform the desired tasks.
Our training framework can also be applied to auxiliary supervised learning, significantly improving performance in data-sparse tasks such as age estimation and expression recognition.
arXiv Detail & Related papers (2024-03-14T15:42:31Z)
- SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding
We introduce SNAP, a deep network that learns rich neural 2D maps from ground-level and overhead images.
We train our model to align neural maps estimated from different inputs, supervised only with camera poses over tens of millions of StreetView images.
SNAP can resolve the location of challenging image queries beyond the reach of traditional methods.
arXiv Detail & Related papers (2023-06-08T17:54:47Z)
- Deriving Explanation of Deep Visual Saliency Models
We develop a technique to derive explainable saliency models from their corresponding deep neural architecture based saliency models.
We consider two state-of-the-art deep saliency models, namely UNISAL and MSI-Net, for our interpretation.
We also build our own deep saliency model, a cross-concatenated multi-scale residual block based network (CMRNet), for saliency prediction.
arXiv Detail & Related papers (2021-09-08T12:22:32Z)
- Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition
We propose a CNN-based architecture enhanced with multiple branches formed by radial basis function (RBF) units.
RBF units capture local patterns shared by similar instances using an intermediate representation.
We show that it is the incorporation of local information that makes the proposed model competitive.
arXiv Detail & Related papers (2021-09-07T21:05:56Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency
Backpropagation image saliency aims to explain model predictions by estimating the model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.