Interpreting Face Inference Models using Hierarchical Network Dissection
- URL: http://arxiv.org/abs/2108.10360v1
- Date: Mon, 23 Aug 2021 18:52:47 GMT
- Title: Interpreting Face Inference Models using Hierarchical Network Dissection
- Authors: Divyang Teotia, Agata Lapedriza, Sarah Ostadabbas
- Abstract summary: Hierarchical Network Dissection is a pipeline to interpret the internal representation of face-centric inference models.
Our pipeline is inspired by Network Dissection, a popular interpretability method for object-centric and scene-centric models.
- Score: 10.852613235927958
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper presents Hierarchical Network Dissection, a general pipeline to
interpret the internal representation of face-centric inference models. Using a
probabilistic formulation, Hierarchical Network Dissection pairs units of the
model with concepts in our "Face Dictionary" (a collection of facial concepts
with corresponding sample images). Our pipeline is inspired by Network
Dissection, a popular interpretability method for object-centric and
scene-centric models. However, our formulation can handle two
important challenges of face-centric models that Network Dissection cannot
address: (1) spatial overlap of concepts: there are different facial concepts
that simultaneously occur in the same region of the image, like "nose" (facial
part) and "pointy nose" (facial attribute); and (2) global concepts: there are
units with affinity to concepts that do not refer to specific locations of the
face (e.g. apparent age). To validate the effectiveness of our unit-concept
pairing formulation, we first conduct controlled experiments on biased data.
These experiments illustrate how Hierarchical Network Dissection can be used to
discover bias in the training data. Then, we dissect different face-centric
inference models trained on widely-used facial datasets. The results show
that models trained for different tasks have different internal representations.
Furthermore, the interpretability results reveal some biases in the training
data and some interesting characteristics of the face-centric inference tasks.
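For intuition, the unit-concept pairing at the core of the pipeline can be sketched in the style of Network Dissection, which the paper builds on: binarize a unit's activation maps at a high quantile and pair the unit with the concept whose segmentation masks it overlaps most. The sketch below is a minimal illustration under those assumptions, not the paper's probabilistic hierarchical formulation; the function name and threshold values are ours (the quantile and IoU cutoff follow common Network Dissection conventions).

```python
import numpy as np

def dissect_unit(activations, concept_masks, quantile=0.995, iou_threshold=0.04):
    """Pair one convolutional unit with its best-matching local concept.

    activations   : (N, H, W) array of the unit's feature maps over N images.
    concept_masks : dict mapping a concept name to (N, H, W) binary masks.
    """
    # Keep only the unit's most strongly activated locations.
    threshold = np.quantile(activations, quantile)
    unit_mask = activations > threshold

    # Score every candidate concept by intersection-over-union.
    best_concept, best_iou = None, 0.0
    for name, masks in concept_masks.items():
        intersection = np.logical_and(unit_mask, masks).sum()
        union = np.logical_or(unit_mask, masks).sum()
        iou = intersection / union if union > 0 else 0.0
        if iou > best_iou:
            best_concept, best_iou = name, iou

    # A unit counts as interpretable only if its best IoU clears the cutoff.
    return (best_concept, best_iou) if best_iou >= iou_threshold else (None, best_iou)
```

Note that global concepts such as apparent age have no segmentation masks, which is exactly the case this IoU scheme cannot handle and the paper's probabilistic formulation is designed to address.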
Related papers
- Representational Similarity via Interpretable Visual Concepts [27.72186215265676]
We introduce an interpretable representational similarity method to compare two networks.
We show that some aspects of model differences can be attributed to unique concepts discovered by one model that are not well represented in the other.
arXiv Detail & Related papers (2025-03-19T21:21:45Z)
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- Appearance Debiased Gaze Estimation via Stochastic Subject-Wise Adversarial Learning [33.55397868171977]
Appearance-based gaze estimation has been attracting attention in computer vision, and remarkable improvements have been achieved using various deep learning techniques.
We propose a novel framework, subject-wise gaZE learning (SAZE), which trains a network to generalize across the appearances of different subjects.
Our experimental results verify the robustness of the method in that it yields state-of-the-art performance, achieving angular errors of 3.89 and 4.42 degrees on the MPIIGaze and EyeDiap datasets, respectively.
arXiv Detail & Related papers (2024-01-25T00:23:21Z)
- Evaluating alignment between humans and neural network representations in image-based learning tasks [5.657101730705275]
We tested how well the representations of 86 pretrained neural network models mapped to human learning trajectories.
We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation.
In conclusion, pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks.
arXiv Detail & Related papers (2023-06-15T08:18:29Z)
- FaceTopoNet: Facial Expression Recognition using Face Topology Learning [23.139108533273777]
We propose an end-to-end deep model for facial expression recognition, which is capable of learning an effective tree topology of the face.
Our model then traverses the learned tree to generate a sequence, which is then used to form an embedding to feed a sequential learner.
We perform extensive experiments on four large-scale in-the-wild facial expression datasets to evaluate our approach.
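As a rough picture of the traversal step in the FaceTopoNet entry above, a learned face tree (e.g., over facial landmarks) can be linearized into a node sequence by a depth-first walk; that sequence is what a sequential learner then consumes. This is a generic sketch under our own naming, not the authors' implementation.

```python
def tree_to_sequence(adjacency, root=0):
    """Linearize a learned face tree into a node sequence via depth-first traversal.

    adjacency : dict mapping a node id to the list of its neighbor ids.
    root      : node at which the traversal starts (hypothetical choice).
    """
    sequence, stack, seen = [], [root], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        sequence.append(node)
        # Push neighbors in reverse-sorted order so lower ids are visited first.
        stack.extend(sorted(adjacency[node], reverse=True))
    return sequence
```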
arXiv Detail & Related papers (2022-09-13T22:02:54Z)
- Drawing out of Distribution with Neuro-Symbolic Generative Models [49.79371715591122]
Drawing out of Distribution (DooD) is a neuro-symbolic generative model of stroke-based drawing.
DooD operates directly on images and requires neither supervision nor expensive test-time inference.
We evaluate DooD on its ability to generalise across both data and tasks.
arXiv Detail & Related papers (2022-06-03T21:40:22Z)
- Unsupervised learning of features and object boundaries from local prediction [0.0]
We introduce a layer of feature maps with a pairwise Markov random field model in which each factor is paired with an additional binary variable, which switches the factor on or off.
We can learn both the features and the parameters of the Markov random field factors from images without further supervision signals.
We show that computing predictions across space aids both segmentation and feature learning, and models trained to optimize these predictions show similarities to the human visual system.
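To make the switch idea above concrete, the sketch below scores a feature map under a pairwise MRF whose smoothness factors can be individually turned off, paying a constant prior cost per disabled factor (the off-switches then trace boundaries). This is our hypothetical reading with fixed squared-difference factors; in the paper both the features and the factor parameters are learned.

```python
import numpy as np

def switched_pairwise_energy(features, h_on, v_on, coupling=1.0, switch_cost=0.1):
    """Energy of a pairwise MRF over a feature map with per-factor on/off switches.

    features : (H, W, C) feature map.
    h_on     : (H, W-1) boolean array; True keeps the horizontal factor active.
    v_on     : (H-1, W) boolean array; True keeps the vertical factor active.
    """
    # Squared differences between horizontally / vertically adjacent features.
    dh = ((features[:, 1:] - features[:, :-1]) ** 2).sum(axis=-1)
    dv = ((features[1:, :] - features[:-1, :]) ** 2).sum(axis=-1)

    # Smoothness cost applies only where the switch is on; each switched-off
    # factor instead pays a constant prior cost.
    energy = coupling * ((dh * h_on).sum() + (dv * v_on).sum())
    energy += switch_cost * ((~h_on).sum() + (~v_on).sum())
    return energy
```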
arXiv Detail & Related papers (2022-05-27T18:54:10Z)
- Expression-preserving face frontalization improves visually assisted speech processing [35.647888055229956]
The main contribution of this paper is a frontalization methodology that preserves non-rigid facial deformations.
We show that the method, when incorporated into deep learning pipelines, improves word recognition and speech intelligibility scores by a considerable margin.
arXiv Detail & Related papers (2022-04-06T13:22:24Z)
- Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition [80.35852245488043]
We propose a CNN based architecture enhanced with multiple branches formed by radial basis function (RBF) units.
RBF units capture local patterns shared by similar instances using an intermediate representation.
We show that it is the incorporation of local information that makes the proposed model competitive.
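The RBF units mentioned above can be pictured as Gaussian kernels around learned prototype centers in an intermediate feature space, so that similar instances light up the same units. The sketch below is the textbook formulation; the names and the single shared width `gamma` are our simplifications, not the paper's exact parameterization.

```python
import numpy as np

def rbf_units(x, centers, gamma=1.0):
    """Activations of a bank of radial basis function units:
    phi_k(x) = exp(-gamma * ||x - c_k||^2).

    x       : (D,) intermediate feature vector for one instance.
    centers : (K, D) learned prototype centers.
    """
    # Squared Euclidean distance from x to every center, then a Gaussian kernel.
    d2 = ((centers - x) ** 2).sum(axis=1)
    return np.exp(-gamma * d2)
```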
arXiv Detail & Related papers (2021-09-07T21:05:56Z)
- The FaceChannel: A Fast & Furious Deep Neural Network for Facial Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a lightweight neural network that has far fewer parameters than common deep neural networks.
We demonstrate how our model achieves a comparable, if not better, performance to the current state-of-the-art in FER.
arXiv Detail & Related papers (2020-09-15T09:25:37Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
- InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)
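The linear-subspace and projection ideas in the InterFaceGAN entry above reduce to two small operations: moving a latent code along a semantic boundary's unit normal, and conditioning one boundary on another by projecting out their shared component. The sketch below captures that idea; variable names are ours and it should be read as an illustration rather than the authors' code.

```python
import numpy as np

def edit_latent(z, boundary, alpha):
    """Move latent code z along a semantic boundary's unit normal by step alpha."""
    n = boundary / np.linalg.norm(boundary)
    return z + alpha * n

def condition_boundary(primary, conditioned):
    """Disentangle two semantics via subspace projection: remove from `primary`
    its component along `conditioned`, so edits along the result leave the
    conditioned semantic (approximately) unchanged."""
    n1 = primary / np.linalg.norm(primary)
    n2 = conditioned / np.linalg.norm(conditioned)
    projected = n1 - (n1 @ n2) * n2
    return projected / np.linalg.norm(projected)
```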
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.