Related papers: DISentangled Counterfactual Visual interpretER (DISCOVER) generalizes to natural images

DISentangled Counterfactual Visual interpretER (DISCOVER) generalizes to natural images

URL: http://arxiv.org/abs/2406.15918v1
Date: Sat, 22 Jun 2024 19:05:50 GMT
Title: DISentangled Counterfactual Visual interpretER (DISCOVER) generalizes to natural images
Authors: Oded Rotem, Assaf Zaritsky,
Abstract summary: We show that DISentangled COunterfactual Visual interpretER (DISCOVER) can be applied to the domain of natural images. First, DISCOVER visually interpreted the nose size, the muzzle area, and the face size as semantic discriminative visual traits discriminating between facial images of dogs versus cats. Second, DISCOVER visually interpreted the cheeks and jawline, eyebrows and hair, and the eyes, as discriminative facial characteristics.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We recently presented DISentangled COunterfactual Visual interpretER (DISCOVER), a method toward systematic visual interpretability of image-based classification models and demonstrated its applicability to two biomedical domains. Here we demonstrate that DISCOVER can be applied to the domain of natural images. First, DISCOVER visually interpreted the nose size, the muzzle area, and the face size as semantic discriminative visual traits discriminating between facial images of dogs versus cats. Second, DISCOVER visually interpreted the cheeks and jawline, eyebrows and hair, and the eyes, as discriminative facial characteristics. These successful visual interpretations across two natural images domains indicate that DISCOVER is a generalized interpretability method.

Related papers

When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability. We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks. Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z)
Knowledge-Enhanced Facial Expression Recognition with Emotional-to-Neutral Transformation [66.53435569574135]
Existing facial expression recognition methods typically fine-tune a pre-trained visual encoder using discrete labels. We observe that the rich knowledge in text embeddings, generated by vision-language models, is a promising alternative for learning discriminative facial expression representations. We propose a novel knowledge-enhanced FER method with an emotional-to-neutral transformation.
arXiv Detail & Related papers (2024-09-13T07:28:57Z)
How Do You Perceive My Face? Recognizing Facial Expressions in Multi-Modal Context by Modeling Mental Representations [5.895694050664867]
We introduce a novel approach for facial expression classification that goes beyond simple classification tasks. Our model accurately classifies a perceived face and synthesizes the corresponding mental representation perceived by a human when observing a face in context. We evaluate synthesized expressions in a human study, showing that our model effectively produces approximations of human mental representations.
arXiv Detail & Related papers (2024-09-04T09:32:40Z)
Multi-Domain Norm-referenced Encoding Enables Data Efficient Transfer Learning of Facial Expression Recognition [62.997667081978825]
We propose a biologically-inspired mechanism for transfer learning in facial expression recognition. Our proposed architecture provides an explanation for how the human brain might innately recognize facial expressions on varying head shapes. Our model achieves a classification accuracy of 92.15% on the FERG dataset with extreme data efficiency.
arXiv Detail & Related papers (2023-04-05T09:06:30Z)
A domain adaptive deep learning solution for scanpath prediction of paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings. We introduce a new approach to predicting human visual attention, which impacts several cognitive functions for humans. The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z)
Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises [7.689542442882423]
We designed a dual-stream vision model inspired by the human brain. This model features retina-like input layers and includes two streams: one determining the next point of focus (the fixation), while the other interprets the visuals surrounding the fixation. We evaluated this model against various benchmarks in terms of object recognition, gaze behavior and adversarial robustness.
arXiv Detail & Related papers (2022-06-15T03:44:42Z)
Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for its functional analogue in the brain, the ventral stream in visual cortex. Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex. We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models. We first find that GANs learn various semantics in some linear subspaces of the latent space. We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.