Related papers: Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention

URL: http://arxiv.org/abs/2204.04601v1
Date: Sun, 10 Apr 2022 04:57:56 GMT
Title: Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention
Authors: Yu Yang, Seungbae Kim, Jungseock Joo
Abstract summary: We propose a framework to teach any existing convolutional neural network to generate text descriptions about its own latent representations at the filter level. We show that our method can generate novel descriptions for learned filters beyond the set of categories defined in the training dataset. We also demonstrate a novel application of our method for unsupervised dataset bias analysis.
Score: 7.237370981736913
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Interpretability is an important property for visual models as it helps researchers and users understand the internal mechanism of a complex model. However, generating semantic explanations about the learned representation is challenging without direct supervision to produce such explanations. We propose a general framework, Latent Visual Semantic Explainer (LaViSE), to teach any existing convolutional neural network to generate text descriptions about its own latent representations at the filter level. Our method constructs a mapping between the visual and semantic spaces from the images and category names of generic image datasets. It then transfers the mapping to the target domain, which has no semantic labels. The proposed framework employs a modular structure and makes it possible to analyze any trained network, whether or not its original training data is available. We show that our method can generate novel descriptions for learned filters beyond the set of categories defined in the training dataset, and we perform an extensive evaluation on multiple datasets. We also demonstrate a novel application of our method for unsupervised dataset bias analysis, which allows us to automatically discover hidden biases in datasets or compare different subsets without using additional labels. The dataset and code are made public to facilitate further research.
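
Illustrative sketch (not the authors' code): the abstract describes mapping filter-level visual features into a semantic space and reading off text descriptions. One way to picture the lookup step, assuming a linear visual-to-semantic map and cosine-similarity ranking over word embeddings, is shown below. The function names, the projection matrix, and the toy embeddings are all hypothetical and chosen only for illustration; the paper's actual architecture and training procedure may differ.

```python
import math


def project(weights, feature):
    """Apply a linear map (one row per semantic dimension) to a visual feature."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def describe_filter(feature, weights, vocabulary, top_k=2):
    """Rank vocabulary words by similarity to the projected filter feature."""
    semantic = project(weights, feature)
    ranked = sorted(
        vocabulary,
        key=lambda word: cosine(semantic, vocabulary[word]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy, made-up data: a 3-d visual feature and 2-d word embeddings.
# The vocabulary can be larger than the set of training categories,
# which is how words outside that set could be returned.
vocabulary = {
    "dog": [1.0, 0.0],
    "puppy": [0.9, 0.1],
    "car": [0.0, 1.0],
}
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # assumed already-learned map
filter_feature = [0.8, 0.1, 0.5]  # hypothetical pooled activation of one filter

print(describe_filter(filter_feature, weights, vocabulary))
```

In this sketch the description comes from a nearest-neighbor search in the word-embedding space rather than from a fixed classifier head, so the candidate words are limited only by the embedding vocabulary.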

Related papers

Understanding Bias in Large-Scale Visual Datasets [5.042580324425314]
We propose a framework to identify the unique visual attributes distinguishing large-scale visual datasets. Our approach applies various transformations to extract semantic, structural, boundary, color, and frequency information. We generate detailed, open-ended descriptions of each dataset's characteristics.
arXiv Detail & Related papers (2024-12-02T18:56:52Z)
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data. We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation. Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
Linking in Style: Understanding learned features in deep learning models [0.0]
Convolutional neural networks (CNNs) learn abstract features to perform object classification. We propose an automatic method to visualize and systematically analyze learned features in CNNs.
arXiv Detail & Related papers (2024-09-25T12:28:48Z)
Open-Vocabulary Camouflaged Object Segmentation [66.94945066779988]
We introduce a new task, open-vocabulary camouflaged object segmentation (OVCOS). We construct a large-scale complex scene dataset (OVCamo) containing 11,483 hand-selected images with fine annotations and corresponding object classes. By integrating the guidance of class semantic knowledge and the supplement of visual structure cues from the edge and depth information, the proposed method can efficiently capture camouflaged objects.
arXiv Detail & Related papers (2023-11-19T06:00:39Z)
TAX: Tendency-and-Assignment Explainer for Semantic Segmentation with Multi-Annotators [31.36818611460614]
Tendency-and-Assignment Explainer (TAX) is designed to offer interpretability at the annotator and assignment levels. We show that TAX can be applied to state-of-the-art network architectures with comparable performance.
arXiv Detail & Related papers (2023-02-19T12:40:22Z)
Extracting Semantic Knowledge from GANs with Unsupervised Learning [65.32631025780631]
Generative Adversarial Networks (GANs) encode semantics in feature maps in a linearly separable form. We propose a novel clustering algorithm, named KLiSH, which leverages the linear separability to cluster GAN's features. KLiSH succeeds in extracting fine-grained semantics of GANs trained on datasets of various objects.
arXiv Detail & Related papers (2022-11-30T03:18:16Z)
The SVD of Convolutional Weights: A CNN Interpretability Framework [3.5783190448496343]
We propose a framework against which interpretability methods might be applied using hypergraphs to model class separation. Rather than looking to the activations to explain the network, we use the singular vectors with the greatest corresponding singular values for each linear layer to identify those features most important to the network.
arXiv Detail & Related papers (2022-08-14T18:23:02Z)
CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance. In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules. Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
Latent Feature Representation via Unsupervised Learning for Pattern Discovery in Massive Electron Microscopy Image Volumes [4.278591555984395]
We give an unsupervised deep learning approach to learning a latent representation that captures semantic similarity in the data set. We demonstrate the utility of our method applied to nano-scale electron microscopy data, where even relatively small portions of animal brains can require terabytes of image data.
arXiv Detail & Related papers (2020-12-22T17:14:19Z)
Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks. First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts. Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.