Making Sense of CNNs: Interpreting Deep Representations & Their
Invariances with INNs
- URL: http://arxiv.org/abs/2008.01777v1
- Date: Tue, 4 Aug 2020 19:27:46 GMT
- Title: Making Sense of CNNs: Interpreting Deep Representations & Their
Invariances with INNs
- Authors: Robin Rombach, Patrick Esser, Björn Ommer
- Abstract summary: We present an approach based on invertible neural networks (INNs) that (i) recovers the task-specific, learned invariances by disentangling the remaining factor of variation in the data and that (ii) invertibly transforms these invariances combined with the model representation into an equally expressive one with accessible semantic concepts.
Our invertible approach significantly extends the abilities to understand black box models by enabling post-hoc interpretations of state-of-the-art networks without compromising their performance.
- Score: 19.398202091883366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To tackle increasingly complex tasks, it has become an essential ability of
neural networks to learn abstract representations. These task-specific
representations and, particularly, the invariances they capture turn neural
networks into black box models that lack interpretability. To open such a black
box, it is, therefore, crucial to uncover the different semantic concepts a
model has learned as well as those that it has learned to be invariant to. We
present an approach based on invertible neural networks (INNs) that (i) recovers the task-specific, learned
invariances by disentangling the remaining factor of variation in the data and
that (ii) invertibly transforms these recovered invariances combined with the
model representation into an equally expressive one with accessible semantic
concepts. As a consequence, neural network representations become
understandable by providing the means to (i) expose their semantic meaning,
(ii) semantically modify a representation, and (iii) visualize individual
learned semantic concepts and invariances. Our invertible approach
significantly extends the abilities to understand black box models by enabling
post-hoc interpretations of state-of-the-art networks without compromising
their performance. Our implementation is available at
https://compvis.github.io/invariances/ .
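Below is a minimal sketch of the central mechanism as we read the abstract, written in PyTorch. All names, layer choices, and dimensions are illustrative assumptions, not the authors' implementation: a small invertible network built from affine coupling layers maps the concatenation of a frozen model representation z and a recovered invariance code v to a new representation that can be inspected, edited, and decoded back without information loss.

```python
# Hedged sketch (assumed architecture, not the authors' code): an invertible
# network of affine coupling layers translates [z, v] -- a frozen model
# representation z plus a recovered invariance code v -- into a representation
# whose factors can be inspected or edited, then decoded back exactly.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One invertible layer: half the dims parameterize an affine map of the rest."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(x1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)              # bounded scales stay invertible
        return torch.cat([x1, x2 * torch.exp(log_s) + t], dim=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(y1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        return torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=1)

class InvertibleTranslator(nn.Module):
    def __init__(self, dim, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(AffineCoupling(dim) for _ in range(n_layers))
        self.perms = [torch.randperm(dim) for _ in range(n_layers)]  # mix dims

    def forward(self, x):
        for layer, p in zip(self.layers, self.perms):
            x = layer(x[:, p])
        return x

    def inverse(self, y):
        for layer, p in zip(reversed(self.layers), reversed(self.perms)):
            y = layer.inverse(y)[:, torch.argsort(p)]
        return y

z = torch.randn(8, 96)                      # features of the frozen classifier
v = torch.randn(8, 32)                      # recovered invariance code
flow = InvertibleTranslator(dim=128)
z_tilde = flow(torch.cat([z, v], dim=1))    # "semantically accessible" factors
roundtrip = flow.inverse(z_tilde)
print(torch.allclose(roundtrip, torch.cat([z, v], dim=1), atol=1e-4))  # True
```

The exact round trip (up to floating-point error) is the property that makes post-hoc interpretation possible without retraining or altering the analyzed network.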
Related papers
- Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales [54.78115855552886]
We show how to construct over-complete invariants with a Convolutional Neural Network (CNN)-like hierarchical architecture.
With the over-completeness, discriminative features w.r.t. the task can be adaptively formed in a Neural Architecture Search (NAS)-like manner.
For robust and interpretable vision tasks at larger scales, the hierarchical invariant representation can be considered an effective alternative to traditional CNNs and invariants.
arXiv Detail & Related papers (2024-02-23T16:50:07Z)
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
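For context, a hedged toy of the object being manipulated: Feature Visualization in its plainest form is gradient ascent on the input to maximize one unit's activation. The tiny random network and the unit index below are made up for illustration.

```python
# Context only, an illustrative toy: optimize an input image so that a chosen
# unit fires strongly. Network and unit index are arbitrary placeholders.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
x = torch.zeros(1, 3, 32, 32, requires_grad=True)   # the visualization canvas
opt = torch.optim.Adam([x], lr=0.1)
for _ in range(100):
    opt.zero_grad()
    (-net(x)[0, 3]).backward()    # ascend on unit 3's activation
    opt.step()
print(net(x)[0, 3].item())        # x is now an input that excites unit 3
```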
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Labeling Neural Representations with Inverse Recognition [25.867702786273586]
Inverse Recognition (INVERT) is a scalable approach for connecting learned representations with human-understandable concepts.
In contrast to prior work, INVERT handles diverse types of neurons, has lower computational cost, and does not rely on the availability of segmentation masks.
We demonstrate the applicability of INVERT in various scenarios, including the identification of representations affected by spurious correlations.
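The summary does not give INVERT's exact scoring rule; as a hedged sketch of the general recipe of labeling neurons from image-level annotations alone (no segmentation masks), one can rank candidate concepts by how well a neuron's activation separates images with and without each concept, e.g. via ROC-AUC. All data below is synthetic.

```python
# Hedged sketch (assumed mechanism, not necessarily INVERT itself): label a
# neuron by the concept whose presence its activation separates best,
# measured with ROC-AUC over a probing set with image-level labels only.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
acts = rng.normal(size=1000)                       # one neuron's activations
concepts = {
    "striped": rng.integers(0, 2, 1000),           # unrelated concept
    "dog": (acts + rng.normal(0.0, 1.0, 1000) > 0.5).astype(int),  # related
}

def label_neuron(activations, concept_labels):
    """Concepts sorted by how well this neuron detects them (AUC)."""
    scores = {c: roc_auc_score(y, activations) for c, y in concept_labels.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(label_neuron(acts, concepts))   # "dog" should rank first
```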
arXiv Detail & Related papers (2023-11-22T18:55:25Z)
- On the Transition from Neural Representation to Symbolic Knowledge [2.2528422603742304]
We propose a Neural-Symbolic Transitional Dictionary Learning (TDL) framework that employs an EM algorithm to learn a transitional representation of data.
We implement the framework with a diffusion model by regarding the decomposition of the input as a cooperative game.
We additionally use RL, enabled by the Markov property of diffusion models, to further tune the learned prototypes.
arXiv Detail & Related papers (2023-08-03T19:29:35Z)
- SO(2) and O(2) Equivariance in Image Recognition with Bessel-Convolutional Neural Networks [63.24965775030674]
This work presents the development of Bessel-convolutional neural networks (B-CNNs).
B-CNNs exploit a particular decomposition based on Bessel functions to modify the key operation between images and filters.
A study is carried out to assess the performance of B-CNNs compared to other methods.
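As a loose illustration of why a Bessel decomposition helps with SO(2) invariance (an assumed form, not the actual B-CNN operator): projecting a patch onto harmonics J_m(kr)·e^{imθ} and keeping only the coefficient moduli discards the rotation phase.

```python
# Loose illustration: rotating the patch only multiplies each Bessel-harmonic
# coefficient by a phase, so coefficient moduli are SO(2)-invariant.
import numpy as np
from scipy.special import jv
from scipy.ndimage import rotate

def bessel_descriptor(patch, orders=(0, 1, 2), k=10.0):
    n = patch.shape[0]
    y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
    r, theta = np.hypot(x, y), np.arctan2(y, x)
    return np.array([abs(np.sum(patch * jv(m, k * r) * np.exp(1j * m * theta)))
                     for m in orders])              # modulus drops the phase

patch = np.random.default_rng(1).random((33, 33))
print(bessel_descriptor(patch))
print(bessel_descriptor(rotate(patch, 90, reshape=False)))  # ~ identical
```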
arXiv Detail & Related papers (2023-04-18T18:06:35Z)
- Invariant Causal Mechanisms through Distribution Matching [86.07327840293894]
In this work we provide a causal perspective and a new algorithm for learning invariant representations.
Empirically, we show that this algorithm works well on a diverse set of tasks; in particular, we observe state-of-the-art performance on domain generalization.
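The summary leaves the algorithm unspecified; shown here purely as a hedged stand-in (an RBF-kernel MMD penalty, not necessarily the paper's objective), one standard way to match a representation's distribution across domains looks like this:

```python
# Hedged stand-in: penalize the (biased) squared MMD between per-domain
# feature batches, alongside the usual task loss, to encourage invariance.
import torch
import torch.nn as nn
import torch.nn.functional as F

def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between two feature batches."""
    k = lambda a, b: torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma**2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

phi = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 16))  # encoder
head = nn.Linear(16, 2)                                               # task head
opt = torch.optim.Adam(list(phi.parameters()) + list(head.parameters()), lr=1e-3)

xa, ya = torch.randn(64, 20), torch.randint(0, 2, (64,))        # domain A
xb, yb = torch.randn(64, 20) + 0.5, torch.randint(0, 2, (64,))  # domain B

za, zb = phi(xa), phi(xb)
task = F.cross_entropy(head(za), ya) + F.cross_entropy(head(zb), yb)
loss = task + 10.0 * rbf_mmd2(za, zb)   # invariance: match feature dists
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```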
arXiv Detail & Related papers (2022-06-23T12:06:54Z)
- Fair Interpretable Representation Learning with Correction Vectors [60.0806628713968]
We propose a new framework for fair representation learning that is centered around the learning of "correction vectors".
We show experimentally that several fair representation learning models constrained in such a way do not exhibit losses in ranking or classification performance.
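Guessing only at the general shape of the idea (the paper's exact model may differ): a correction vector can be as simple as one learned additive term per protected group, which keeps the fairness adjustment explicit and inspectable.

```python
# Hedged guess at the general shape: one learned additive "correction vector"
# per protected group, applied to the features, so the fairness adjustment is
# an explicit, readable quantity rather than an opaque transformation.
import torch
import torch.nn as nn

class CorrectedEncoder(nn.Module):
    def __init__(self, in_dim=10, feat_dim=8, n_groups=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.correction = nn.Embedding(n_groups, feat_dim)  # one vector/group

    def forward(self, x, group):
        return self.enc(x) + self.correction(group)

model = CorrectedEncoder()
z = model(torch.randn(4, 10), torch.tensor([0, 1, 0, 1]))
print(model.correction.weight)   # the corrections themselves, readable as-is
```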
arXiv Detail & Related papers (2022-02-07T11:19:23Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end to end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to a new task in a sample-efficient manner.
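A toy rendering of the routing idea only, far simpler than the actual architecture (all sizes are made up): a learned gate softly dispatches each input among a set of small "function" modules, trained end to end.

```python
# Toy sketch of soft routing (illustrative, not the Neural Interpreters
# architecture): a gate scores the function modules per input and the output
# is the gate-weighted mixture of their results.
import torch
import torch.nn as nn

class SoftRouter(nn.Module):
    def __init__(self, dim=16, n_funcs=4):
        super().__init__()
        self.funcs = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(n_funcs)
        )
        self.gate = nn.Linear(dim, n_funcs)       # routing scores per input

    def forward(self, x):
        w = torch.softmax(self.gate(x), dim=-1)             # (batch, n_funcs)
        outs = torch.stack([f(x) for f in self.funcs], dim=-1)
        return (outs * w.unsqueeze(1)).sum(dim=-1)          # weighted dispatch

print(SoftRouter()(torch.randn(8, 16)).shape)   # torch.Size([8, 16])
```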
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond [2.9005223064604078]
Recent Transformer-based Language Models have proven capable of producing contextual word representations that reliably convey sense-specific information.
We introduce a more principled approach to leveraging information from all layers of neural language models (NLMs), informed by a probing analysis of 14 NLM variants.
We also emphasize the versatility of these sense embeddings, in contrast to task-specific models, by applying them to several sense-related tasks besides WSD.
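As a hedged illustration of "leveraging all layers" (a plain uniform layer average; the paper's weighting is probing-informed and more principled), one can pool a Transformer's hidden states over layers and tokens to build sense embeddings from annotated examples, then disambiguate by cosine similarity.

```python
# Hedged illustration: uniform average over all hidden-state layers and
# tokens as a crude sense embedding; nearest sense by cosine similarity.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def embed(sentence):
    with torch.no_grad():
        out = model(**tok(sentence, return_tensors="pt"))
    return torch.stack(out.hidden_states).mean(dim=(0, 1, 2))  # layers+tokens

senses = {
    "bank (finance)": embed("She deposited the cash at the bank."),
    "bank (river)": embed("They sat on the grassy bank of the river."),
}
query = embed("He withdrew money from the bank.")
scores = {s: torch.cosine_similarity(query, e, dim=0).item() for s, e in senses.items()}
print(max(scores, key=scores.get))   # expected: "bank (finance)"
```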
arXiv Detail & Related papers (2021-05-26T10:14:22Z)
- Learning Semantically Meaningful Features for Interpretable Classifications [17.88784870849724]
SemCNN learns associations between visual features and word phrases.
Experimental results on multiple benchmark datasets demonstrate that SemCNN can learn features with clear semantic meaning.
arXiv Detail & Related papers (2021-01-11T14:35:16Z)
- A Disentangling Invertible Interpretation Network for Explaining Latent Representations [19.398202091883366]
We formulate interpretation as a translation of hidden representations onto semantic concepts that are comprehensible to the user.
The proposed invertible interpretation network can be transparently applied on top of existing architectures.
We present an efficient approach to defining semantic concepts by sketching only two images, as well as an unsupervised strategy.
arXiv Detail & Related papers (2020-04-27T20:43:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.