DISCOVER: Making Vision Networks Interpretable via Competition and
Dissection
- URL: http://arxiv.org/abs/2310.04929v1
- Date: Sat, 7 Oct 2023 21:57:23 GMT
- Title: DISCOVER: Making Vision Networks Interpretable via Competition and
Dissection
- Authors: Konstantinos P. Panousis, Sotirios Chatzis
- Abstract summary: This work contributes to post-hoc interpretability, and specifically Network Dissection.
Our goal is to present a framework that makes it easier to discover the individual functionality of each neuron in a network trained on a vision task.
- Score: 11.028520416752325
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern deep networks are highly complex, and their inferential
outcomes are very hard to interpret. This is a serious obstacle to their transparent deployment
in safety-critical or bias-aware applications. This work contributes to
post-hoc interpretability, and specifically Network Dissection. Our goal is to
present a framework that makes it easier to discover the individual
functionality of each neuron in a network trained on a vision task; discovery
is performed in terms of textual description generation. To achieve this
objective, we leverage: (i) recent advances in multimodal vision-text models
and (ii) network layers founded upon the novel concept of stochastic local
competition between linear units. In this setting, only a small subset of layer
neurons are activated for a given input, leading to extremely high activation
sparsity (as low as only $\approx 4\%$). Crucially, our proposed method infers
(sparse) neuron activation patterns that enable the neurons to
activate/specialize to inputs with specific characteristics, diversifying their
individual functionality. This capacity of our method supercharges the
potential of dissection processes: human understandable descriptions are
generated only for the very few active neurons, thus facilitating the direct
investigation of the network's decision process. As we experimentally show, our
approach: (i) yields Vision Networks that retain or improve classification
performance, and (ii) realizes a principled framework for text-based
description and examination of the generated neuronal representations.
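To make the competition mechanism concrete, here is a minimal PyTorch sketch of a stochastic local winner-takes-all layer: linear units are grouped into blocks, and within each block a single winner is sampled (via Gumbel-softmax during training) while the rest are zeroed out. The class name and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticLWTA(nn.Module):
    """Linear layer with stochastic local competition: units are split
    into blocks, and per block only one sampled 'winner' stays active."""

    def __init__(self, in_features, num_blocks, units_per_block, tau=0.67):
        super().__init__()
        self.num_blocks = num_blocks
        self.units_per_block = units_per_block
        self.tau = tau  # Gumbel-softmax temperature (assumed value)
        self.linear = nn.Linear(in_features, num_blocks * units_per_block)

    def forward(self, x):
        h = self.linear(x)  # (batch, num_blocks * units_per_block)
        h = h.view(-1, self.num_blocks, self.units_per_block)
        if self.training:
            # Differentiable one-hot sample of a winner per block.
            mask = F.gumbel_softmax(h, tau=self.tau, hard=True, dim=-1)
        else:
            # Deterministic winner at inference time.
            mask = F.one_hot(h.argmax(dim=-1), self.units_per_block).float()
        return (h * mask).flatten(1)  # only 1/units_per_block units fire
```

With, say, 25 competing units per block, only 4% of the layer's units are active for any given input, which matches the sparsity regime quoted above.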
Related papers
- Identifying Sub-networks in Neural Networks via Functionally Similar Representations [41.028797971427124]
We take a step toward automating the understanding of the network by investigating the existence of distinct sub-networks.
Our approach offers meaningful insights into the behavior of neural networks with minimal human and computational cost.
arXiv Detail & Related papers (2024-10-21T20:19:00Z)
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
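As a rough illustration of sequential weight-space processing (hypothetical names; this is not the SANE architecture itself), one can chunk a network's flattened weights into fixed-size tokens and feed them to a sequence encoder:

```python
import torch
import torch.nn as nn

def weights_to_tokens(model, chunk_size=128):
    """Flatten a network's weights and split them into fixed-size
    chunks ('tokens') that a sequence model can process in order."""
    flat = torch.cat([p.detach().flatten() for p in model.parameters()])
    pad = (-flat.numel()) % chunk_size
    flat = torch.cat([flat, flat.new_zeros(pad)])  # pad to a multiple
    return flat.view(-1, chunk_size)               # (num_tokens, chunk_size)

class WeightSequenceEncoder(nn.Module):
    """Tiny transformer over weight tokens, yielding one embedding per
    network (an illustrative stand-in for a hyper-representation)."""

    def __init__(self, chunk_size=128, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Linear(chunk_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, tokens):                 # (num_tokens, chunk_size)
        z = self.embed(tokens).unsqueeze(0)    # add a batch dimension
        return self.encoder(z).mean(dim=1)     # (1, d_model) per network
```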
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Simple and Effective Transfer Learning for Neuro-Symbolic Integration [50.592338727912946]
A potential solution to this issue is Neuro-Symbolic Integration (NeSy), where neural approaches are combined with symbolic reasoning.
Most of these methods exploit a neural network to map perceptions to symbols and a logical reasoner to predict the output of the downstream task.
They suffer from several issues, including slow convergence, learning difficulties with complex perception tasks, and convergence to local minima.
This paper proposes a simple yet effective method to ameliorate these problems.
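A minimal sketch of this perception-plus-reasoner pattern, using the MNIST-addition toy task that is standard in the NeSy literature (module and helper names are illustrative, not this paper's code):

```python
import torch
import torch.nn as nn

class DigitPerception(nn.Module):
    """Neural perception module: maps a 28x28 image to a distribution
    over the ten digit symbols."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, image):
        return self.net(image).softmax(dim=-1)

def symbolic_sum(p_a, p_b):
    """Symbolic reasoner for the downstream task: the probability of
    each possible sum, marginalizing over the two digit distributions."""
    p_sum = torch.zeros(p_a.shape[0], 19)  # possible sums: 0..18
    for a in range(10):
        for b in range(10):
            p_sum[:, a + b] += p_a[:, a] * p_b[:, b]
    return p_sum  # trained end-to-end from the sum label alone
```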
arXiv Detail & Related papers (2024-02-21T15:51:01Z)
- Understanding polysemanticity in neural networks through coding theory [0.8702432681310401]
We propose a novel practical approach to network interpretability, along with theoretical insights into polysemanticity and the density of codes.
We show how random projections can reveal whether a network exhibits a smooth or non-differentiable code and hence how interpretable the code is.
Our approach advances the pursuit of interpretability in neural networks, providing insights into their underlying structure and suggesting new avenues for circuit-level interpretability.
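One way to operationalize the random-projection probe (a sketch under assumed details, not the paper's exact procedure) is to trace randomly projected activations along a path between two inputs and look for kinks:

```python
import numpy as np

def random_projection_probe(activations_fn, x0, x1, proj_dim=8, steps=50, seed=0):
    """Project a layer's activations onto random directions and trace
    them along the straight path from input x0 to input x1. Smooth
    traces suggest a differentiable code; kinks suggest a
    non-differentiable (and plausibly less interpretable) one."""
    rng = np.random.default_rng(seed)
    a0 = activations_fn(x0).ravel()
    R = rng.standard_normal((proj_dim, a0.size)) / np.sqrt(a0.size)
    ts = np.linspace(0.0, 1.0, steps)
    traces = np.stack(
        [R @ activations_fn((1 - t) * x0 + t * x1).ravel() for t in ts]
    )
    # Second differences approximate curvature; spikes indicate kinks.
    curvature = np.abs(np.diff(traces, n=2, axis=0)).max(axis=0)
    return traces, curvature  # shapes: (steps, proj_dim), (proj_dim,)
```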
arXiv Detail & Related papers (2024-01-31T16:31:54Z)
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Automated Natural Language Explanation of Deep Visual Neurons with Large Models [43.178568768100305]
This paper proposes a novel post-hoc framework for generating semantic explanations of neurons with large foundation models.
Our framework is designed to be compatible with various model architectures and datasets, enabling automated and scalable neuron interpretation.
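A hedged sketch of such a pipeline in the style of CLIP-based dissection (the `top_activating_images` input and the concept-scoring scheme are assumptions, not this paper's framework):

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

def describe_neuron(top_activating_images, candidate_concepts, device="cpu"):
    """Name a neuron by scoring candidate concept texts against the
    probing images that activate it most strongly (caller supplies both)."""
    model, preprocess = clip.load("ViT-B/32", device=device)
    images = torch.stack([preprocess(im) for im in top_activating_images]).to(device)
    texts = clip.tokenize([f"a photo of {c}" for c in candidate_concepts]).to(device)
    with torch.no_grad():
        img = model.encode_image(images)
        txt = model.encode_text(texts)
        img = img / img.norm(dim=-1, keepdim=True)
        txt = txt / txt.norm(dim=-1, keepdim=True)
        scores = (img @ txt.T).mean(dim=0)  # average over exemplar images
    return candidate_concepts[scores.argmax().item()]
```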
arXiv Detail & Related papers (2023-10-16T17:04:51Z)
- Neural Activation Patterns (NAPs): Visual Explainability of Learned Concepts [8.562628320010035]
We present a method that takes into account the entire activation distribution.
By extracting similar activation profiles within the high-dimensional activation space of a neural network layer, we find groups of inputs that are treated similarly.
These input groups represent neural activation patterns (NAPs) and can be used to visualize and interpret learned layer concepts.
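A minimal sketch of the profile-grouping step, assuming plain k-means over scale-normalized activation vectors (the paper's actual extraction procedure may differ):

```python
import numpy as np
from sklearn.cluster import KMeans

def neural_activation_patterns(activations, num_patterns=10, seed=0):
    """Group inputs with similar activation profiles in one layer.
    activations: (num_inputs, num_neurons) array of layer outputs.
    Returns one cluster id per input; each cluster is a candidate NAP."""
    norms = np.linalg.norm(activations, axis=1, keepdims=True)
    profiles = activations / np.clip(norms, 1e-8, None)  # shape, not scale
    km = KMeans(n_clusters=num_patterns, random_state=seed, n_init=10)
    return km.fit_predict(profiles)  # inspect each cluster to name a concept
```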
arXiv Detail & Related papers (2022-06-20T09:05:57Z)
- Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks [4.153804257347222]
We present Agglomerator, a framework capable of providing a representation of part-whole hierarchies from visual cues.
We evaluate our method on common datasets, such as SmallNORB, MNIST, FashionMNIST, CIFAR-10, and CIFAR-100.
arXiv Detail & Related papers (2022-03-07T10:56:13Z)
- And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
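For orientation, the classic two-layer mean-field limit (the textbook special case, not this paper's deeper, feature-based construction) replaces the finite sum over neurons with an expectation under a probability measure $\rho$ over neuron parameters:
$$f(x) = \int a \, \sigma(\langle w, x \rangle) \, \mathrm{d}\rho(a, w)$$
In that setting, gradient-descent training corresponds to a Wasserstein gradient flow on $\rho$; the framework summarized above lifts this measure-based view to the features of deep architectures.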
arXiv Detail & Related papers (2020-07-03T01:37:16Z)