B-cos Networks: Alignment is All We Need for Interpretability
- URL: http://arxiv.org/abs/2205.10268v1
- Date: Fri, 20 May 2022 16:03:29 GMT
- Title: B-cos Networks: Alignment is All We Need for Interpretability
- Authors: Moritz Böhle, Mario Fritz, Bernt Schiele
- Abstract summary: We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training.
A sequence of B-cos transforms induces a single linear transform that faithfully summarises the full model computations.
We show that it can easily be integrated into common models such as VGGs, ResNets, InceptionNets, and DenseNets.
- Score: 136.27303006772294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new direction for increasing the interpretability of deep neural
networks (DNNs) by promoting weight-input alignment during training. For this,
we propose to replace the linear transforms in DNNs by our B-cos transform. As
we show, a sequence (network) of such transforms induces a single linear
transform that faithfully summarises the full model computations. Moreover, the
B-cos transform introduces alignment pressure on the weights during
optimisation. As a result, those induced linear transforms become highly
interpretable and align with task-relevant features. Importantly, the B-cos
transform is designed to be compatible with existing architectures and we show
that it can easily be integrated into common models such as VGGs, ResNets,
InceptionNets, and DenseNets, whilst maintaining similar performance on
ImageNet. The resulting explanations are of high visual quality and perform
well under quantitative metrics for interpretability. Code available at
https://www.github.com/moboehle/B-cos.
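As a hedged illustration of the idea in the abstract (function name, shapes, and the default value of B are our own assumptions, not the authors' code), a single B-cos unit can be sketched in plain NumPy: the linear response of a unit-norm weight vector is scaled by the cosine similarity raised to the power B − 1, so outputs are suppressed unless the weights align with the input.

```python
import numpy as np

def bcos_transform(x, W, B=2.0):
    """Sketch of one B-cos layer as an input-dependent linear transform.

    Each row of W is normalised to unit norm; its linear response w.x is
    then scaled by |cos(x, w)|**(B - 1).  For B = 1 this reduces to an
    ordinary linear transform; for B > 1 outputs are damped unless the
    weight vector aligns with the input, which is the "alignment
    pressure" described in the abstract.
    """
    W_hat = W / np.linalg.norm(W, axis=1, keepdims=True)  # unit-norm weights
    lin = W_hat @ x                                       # linear responses
    cos = lin / (np.linalg.norm(x) + 1e-12)               # cosine similarities
    return np.abs(cos) ** (B - 1.0) * lin
```

Because the scaling factor depends only on the input, the layer acts on x as a linear map whose matrix is itself a function of x; composing such maps across a network yields the single input-dependent linear transform that the paper uses for its explanations.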
Related papers
- DuoFormer: Leveraging Hierarchical Visual Representations by Local and Global Attention [1.5624421399300303]
We propose a novel hierarchical transformer model that integrates the feature extraction capabilities of Convolutional Neural Networks (CNNs) with the advanced representational potential of Vision Transformers (ViTs).
Addressing the lack of inductive biases and dependence on extensive training datasets in ViTs, our model employs a CNN backbone to generate hierarchical visual representations.
These representations are then adapted for transformer input through an innovative patch tokenization.
arXiv Detail & Related papers (2024-07-18T22:15:35Z)
- Self-Supervised Pre-Training for Table Structure Recognition Transformer [25.04573593082671]
We propose a self-supervised pre-training (SSP) method for table structure recognition transformers.
We discover that the performance gap between the linear projection transformer and the hybrid CNN-transformer can be mitigated by SSP of the visual encoder in the TSR model.
arXiv Detail & Related papers (2024-02-23T19:34:06Z)
- B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers [97.75725574963197]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training.
We show that a sequence of such transformations induces a single linear transformation that faithfully summarises the full model computations.
We show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics.
arXiv Detail & Related papers (2023-06-19T12:54:28Z)
- Revisiting Transformation Invariant Geometric Deep Learning: Are Initial Representations All You Need? [80.86819657126041]
We show that transformation-invariant and distance-preserving initial representations are sufficient to achieve transformation invariance.
Specifically, we realize transformation-invariant and distance-preserving initial point representations by modifying multi-dimensional scaling.
We prove that the resulting TinvNN framework strictly guarantees transformation invariance and is general and flexible enough to be combined with existing neural networks.
arXiv Detail & Related papers (2021-12-23T03:52:33Z)
- Optimising for Interpretability: Convolutional Dynamic Alignment Networks [108.83345790813445]
We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA Nets).
Their core building blocks are Dynamic Alignment Units (DAUs), which are optimised to transform their inputs with dynamically computed weight vectors that align with task-relevant patterns.
CoDA Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions.
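The linear decomposition that CoDA Nets perform can be sketched generically (our own illustration with made-up names, not the authors' code): once a prediction is written as y = W(x) @ x for an input-dependent matrix W(x), elementwise products of W(x) with x give exact per-dimension contributions that sum to the output.

```python
import numpy as np

def linear_contributions(W_x, x):
    """Decompose y = W(x) @ x into per-input contributions.

    W_x is the input-dependent matrix the model induces for this
    particular x.  contributions[j, i] = W_x[j, i] * x[i] is the exact
    share of input dimension i in output j, and the contributions sum
    to the model output by construction.
    """
    contributions = W_x * x[np.newaxis, :]   # shape: (n_out, n_in)
    y = contributions.sum(axis=1)            # identical to W_x @ x
    return y, contributions
```

Since the decomposition is exact rather than approximate, the resulting contribution maps faithfully reflect what the model computed for that input.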
arXiv Detail & Related papers (2021-09-27T12:39:46Z)
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We present a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices part of the network parameters for inputs of varying difficulty.
We present dynamic slimmable network (DS-Net) and dynamic slice-able network (DS-Net++), which input-dependently adjust the number of filters in CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
- Batch Normalization with Enhanced Linear Transformation [73.9885755599221]
Properly enhancing a linear transformation module can effectively improve the capacity of batch normalization (BN).
Our method, named BNET, can be implemented with 2-3 lines of code in most deep learning libraries.
We verify that BNET accelerates the convergence of network training and enhances spatial information by assigning larger weights to important neurons.
arXiv Detail & Related papers (2020-11-28T15:42:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.