Now You See Me (CME): Concept-based Model Extraction
- URL: http://arxiv.org/abs/2010.13233v1
- Date: Sun, 25 Oct 2020 22:03:45 GMT
- Title: Now You See Me (CME): Concept-based Model Extraction
- Authors: Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik, Pietro Liò, Adrian Weller
- Abstract summary: Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks.
A key step to further empowering DNN-based approaches is improving their explainability.
We present CME: a concept-based model extraction framework.
- Score: 24.320487188704146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNNs) have achieved remarkable performance on a range
of tasks. A key step to further empowering DNN-based approaches is improving
their explainability. In this work we present CME: a concept-based model
extraction framework, used for analysing DNN models via concept-based extracted
models. Using two case studies (dSprites and Caltech-UCSD Birds), we
demonstrate how CME can be used to (i) analyse the concept information learned
by a DNN model, (ii) analyse how a DNN uses this concept information when
predicting output labels, and (iii) identify key concept information that can
further improve DNN predictive performance (for one of the case studies, we
show how model accuracy can be improved by over 14% using only 30% of the
available concepts).
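To make the extraction idea concrete, here is a minimal sketch of a concept-based extraction pipeline in the spirit of the abstract, assuming per-sample concept annotations (as in dSprites) and access to a trained DNN's hidden-layer activations; all function and variable names are illustrative, not CME's actual API. A concept extractor is fitted from hidden activations to concept values, and an interpretable model is then fitted from the extracted concepts to the output labels.

```python
# Hedged sketch of concept-based model extraction (illustration only, not the
# authors' exact CME pipeline): predict concepts from a hidden layer's
# activations, then fit an interpretable model from concepts to task labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def extract_concept_model(hidden_acts, concept_labels, task_labels):
    """hidden_acts: (n, d) activations of a chosen DNN layer.
    concept_labels: (n, k) integer-valued concept annotations (e.g. shape, scale).
    task_labels: (n,) output labels for the same samples."""
    concept_predictors = []
    concept_preds = np.zeros_like(concept_labels)
    for j in range(concept_labels.shape[1]):
        # One simple per-concept extractor from hidden space to concept values.
        clf = LogisticRegression(max_iter=1000).fit(hidden_acts, concept_labels[:, j])
        concept_predictors.append(clf)
        concept_preds[:, j] = clf.predict(hidden_acts)
    # Interpretable label predictor that only sees the extracted concepts.
    label_model = DecisionTreeClassifier(max_depth=5).fit(concept_preds, task_labels)
    return concept_predictors, label_model
```

Comparing the extracted model's predictions with the original DNN's outputs gives a rough fidelity measure, and inspecting the tree indicates which concepts the network appears to rely on when predicting labels.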
Related papers
- Deep Companion Learning: Enhancing Generalization Through Historical Consistency [35.5237083057451]
We propose a novel training method for Deep Neural Networks (DNNs) that enhances generalization by penalizing inconsistent model predictions.
We train a deep companion model (DCM) using previous versions of the model to provide forecasts on new inputs.
This companion model deciphers a meaningful latent semantic structure within the data, thereby providing targeted supervision.
arXiv Detail & Related papers (2024-07-26T15:31:13Z)
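As a rough illustration of the consistency idea summarized in the entry above (not the paper's exact training procedure; the loss weighting and snapshot schedule are assumptions), a frozen copy of an earlier checkpoint can serve as the companion whose forecasts act as soft targets:

```python
# Hedged sketch of a companion-consistency objective: a frozen earlier version
# of the model provides soft targets, and divergence from them is penalized
# alongside the usual supervised loss.
import torch
import torch.nn.functional as F

def companion_consistency_loss(model, companion, x, y, lam=0.5):
    logits = model(x)
    with torch.no_grad():
        companion_logits = companion(x)          # forecasts from a previous version
    ce = F.cross_entropy(logits, y)              # supervised term
    consistency = F.kl_div(                      # penalize inconsistent predictions
        F.log_softmax(logits, dim=-1),
        F.softmax(companion_logits, dim=-1),
        reduction="batchmean",
    )
    return ce + lam * consistency

# The companion could simply be a snapshot taken every few epochs, e.g.
# companion = copy.deepcopy(model).eval()
```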
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- NxPlain: Web-based Tool for Discovery of Latent Concepts [16.446370662629555]
We present NxPlain, a web application that provides an explanation of a model's prediction using latent concepts.
NxPlain discovers latent concepts learned in a deep NLP model, provides an interpretation of the knowledge learned in the model, and explains its predictions based on the used concepts.
arXiv Detail & Related papers (2023-03-06T10:45:24Z)
- Data-Free Adversarial Knowledge Distillation for Graph Neural Networks [62.71646916191515]
We propose the first end-to-end framework for data-free adversarial knowledge distillation on graph-structured data (DFAD-GNN).
Specifically, DFAD-GNN employs a generative adversarial framework with three components: a pre-trained teacher model and a student model act as the two discriminators, while a generator produces training graphs used to distill knowledge from the teacher into the student.
Our DFAD-GNN significantly surpasses state-of-the-art data-free baselines in the graph classification task.
arXiv Detail & Related papers (2022-05-08T08:19:40Z)
- Comparison Analysis of Traditional Machine Learning and Deep Learning Techniques for Data and Image Classification [62.997667081978825]
The purpose of the study is to analyse and compare the most common machine learning and deep learning techniques used for computer vision 2D object classification tasks.
Firstly, we will present the theoretical background of the Bag of Visual Words model and Deep Convolutional Neural Networks (DCNN).
Secondly, we will implement a Bag of Visual Words model and the VGG16 CNN architecture.
arXiv Detail & Related papers (2022-04-11T11:34:43Z)
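For reference, a minimal Bag of Visual Words pipeline of the kind compared in the study above might look as follows. This is a hedged sketch assuming opencv-python (>= 4.4) and scikit-learn; the vocabulary size and classifier choice are illustrative, not the study's configuration.

```python
# Minimal Bag of Visual Words sketch: SIFT descriptors -> K-Means vocabulary
# -> per-image word histograms -> linear classifier.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def bovw_features(images, n_words=200):
    sift = cv2.SIFT_create()
    per_image_desc = []
    for img in images:                                   # grayscale uint8 arrays
        _, desc = sift.detectAndCompute(img, None)
        per_image_desc.append(desc if desc is not None else np.zeros((0, 128), np.float32))
    vocab = KMeans(n_clusters=n_words, n_init=10).fit(np.vstack(per_image_desc))
    hists = np.zeros((len(images), n_words), dtype=np.float32)
    for i, desc in enumerate(per_image_desc):
        if len(desc):
            words = vocab.predict(desc)
            hists[i] = np.bincount(words, minlength=n_words) / len(words)
    return hists, vocab

# A linear SVM on the normalized histograms completes the classical pipeline:
# clf = sklearn.svm.LinearSVC().fit(train_hists, train_labels)
```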
- Algorithmic Concept-based Explainable Reasoning [0.3149883354098941]
Recent research on graph neural network (GNN) models has successfully applied GNNs to classical graph algorithms and optimisation problems.
A key hindrance of these approaches is their lack of explainability, since GNNs are black-box models that cannot be interpreted directly.
We introduce concept-bottleneck GNNs, which rely on a modification to the GNN readout mechanism.
arXiv Detail & Related papers (2021-07-15T17:44:51Z)
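A minimal sketch of the concept-bottleneck readout idea mentioned in the entry above, written in plain PyTorch with mean pooling standing in for a full GNN readout; the layer names and sizes are assumptions, not the paper's architecture.

```python
# Hedged sketch of a concept-bottleneck readout: pooled node embeddings are
# forced through a small vector of concept activations before the prediction,
# so the concepts can be inspected alongside the output.
import torch
import torch.nn as nn

class ConceptBottleneckReadout(nn.Module):
    def __init__(self, node_dim, n_concepts, n_classes):
        super().__init__()
        self.to_concepts = nn.Linear(node_dim, n_concepts)   # bottleneck layer
        self.to_output = nn.Linear(n_concepts, n_classes)    # prediction from concepts only

    def forward(self, node_embeddings):
        # node_embeddings: (num_nodes, node_dim) produced by any GNN encoder.
        graph_embedding = node_embeddings.mean(dim=0)        # simple mean readout
        concepts = torch.sigmoid(self.to_concepts(graph_embedding))
        return self.to_output(concepts), concepts
```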
- MEME: Generating RNN Model Explanations via Model Extraction [6.55705721360334]
MEME is a model extraction approach capable of approximating RNNs with interpretable models represented by human-understandable concepts and their interactions.
We show how MEME can be used to interpret RNNs both locally and globally, by approximating RNN decision-making via interpretable concept interactions.
arXiv Detail & Related papers (2020-12-13T04:00:08Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
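A small sketch of the kNN-representation idea summarized in the entry above, assuming penultimate-layer features are available as arrays; the distance metric and value of k are illustrative choices, not the paper's configuration.

```python
# Hedged sketch: explain a prediction by retrieving the training examples
# closest to it in representation space.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nearest_training_examples(train_reprs, test_repr, k=5):
    """train_reprs: (n, d) penultimate-layer features of the training set.
    test_repr: (d,) features of the example whose prediction we want to explain.
    Returns indices of the k closest training examples; inspecting their labels
    and inputs often reveals learned spurious associations."""
    index = NearestNeighbors(n_neighbors=k, metric="cosine").fit(train_reprs)
    _, idx = index.kneighbors(test_repr.reshape(1, -1))
    return idx[0]
```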
- Modeling Token-level Uncertainty to Learn Unknown Concepts in SLU via Calibrated Dirichlet Prior RNN [98.4713940310056]
One major task of spoken language understanding (SLU) in modern personal assistants is to extract semantic concepts from an utterance.
Recent research has collected question-and-answer annotated data to learn what is unknown and should be asked.
We incorporate softmax-based slot filling neural architectures to model the sequence uncertainty without question supervision.
arXiv Detail & Related papers (2020-10-16T02:12:30Z)
- Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
- Explaining Deep Neural Networks using Unsupervised Clustering [12.639074798397619]
We propose a novel method to explain trained deep neural networks (DNNs) by distilling them into surrogate models using unsupervised clustering.
Our method can be applied flexibly to any subset of layers of a DNN architecture and can incorporate low-level and high-level information.
arXiv Detail & Related papers (2020-07-15T04:49:43Z)
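A hedged sketch of the clustering-based surrogate idea in this last entry, using scikit-learn; the number of clusters, the one-hot encoding, and the decision-tree surrogate are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch: cluster activations of a chosen layer, then train a surrogate
# that predicts the DNN's own outputs from cluster membership, so its rules
# read as "inputs falling in cluster c are predicted as class y".
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

def cluster_surrogate(layer_acts, dnn_predictions, n_clusters=20):
    """layer_acts: (n, d) activations from any chosen layer of the trained DNN.
    dnn_predictions: (n,) labels the DNN itself assigns to the same inputs."""
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(layer_acts)
    onehot = np.eye(n_clusters)[clusters]        # cluster membership as features
    surrogate = DecisionTreeClassifier(max_depth=4).fit(onehot, dnn_predictions)
    return surrogate, clusters
```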
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.