Counterfactual Explanations for Models of Code
- URL: http://arxiv.org/abs/2111.05711v1
- Date: Wed, 10 Nov 2021 14:44:19 GMT
- Title: Counterfactual Explanations for Models of Code
- Authors: Jürgen Cito, Isil Dillig, Vijayaraghavan Murali, Satish Chandra
- Abstract summary: Machine learning (ML) models play an increasingly prevalent role in many software engineering tasks.
It can be difficult for developers to understand why the model came to a certain conclusion and how to act upon the model's prediction.
This paper explores counterfactual explanations for models of source code.
- Score: 11.678590247866534
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Machine learning (ML) models play an increasingly prevalent role in many
software engineering tasks. However, because most models are now powered by
opaque deep neural networks, it can be difficult for developers to understand
why the model came to a certain conclusion and how to act upon the model's
prediction. Motivated by this problem, this paper explores counterfactual
explanations for models of source code. Such counterfactual explanations
constitute minimal changes to the source code under which the model "changes
its mind". We integrate counterfactual explanation generation to models of
source code in a real-world setting. We describe considerations that impact
both the ability to find realistic and plausible counterfactual explanations,
as well as the usefulness of such explanation to the user of the model. In a
series of experiments we investigate the efficacy of our approach on three
different models, each based on a BERT-like architecture operating over source
code.
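To make the core idea concrete, below is a minimal sketch (not the paper's actual algorithm) of a greedy token-level counterfactual search over source code: candidate replacements are tried position by position, the edit that most weakens the model's confidence in its original label is kept, and the search stops as soon as the prediction flips. The `predict`, `score`, and `candidates` callables are assumed placeholders, e.g. a BERT-like code classifier and a masked-language-model replacement proposer.

```python
from typing import Callable, List, Optional, Tuple

def find_counterfactual(
    tokens: List[str],                                   # tokenized source code under explanation
    predict: Callable[[List[str]], int],                 # label predicted for a token sequence
    score: Callable[[List[str], int], float],            # model confidence in a given label
    candidates: Callable[[List[str], int], List[str]],   # replacement tokens proposed for position i
    max_edits: int = 3,
) -> Optional[Tuple[List[str], List[int]]]:
    """Greedy search for a small set of token edits that flips the model's prediction."""
    original = predict(tokens)
    current, edited = list(tokens), []

    for _ in range(max_edits):
        best_seq, best_pos, best_score = None, None, score(current, original)
        for i in range(len(current)):
            if i in edited:
                continue
            for repl in candidates(current, i):
                trial = current[:i] + [repl] + current[i + 1:]
                if predict(trial) != original:
                    return trial, edited + [i]           # the model "changes its mind"
                s = score(trial, original)
                if s < best_score:                       # keep the edit that most weakens the original label
                    best_seq, best_pos, best_score = trial, i, s
        if best_seq is None:
            break                                        # no edit reduces confidence further
        current, edited = best_seq, edited + [best_pos]

    return None                                          # no counterfactual within the edit budget
```

Capping the edit budget mirrors the requirement that counterfactuals be minimal; how realistic and plausible the resulting change is would depend entirely on the quality of the candidate replacements supplied.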
Related papers
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z) - Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z) - Toward a Theory of Causation for Interpreting Neural Code Models [49.906221295459275]
This paper introduces $do_code$, a post hoc interpretability method specific to Neural Code Models (NCMs).
$do_code$ is based upon causal inference to enable language-oriented explanations.
Results show that our studied NCMs are sensitive to changes in code syntax.
arXiv Detail & Related papers (2023-02-07T22:56:58Z) - OCTET: Object-aware Counterfactual Explanations [29.532969342297086]
We propose an object-centric framework for counterfactual explanation generation.
Our method, inspired by recent generative modeling works, encodes the query image into a latent space that is structured to ease object-level manipulations.
We conduct a set of experiments on counterfactual explanation benchmarks for driving scenes, and we show that our method can be adapted beyond classification.
arXiv Detail & Related papers (2022-11-22T16:23:12Z) - Motif-guided Time Series Counterfactual Explanations [1.1510009152620664]
We propose a novel model that generates intuitive post-hoc counterfactual explanations.
We validated our model using five real-world time-series datasets from the UCR repository.
arXiv Detail & Related papers (2022-11-08T17:56:50Z) - Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted with our framework are able to simulate the teacher significantly more effectively than ones produced with previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z) - Beyond Trivial Counterfactual Explanations with Diverse Valuable
Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z) - Demystifying Code Summarization Models [5.608277537412537]
We evaluate four prominent code summarization models: extreme summarizer, code2vec, code2seq, and sequence GNN.
Results show that all models base their predictions on syntactic and lexical properties with little to no semantic implication.
We present a novel approach to explaining the predictions of code summarization models through the lens of training data.
arXiv Detail & Related papers (2021-02-09T03:17:46Z) - Plausible Counterfactuals: Auditing Deep Learning Classifiers with
Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective heuristics are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)