Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Models
- URL: http://arxiv.org/abs/2205.14374v1
- Date: Sat, 28 May 2022 09:04:57 GMT
- Title: Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Models
- Authors: Md Rafiqul Islam Rabin, Aftab Hussain, Mohammad Amin Alipour
- Abstract summary: We show that a syntax-guided program reduction technique is faster and provides smaller sets of key tokens in reduced programs.
We also show that these key tokens can be used to generate adversarial examples for up to 65% of the input programs.
- Score: 1.1924369482115011
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural code intelligence (CI) models are opaque black-boxes and offer little
insight into the features they use in making predictions. This opacity may lead
to distrust in their predictions and hamper their wider adoption in
safety-critical applications. Recently, input program reduction techniques have
been proposed to identify key features in the input programs and thereby improve
the transparency of CI models. However, this approach is syntax-unaware and does
not consider the grammar of the programming language. In this paper, we apply a
syntax-guided program reduction technique that considers the grammar of the
input programs during reduction. Our experiments on multiple models across
different types of input programs show that the syntax-guided program reduction
technique is faster and yields smaller sets of key tokens in reduced
programs. We also show that these key tokens can be used to generate
adversarial examples for up to 65% of the input programs.
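To make the reduction loop concrete, here is a minimal sketch of syntax-guided reduction in the spirit of the technique the paper applies: parse the program, repeatedly try to delete grammar-level subtrees (here, whole statements), and keep a deletion only when the model's prediction is unchanged. The `predict` oracle below is a hypothetical stand-in for a CI model, and the sketch reduces Python with the stdlib `ast` module (Python 3.9+ for `ast.unparse`) purely for self-containment; it is not the paper's tool.

```python
import ast

def predict(source: str) -> str:
    """Hypothetical CI-model oracle; stands in for e.g. a method-name
    prediction model. Replace with a real model call."""
    return "loop" if "for" in source else "other"

def reduce_program(source: str) -> str:
    """Delete whole statements (grammar-level units) while the model's
    prediction on the reduced program stays the same."""
    target = predict(source)
    tree = ast.parse(source)
    changed = True
    while changed:
        changed = False
        for node in ast.walk(tree):
            body = getattr(node, "body", None)
            if not isinstance(body, list) or len(body) <= 1:
                continue  # keep bodies non-empty so the program stays valid
            for i in range(len(body)):
                saved = body[i]
                del body[i]  # candidate: drop one statement subtree
                try:
                    if predict(ast.unparse(tree)) == target:
                        changed = True  # keep the deletion, rescan the tree
                        break
                except Exception:
                    pass
                body.insert(i, saved)  # label changed or unparse failed: undo
            if changed:
                break
    return ast.unparse(tree)

src = "def f(n):\n    total = 0\n    x = 1\n    for i in range(n):\n        total += i\n    return total"
print(reduce_program(src))  # keeps the loop the toy 'model' relies on
```

The tokens that survive reduction are the candidate key tokens; per the abstract, perturbing such tokens yields adversarial examples for up to 65% of the input programs.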
Related papers
- Extracting Label-specific Key Input Features for Neural Code Intelligence Models [0.0]
Code intelligence (CI) models are often black-box and do not offer insights into the input features they learn for making correct predictions.
In this paper, we apply a syntax-guided program reduction technique that follows the syntax of input programs during reduction.
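For contrast, the syntax-unaware reduction that both abstracts refer to works at the token level in the ddmin (delta debugging) style, with no guarantee that intermediate candidates even parse. A minimal sketch, reusing the hypothetical `predict` oracle from the sketch above:

```python
def reduce_tokens(tokens, predict, target):
    """ddmin-style reduction: drop a chunk of tokens whenever the model's
    prediction is preserved; refine the chunk size when stuck."""
    n = 2  # current number of partitions
    while len(tokens) >= 2:
        chunk = max(1, len(tokens) // n)
        removed = False
        for i in range(0, len(tokens), chunk):
            candidate = tokens[:i] + tokens[i + chunk:]
            if candidate and predict(" ".join(candidate)) == target:
                tokens = candidate         # complement kept the label
                n = max(n - 1, 2)
                removed = True
                break
        if not removed:
            if chunk == 1:
                break                      # single-token granularity reached
            n = min(len(tokens), n * 2)    # refine the partition
    return tokens

# e.g. reduce_tokens("def f ( n ) : return n + 1".split(), predict, "other")
```

Because candidates are arbitrary token subsets, many oracle queries are spent on syntactically invalid programs, which is the inefficiency the syntax-guided approach avoids.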
arXiv Detail & Related papers (2022-02-14T03:36:35Z)
- Encoding Program as Image: Evaluating Visual Representation of Source Code [2.1016374925364616]
We investigate Code2Snapshot, a novel representation of the source code based on the snapshots of input programs.
We compare its performance with state-of-the-art representations that utilize the rich syntactic and semantic features of input programs.
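A minimal sketch of the snapshot idea, assuming Pillow; the rendering choices (canvas size, default font) are illustrative assumptions, not the paper's pipeline:

```python
from PIL import Image, ImageDraw

def code_snapshot(source: str, size=(224, 224)) -> Image.Image:
    """Render source code as a grayscale image ("snapshot") that a vision
    model can consume in place of a token sequence."""
    img = Image.new("L", size, color=255)                  # white canvas
    ImageDraw.Draw(img).multiline_text((4, 4), source, fill=0)
    return img

code_snapshot("def add(a, b):\n    return a + b").save("snapshot.png")
```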
arXiv Detail & Related papers (2021-11-01T17:07:02Z)
- Tea: Program Repair Using Neural Network Based on Program Information Attention Matrix [14.596847020236657]
We propose a unified representation that captures the syntax, data flow, and control flow aspects of software programs.
We then devise a method that uses this representation to guide a transformer model from NLP toward better understanding and fixing buggy programs.
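A minimal sketch of what a program-information attention matrix could look like, assuming NumPy; the relations encoded here (shared identifiers as a data-flow stand-in, adjacency as a syntax stand-in) are illustrative assumptions, and such a matrix would typically be added as a bias to a transformer's attention logits:

```python
import numpy as np

def program_info_matrix(tokens):
    """Token-pair relation matrix to be used as an attention bias."""
    n = len(tokens)
    m = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for j in range(n):
            if abs(i - j) == 1:
                m[i, j] = 1.0  # syntax stand-in: adjacent tokens
            if tokens[i] == tokens[j] and tokens[i].isidentifier():
                m[i, j] = 1.0  # data-flow stand-in: same identifier reused
    return m

print(program_info_matrix("x = x + 1".split()))
```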
arXiv Detail & Related papers (2021-07-17T15:49:22Z)
- Enforcing Consistency in Weakly Supervised Semantic Parsing [68.2211621631765]
We explore the use of consistency between the output programs for related inputs to reduce the impact of spurious programs.
We find that a more consistent formalism leads to improved model performance even without consistency-based training.
arXiv Detail & Related papers (2021-07-13T03:48:04Z)
- Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages [97.58968222942173]
We take the first step to synthesize C programs from input-output examples.
In particular, we propose LaSynth, which learns a latent representation to approximate the execution of partially generated programs.
We show that training on these synthesized programs further improves the prediction performance for both Karel and C program synthesis.
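A minimal sketch of the latent-execution idea, assuming PyTorch; the module and its name (`LatentExecutor`) are hypothetical stand-ins for the summary's description of learning a latent state that approximates executing a partially generated program:

```python
import torch
import torch.nn as nn

class LatentExecutor(nn.Module):
    """Summarize a partial program into a latent state that stands in for
    its (unfinished) execution, then predict the next token from it."""
    def __init__(self, vocab_size: int, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.next_token = nn.Linear(hidden, vocab_size)

    def forward(self, partial_program: torch.Tensor) -> torch.Tensor:
        _, h = self.rnn(self.embed(partial_program))
        latent_state = h[-1]          # approximate execution state
        return self.next_token(latent_state)

model = LatentExecutor(vocab_size=100)
logits = model(torch.randint(0, 100, (1, 7)))  # a 7-token partial program
```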
arXiv Detail & Related papers (2021-06-29T02:21:32Z)
- Improving Compositionality of Neural Networks by Decoding Representations to Inputs [83.97012077202882]
We bridge the benefits of traditional and deep learning programs by jointly training a generative model to constrain neural network activations to "decode" back to inputs.
We demonstrate applications of decodable representations to out-of-distribution detection, adversarial examples, calibration, and fairness.
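A minimal sketch of the joint training objective, assuming PyTorch; the architecture and equal loss weighting are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Classifier whose hidden activations are constrained to "decode" back to
# the input; the reconstruction term is the decodability constraint.
encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())
classifier = nn.Linear(64, 10)
decoder = nn.Linear(64, 784)

x = torch.randn(32, 784)              # a batch of flattened inputs
y = torch.randint(0, 10, (32,))
z = encoder(x)
loss = F.cross_entropy(classifier(z), y) + F.mse_loss(decoder(z), x)
loss.backward()
# At test time, a high reconstruction error from `decoder` can flag
# out-of-distribution or adversarial inputs.
```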
arXiv Detail & Related papers (2021-06-01T20:07:16Z)
- How could Neural Networks understand Programs? [67.4217527949013]
It is difficult to build a model that better understands programs by either directly applying off-the-shelf NLP pre-training techniques to the source code or adding features to the model heuristically.
We propose a novel program semantics learning paradigm in which the model learns from information composed of (1) representations that align well with the fundamental operations in operational semantics, and (2) the information of environment transitions.
arXiv Detail & Related papers (2021-05-10T12:21:42Z)
- Representing Partial Programs with Blended Abstract Semantics [62.20775388513027]
We introduce a technique for representing partially written programs in a program synthesis engine.
We learn an approximate execution model implemented as a modular neural network.
We show that these hybrid neuro-symbolic representations enable execution-guided synthesizers to use more powerful language constructs.
arXiv Detail & Related papers (2020-12-23T20:40:18Z)
- Latent Programmer: Discrete Latent Codes for Program Synthesis [56.37993487589351]
In many sequence learning tasks, such as program synthesis and document summarization, a key problem is searching over a large space of possible output sequences.
We propose to learn representations of the outputs that are specifically meant for search: rich enough to specify the desired output but compact enough to make search more efficient.
We introduce the Latent Programmer, a program synthesis method that first predicts a discrete latent code from input/output examples, and then generates the program in the target language.
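A minimal two-stage sketch of that idea, assuming PyTorch; the modules and sizes are illustrative stand-ins, and training of the discrete latent code (the paper's focus) is omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_VOCAB, PROG_VOCAB, HID = 16, 100, 64

encode_io = nn.Sequential(nn.Linear(10, HID), nn.ReLU())  # embeds I/O pairs
latent_head = nn.Linear(HID, LATENT_VOCAB)                # stage 1: code
program_head = nn.Linear(HID + LATENT_VOCAB, PROG_VOCAB)  # stage 2: program

io = torch.randn(1, 10)                  # one encoded input/output example
h = encode_io(io)
code = latent_head(h).argmax(-1)         # discrete latent "plan" token
code_1h = F.one_hot(code, LATENT_VOCAB).float()
logits = program_head(torch.cat([h, code_1h], dim=-1))  # program-token logits
# Search can branch over a few latent codes first, then decode programs,
# narrowing the output space before token-level search.
```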
arXiv Detail & Related papers (2020-12-01T10:11:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.