Can Machines Read Coding Manuals Yet? -- A Benchmark for Building Better
Language Models for Code Understanding
- URL: http://arxiv.org/abs/2109.07452v1
- Date: Wed, 15 Sep 2021 17:42:44 GMT
- Title: Can Machines Read Coding Manuals Yet? -- A Benchmark for Building Better
Language Models for Code Understanding
- Authors: Ibrahim Abdelaziz, Julian Dolby, Jamie McCusker, and Kavitha Srinivas
- Abstract summary: We derive a set of benchmarks that assess code understanding based on tasks such as predicting the best answer to a question in a forum post.
We evaluate the performance of current state-of-the-art language models on these tasks and show that there is a significant improvement on each task from fine-tuning.
- Score: 3.98345038769576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Code understanding is an increasingly important application of Artificial
Intelligence. A fundamental aspect of understanding code is understanding text
about code, e.g., documentation and forum discussions. Pre-trained language
models (e.g., BERT) are a popular approach for various NLP tasks, and there are
now a variety of benchmarks, such as GLUE, to help improve the development of
such models for natural language understanding. However, little is known about
how well such models work on textual artifacts about code, and we are unaware
of any systematic set of downstream tasks for such an evaluation. In this
paper, we derive a set of benchmarks (BLANCA - Benchmarks for LANguage models
on Coding Artifacts) that assess code understanding based on tasks such as
predicting the best answer to a question in a forum post, finding related forum
posts, or predicting classes related in a hierarchy from class documentation.
We evaluate the performance of current state-of-the-art language models on
these tasks and show that there is a significant improvement on each task from
fine-tuning. We also show that multi-task training over BLANCA tasks helps
build better language models for code understanding.
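The abstract describes the BLANCA tasks only at a high level. As a rough, hypothetical illustration (not the authors' code or data), one such task, predicting the best answer to a forum question, could be framed as fine-tuning a generic pre-trained encoder over (question, answer) pairs; the checkpoint name and toy examples below are assumptions.
```python
# Minimal sketch, not the BLANCA reference implementation: fine-tune a generic
# pre-trained encoder to score (question, answer) pairs from a forum.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Hypothetical forum data: label 1 marks the accepted ("best") answer.
examples = [
    ("How do I reverse a list in Python?", "Use lst[::-1] or lst.reverse().", 1),
    ("How do I reverse a list in Python?", "You cannot; lists are immutable.", 0),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for question, answer, label in examples:
    batch = tokenizer(question, answer, truncation=True, return_tensors="pt")
    loss = model(**batch, labels=torch.tensor([label])).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```
In practice the paper reports that this kind of task-specific fine-tuning, and multi-task training across the BLANCA tasks, improves over the off-the-shelf models.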
Related papers
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks from their control flow and data flow to bridge the gap between programming languages and natural language.
Experiments and ablations on four datasets covering both C++ and Python validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- Python Code Generation by Asking Clarification Questions [57.63906360576212]
In this work, we introduce a novel and more realistic setup for this task.
We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions.
We collect and introduce a new dataset named CodeClarQA, containing pairs of natural language descriptions and code together with synthetic clarification questions and answers.
arXiv Detail & Related papers (2022-12-19T22:08:36Z)
- Adding Context to Source Code Representations for Deep Learning [13.676416860721877]
We argue that it is beneficial for deep learning models to have access to additional contextual information about the code being analysed.
We present preliminary evidence that encoding context from the call hierarchy along with information from the code itself can improve the performance of a state-of-the-art deep learning model.
arXiv Detail & Related papers (2022-07-30T12:47:32Z)
- Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code [13.15617135394116]
Few-shot learning with large-scale, pre-trained language models is a powerful way to answer questions about code.
This paper studies to what extent a state-of-the-art, pre-trained language model of code, Codex, may serve this purpose.
arXiv Detail & Related papers (2022-06-02T23:15:42Z)
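As a loose illustration of the few-shot setup this paper studies (not the authors' actual prompts or evaluation harness), the sketch below assembles a prompt from a couple of made-up worked examples and a new query; the resulting string would be sent to a pre-trained code model such as a Codex-style completion endpoint.
```python
# Minimal sketch of few-shot prompting for a code task; the examples are made up
# and no particular API is assumed -- the prompt string would be handed to a
# pre-trained code completion model.
few_shot_examples = [
    ("Return the maximum of two numbers.", "def max2(a, b):\n    return a if a > b else b"),
    ("Check whether a string is a palindrome.", "def is_pal(s):\n    return s == s[::-1]"),
]
query = "Count the vowels in a string."

parts = []
for description, solution in few_shot_examples:
    parts.append(f"# Task: {description}\n{solution}\n")
parts.append(f"# Task: {query}\n")  # the model is expected to complete the code here
prompt = "\n".join(parts)
print(prompt)
```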
- Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks [95.06087720086133]
Natural-Instructions v2 is a collection of 1,600+ diverse language tasks and their expert written instructions.
The benchmark covers 70+ distinct task types, such as tagging, in-filling, and rewriting.
This benchmark enables large-scale evaluation of cross-task generalization of the models.
arXiv Detail & Related papers (2022-04-16T03:12:30Z)
- CodeRetriever: Unimodal and Bimodal Contrastive Learning [128.06072658302165]
We propose the CodeRetriever model, which combines unimodal and bimodal contrastive learning to learn function-level code semantic representations.
For unimodal contrastive learning, we design a semantic-guided method to build positive code pairs based on the documentation and function name.
For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build text-code pairs.
arXiv Detail & Related papers (2022-01-26T10:54:30Z)
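To make the bimodal contrastive objective concrete, here is a generic in-batch InfoNCE sketch over paired text and code embeddings; it is an assumed illustration, not code from CodeRetriever, and the encoder outputs are random stand-ins.
```python
# Minimal sketch of bimodal contrastive learning over (docstring, code) pairs.
# `text_emb` and `code_emb` stand in for encoder outputs; every non-matching
# pair in the batch serves as a negative, as in standard InfoNCE.
import torch
import torch.nn.functional as F

batch_size, dim, temperature = 8, 256, 0.05
text_emb = F.normalize(torch.randn(batch_size, dim, requires_grad=True), dim=-1)  # encoder(docstrings)
code_emb = F.normalize(torch.randn(batch_size, dim, requires_grad=True), dim=-1)  # encoder(functions)

logits = text_emb @ code_emb.T / temperature   # pairwise similarities
targets = torch.arange(batch_size)             # i-th text matches i-th code snippet
loss = F.cross_entropy(logits, targets)
loss.backward()  # in real training, gradients flow back into the encoders
```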
- Contrastive Learning for Source Code with Structural and Functional Properties [66.10710134948478]
We present BOOST, a novel self-supervised model to focus pre-training based on the characteristics of source code.
We employ automated, structure-guided code transformation algorithms that generate functionally equivalent code which looks drastically different from the original.
We train our model with a contrastive learning objective that pulls functionally equivalent code closer together and pushes distinct code further apart.
arXiv Detail & Related papers (2021-10-08T02:56:43Z)
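As a loose example of the kind of behaviour-preserving rewrite described above (not BOOST's actual transformation suite), the sketch below renames local variables in a Python function with the standard `ast` module; the original and rewritten functions compute the same thing and could serve as a positive pair for a contrastive objective.
```python
# Minimal sketch of a structure-guided, behaviour-preserving transformation:
# rename local variables so the surface form changes while the function stays
# functionally equivalent.
import ast

source = """
def mean(values):
    total = 0
    for value in values:
        total += value
    return total / len(values)
"""

renames = {"values": "xs", "total": "acc", "value": "x"}

class Renamer(ast.NodeTransformer):
    def visit_Name(self, node):
        node.id = renames.get(node.id, node.id)
        return node
    def visit_arg(self, node):
        node.arg = renames.get(node.arg, node.arg)
        return node

tree = Renamer().visit(ast.parse(source))
print(ast.unparse(tree))  # same behaviour, different-looking code (Python 3.9+)
```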
- CLSEBERT: Contrastive Learning for Syntax Enhanced Code Pre-Trained Model [23.947178895479464]
We propose CLSEBERT, a Contrastive Learning Framework for Syntax Enhanced Code Pre-Trained Model.
In the pre-training stage, we consider the code syntax and hierarchy contained in the Abstract Syntax Tree (AST).
We also introduce two novel pre-training objectives. One is to predict the edges between nodes in the abstract syntax tree, and the other is to predict the types of code tokens.
arXiv Detail & Related papers (2021-08-10T10:08:21Z)
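To make the two objectives concrete, the snippet below enumerates parent-child edges of a Python AST and a type label for each token; this is an illustrative sketch using the standard library, not CLSEBERT's multi-language preprocessing, but the edge-prediction and token-type-prediction objectives would be trained on labels of this kind.
```python
# Minimal sketch of the two kinds of labels described above:
# (1) edges between AST nodes, (2) a type for each code token.
import ast
import io
import tokenize

source = "def add(a, b):\n    return a + b\n"

# (1) parent-child edges in the abstract syntax tree
tree = ast.parse(source)
edges = []
for parent in ast.walk(tree):
    for child in ast.iter_child_nodes(parent):
        edges.append((type(parent).__name__, type(child).__name__))
print(edges[:5])  # e.g. [('Module', 'FunctionDef'), ('FunctionDef', 'arguments'), ...]

# (2) a coarse type label for every token
tokens = tokenize.generate_tokens(io.StringIO(source).readline)
print([(tok.string, tokenize.tok_name[tok.type]) for tok in tokens if tok.string.strip()])
```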
- BERT2Code: Can Pretrained Language Models be Leveraged for Code Search? [0.7953229555481884]
We show that our model learns the inherent relationship between the embedding spaces, and we further probe the scope for improvement.
In this analysis, we show that the quality of the code embedding model is the bottleneck for our model's performance.
arXiv Detail & Related papers (2021-04-16T10:28:27Z)
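As an assumed sketch of the kind of bridge between embedding spaces that BERT2Code studies (not the authors' architecture), the snippet below learns a linear map from a natural-language embedding space into a separate code embedding space and then ranks code snippets by cosine similarity; all embeddings are random placeholders.
```python
# Minimal sketch: learn a mapping from query-text embeddings into a separate
# code-embedding space, then retrieve code by cosine similarity.
# All vectors are random stand-ins for real sentence/code embeddings.
import torch
import torch.nn.functional as F

text_dim, code_dim, n_pairs = 384, 128, 64
text_vecs = torch.randn(n_pairs, text_dim)   # placeholder NL embeddings
code_vecs = torch.randn(n_pairs, code_dim)   # placeholder code embeddings

bridge = torch.nn.Linear(text_dim, code_dim)
optimizer = torch.optim.Adam(bridge.parameters(), lr=1e-3)

for _ in range(100):  # align the i-th text with the i-th code snippet
    loss = 1 - F.cosine_similarity(bridge(text_vecs), code_vecs, dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Retrieval: rank all code snippets for the first query by cosine similarity.
scores = F.cosine_similarity(bridge(text_vecs[:1]), code_vecs, dim=-1)
print(scores.argsort(descending=True)[:5])
```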
- GraphCodeBERT: Pre-training Code Representations with Data Flow [97.00641522327699]
We present GraphCodeBERT, a pre-trained model for programming language that considers the inherent structure of code.
We use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.
We evaluate our model on four tasks, including code search, clone detection, code translation, and code refinement.
arXiv Detail & Related papers (2020-09-17T15:25:56Z)
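The "where-the-value-comes-from" relation can be pictured with a tiny def-use sketch; this is an assumed simplification over flat, straight-line Python code, not GraphCodeBERT's multi-language data-flow extraction pipeline.
```python
# Minimal sketch of "where-the-value-comes-from" data-flow edges: link each
# variable use to the most recent assignment of that name.
# Only flat, straight-line code is handled here.
import ast

source = """
x = load()
y = x + 1
x = y * 2
print(x)
"""

last_def = {}   # variable name -> line of its latest assignment
edges = []      # (name, use_line, def_line)

for stmt in ast.parse(source).body:
    # uses inside the statement come first (the right-hand side is evaluated before assignment)
    for node in ast.walk(stmt):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load) and node.id in last_def:
            edges.append((node.id, node.lineno, last_def[node.id]))
    # then record any names this statement (re)defines
    for node in ast.walk(stmt):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            last_def[node.id] = node.lineno

print(edges)  # [('x', 3, 2), ('y', 4, 3), ('x', 5, 4)]
```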
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.