Empirical Study of Transformers for Source Code
- URL: http://arxiv.org/abs/2010.07987v2
- Date: Thu, 24 Jun 2021 11:32:30 GMT
- Title: Empirical Study of Transformers for Source Code
- Authors: Nadezhda Chirkova, Sergey Troshin
- Abstract summary: We study the capabilities of Transformers to utilize syntactic information in different tasks.
We show that Transformers are able to make meaningful predictions based purely on syntactic information.
- Score: 14.904366372190943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Initially developed for natural language processing (NLP), Transformers are
now widely used for source code processing, due to the format similarity
between source code and text. In contrast to natural language, source code is
strictly structured, i.e., it follows the syntax of the programming language.
Several recent works develop Transformer modifications for capturing syntactic
information in source code. A drawback of these works is that they are not
compared with one another and are evaluated on different tasks. In this work, we conduct a
thorough empirical study of the capabilities of Transformers to utilize
syntactic information in different tasks. We consider three tasks (code
completion, function naming and bug fixing) and re-implement different
syntax-capturing modifications in a unified framework. We show that
Transformers are able to make meaningful predictions based purely on syntactic
information and underline best practices for taking syntactic information
into account to improve model performance.
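As a concrete illustration of what "purely syntactic information" can mean here, the sketch below reduces a Python snippet to its sequence of AST node types, discarding all identifiers and literals, so a model trained on such sequences sees program structure only. This is an assumption-level example using the standard ast module, not the authors' pipeline; the helper name node_type_sequence is illustrative.
```python
# Minimal sketch (not the paper's code): represent a snippet by its AST node
# types only, so a model sees syntax without identifier or literal information.
import ast

def node_type_sequence(node: ast.AST) -> list[str]:
    """Depth-first sequence of AST node type names; names and values are dropped."""
    types = [type(node).__name__]
    for child in ast.iter_child_nodes(node):
        types.extend(node_type_sequence(child))
    return types

snippet = "def add(a, b):\n    return a + b\n"
print(node_type_sequence(ast.parse(snippet)))
# ['Module', 'FunctionDef', 'arguments', 'arg', 'arg', 'Return', 'BinOp', ...]
```
Sequences of this kind are a simple stand-in for the richer AST encodings that syntax-capturing Transformer modifications typically consume.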
Related papers
- Transformers are Efficient Compilers, Provably [11.459397066286822]
Transformer-based large language models (LLMs) have demonstrated surprisingly robust performance across a wide range of language-related tasks.
In this paper, we take the first steps towards a formal investigation of using transformers as compilers from an expressive power perspective.
We introduce a representative programming language, Mini-Husky, which encapsulates key features of modern C-like languages.
arXiv Detail & Related papers (2024-10-07T20:31:13Z) - Algorithmic Capabilities of Random Transformers [49.73113518329544]
We investigate what functions can be learned by randomly initialized transformers in which only the embedding layers are optimized.
We find that these random transformers can perform a wide range of meaningful algorithmic tasks.
Our results indicate that some algorithmic capabilities are present in transformers even before these models are trained; a minimal sketch of this embedding-only training setup appears after this list.
arXiv Detail & Related papers (2024-10-06T06:04:23Z) - Understanding Code Semantics: An Evaluation of Transformer Models in
Summarization [0.0]
We evaluate the efficacy of code summarization by altering function and variable names.
We introduce adversaries like dead code and commented code across three programming languages.
arXiv Detail & Related papers (2023-10-25T02:41:50Z) - Learning Transformer Programs [78.9509560355733]
We introduce a procedure for training Transformers that are mechanistically interpretable by design.
Instead of compiling human-written programs into Transformers, we design a modified Transformer that can be trained using gradient-based optimization.
The Transformer Programs can automatically find reasonable solutions, performing on par with standard Transformers of comparable size.
arXiv Detail & Related papers (2023-06-01T20:27:01Z) - Which Features are Learned by CodeBert: An Empirical Study of the
BERT-based Source Code Representation Learning [9.469346910848733]
We show that current methods cannot effectively understand the logic of source code.
The representation of source code relies heavily on programmer-defined variable and function names.
arXiv Detail & Related papers (2023-01-20T05:39:26Z) - Source Code Summarization with Structural Relative Position Guided
Transformer [19.828300746504148]
Source code summarization aims at generating concise and clear natural language descriptions of source code.
Recent efforts focus on incorporating the syntax structure of code into neural networks such as Transformer.
We propose a Structural Relative Position guided Transformer, named SCRIPT.
arXiv Detail & Related papers (2022-02-14T07:34:33Z) - On the validity of pre-trained transformers for natural language
processing in the software engineering domain [78.32146765053318]
We compare BERT transformer models trained with software engineering data with transformers based on general domain data.
Our results show that for tasks that require understanding of the software engineering context, pre-training with software engineering data is valuable.
arXiv Detail & Related papers (2021-09-10T08:46:31Z) - Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
arXiv Detail & Related papers (2021-06-13T13:04:46Z) - Text Compression-aided Transformer Encoding [77.16960983003271]
We propose explicit and implicit text compression approaches to enhance the Transformer encoding.
In standard Transformer encoding, backbone information, i.e., the gist of the input text, is not specifically focused on.
Our evaluation on benchmark datasets shows that the proposed explicit and implicit text compression approaches improve results in comparison to strong baselines.
arXiv Detail & Related papers (2021-02-11T11:28:39Z) - Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in
Multitask End-to-End Speech Translation [127.54315184545796]
Speech translation (ST) aims to learn transformations from speech in the source language to text in the target language.
We propose to improve the multitask ST model by utilizing word embedding as the intermediate.
arXiv Detail & Related papers (2020-05-21T14:22:35Z)