Empirical Study of Transformers for Source Code
- URL: http://arxiv.org/abs/2010.07987v2
- Date: Thu, 24 Jun 2021 11:32:30 GMT
- Title: Empirical Study of Transformers for Source Code
- Authors: Nadezhda Chirkova, Sergey Troshin
- Abstract summary: We study the capabilities of Transformers to utilize syntactic information in different tasks.
We show that Transformers are able to make meaningful predictions based purely on syntactic information.
- Score: 14.904366372190943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Initially developed for natural language processing (NLP), Transformers are
now widely used for source code processing, due to the format similarity
between source code and text. In contrast to natural language, source code is
strictly structured, i.e., it follows the syntax of the programming language.
Several recent works develop Transformer modifications for capturing syntactic
information in source code. A drawback of these works is that they are not
compared with one another and are evaluated on different tasks. In this work, we conduct a
thorough empirical study of the capabilities of Transformers to utilize
syntactic information in different tasks. We consider three tasks (code
completion, function naming and bug fixing) and re-implement different
syntax-capturing modifications in a unified framework. We show that
Transformers are able to make meaningful predictions based purely on syntactic
information and underline best practices for taking syntactic information
into account to improve model performance.
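As a concrete illustration of what "purely syntactic information" can mean here, the sketch below reduces a Python snippet to its sequence of AST node types, discarding all identifiers and literals, so a model trained on such sequences sees program structure only. This is an assumption-level example using the standard ast module, not the authors' pipeline; the helper name node_type_sequence is illustrative.
```python
# Minimal sketch (not the paper's code): represent a snippet by its AST node
# types only, so a model sees syntax without identifier or literal information.
import ast

def node_type_sequence(node: ast.AST) -> list[str]:
    """Depth-first sequence of AST node type names; names and values are dropped."""
    types = [type(node).__name__]
    for child in ast.iter_child_nodes(node):
        types.extend(node_type_sequence(child))
    return types

snippet = "def add(a, b):\n    return a + b\n"
print(node_type_sequence(ast.parse(snippet)))
# ['Module', 'FunctionDef', 'arguments', 'arg', 'arg', 'Return', 'BinOp', ...]
```
Sequences of this kind are a simple stand-in for the richer AST encodings that syntax-capturing Transformer modifications typically consume.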
Related papers
- Transformers are Efficient Compilers, Provably [11.459397066286822]
Transformer-based large language models (LLMs) have demonstrated surprisingly robust performance across a wide range of language-related tasks.
In this paper, we take the first steps towards a formal investigation of using transformers as compilers from an expressive power perspective.
We introduce a representative programming language, Mini-Husky, which encapsulates key features of modern C-like languages.
arXiv Detail & Related papers (2024-10-07T20:31:13Z) - Algorithmic Capabilities of Random Transformers [49.73113518329544]
We investigate what functions can be learned by randomly initialized transformers in which only the embedding layers are optimized.
We find that these random transformers can perform a wide range of meaningful algorithmic tasks.
Our results indicate that some algorithmic capabilities are present in transformers even before these models are trained; a minimal sketch of this embedding-only training setup appears after this list.
arXiv Detail & Related papers (2024-10-06T06:04:23Z) - Understanding Code Semantics: An Evaluation of Transformer Models in
Summarization [0.0]
We evaluate the efficacy of code summarization by altering function and variable names.
We introduce adversaries like dead code and commented code across three programming languages.
arXiv Detail & Related papers (2023-10-25T02:41:50Z) - Learning Transformer Programs [78.9509560355733]
We introduce a procedure for training Transformers that are mechanistically interpretable by design.
Instead of compiling human-written programs into Transformers, we design a modified Transformer that can be trained using gradient-based optimization.
The Transformer Programs can automatically find reasonable solutions, performing on par with standard Transformers of comparable size.
arXiv Detail & Related papers (2023-06-01T20:27:01Z) - Which Features are Learned by CodeBert: An Empirical Study of the
BERT-based Source Code Representation Learning [9.469346910848733]
We show that current methods cannot effectively understand the logic of source code.
The representation of source code relies heavily on programmer-defined variable and function names.
arXiv Detail & Related papers (2023-01-20T05:39:26Z) - Source Code Summarization with Structural Relative Position Guided
Transformer [19.828300746504148]
Source code summarization aims at generating concise and clear natural language descriptions of source code.
Recent efforts focus on incorporating the syntax structure of code into neural networks such as Transformer.
We propose a Structural Relative Position guided Transformer, named SCRIPT.
arXiv Detail & Related papers (2022-02-14T07:34:33Z) - On the validity of pre-trained transformers for natural language
processing in the software engineering domain [78.32146765053318]
We compare BERT transformer models trained with software engineering data with transformers based on general domain data.
Our results show that for tasks that require understanding of the software engineering context, pre-training with software engineering data is valuable.
arXiv Detail & Related papers (2021-09-10T08:46:31Z) - Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
arXiv Detail & Related papers (2021-06-13T13:04:46Z) - Text Compression-aided Transformer Encoding [77.16960983003271]
We propose explicit and implicit text compression approaches to enhance the Transformer encoding.
In standard Transformer encoding, backbone information, i.e., the gist of the input text, is not specifically focused on.
Our evaluation on benchmark datasets shows that the proposed explicit and implicit text compression approaches improve results in comparison to strong baselines.
arXiv Detail & Related papers (2021-02-11T11:28:39Z) - Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in
Multitask End-to-End Speech Translation [127.54315184545796]
Speech translation (ST) aims to learn transformations from speech in the source language to text in the target language.
We propose to improve the multitask ST model by utilizing word embedding as the intermediate.
arXiv Detail & Related papers (2020-05-21T14:22:35Z)