SIT3: Code Summarization with Structure-Induced Transformer
- URL: http://arxiv.org/abs/2012.14710v1
- Date: Tue, 29 Dec 2020 11:37:43 GMT
- Title: SIT3: Code Summarization with Structure-Induced Transformer
- Authors: Hongqiu Wu and Hai Zhao and Min Zhang
- Abstract summary: We propose a novel model based on structure-induced self-attention, which encodes sequential inputs with highly-effective structure modeling.
Our newly-proposed model achieves new state-of-the-art results on popular benchmarks.
- Score: 48.000063280183376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Code summarization (CS) is a promising area of natural language
understanding that aims to automatically generate sensible annotations for
source code, primarily to aid programmers. Previous works apply
structure-based traversal (SBT) or non-sequential models such as Tree-LSTM and
GNN to learn structural program semantics. These approaches share the
following drawbacks: 1) incorporating SBT into the Transformer has proven
ineffective; 2) GNNs are limited in capturing global information; 3) the
Transformer's own ability to capture structural semantics has been
underestimated. In this paper, we propose a novel model based on
structure-induced self-attention, which encodes sequential inputs with highly
effective structure modeling. Extensive experiments show that our newly
proposed model achieves new state-of-the-art results on popular benchmarks. To
the best of our knowledge, it is the first work on code summarization that
uses the Transformer to model structural information with high efficiency and
no extra parameters. We also provide a tutorial on our pre-processing.
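As a rough illustration of how structure-induced self-attention can model structure without extra parameters, the sketch below restricts standard scaled dot-product attention with a fixed boolean adjacency mask derived from the code's structure (e.g., AST edges). This is a minimal sketch of the general idea under assumed tensor shapes; the function name, the self-loop handling, and the edge list in the toy usage are illustrative assumptions, not the authors' implementation.

```python
import torch

def structure_induced_attention(q, k, v, adj_mask):
    """Scaled dot-product attention restricted to structurally adjacent tokens.

    q, k, v:  (batch, heads, seq_len, d_head) projections from an ordinary
              multi-head self-attention layer.
    adj_mask: (batch, seq_len, seq_len) boolean matrix, True where tokens i and
              j are connected in the code structure (e.g., AST edges).
              The mask is fixed, so no learnable parameters are introduced.
    """
    seq_len = adj_mask.size(-1)
    # Add self-loops so every query attends to at least itself
    # (avoids rows that are entirely -inf before the softmax).
    eye = torch.eye(seq_len, dtype=torch.bool, device=adj_mask.device)
    mask = adj_mask | eye
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5      # (B, H, L, L)
    scores = scores.masked_fill(~mask.unsqueeze(1), float("-inf"))
    return scores.softmax(dim=-1) @ v                          # (B, H, L, d_head)

# Toy usage with hypothetical AST edges over four tokens.
B, H, L, D = 1, 2, 4, 8
q = k = v = torch.randn(B, H, L, D)
adj = torch.zeros(B, L, L, dtype=torch.bool)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[0, i, j] = adj[0, j, i] = True
out = structure_induced_attention(q, k, v, adj)                # shape (1, 2, 4, 8)
```

In practice, the structure mask would be produced by pre-processing the source code (e.g., parsing it into an AST), which is the kind of step the paper's pre-processing tutorial addresses.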
Related papers
- Efficient Point Transformer with Dynamic Token Aggregating for Point Cloud Processing [19.73918716354272]
We propose an efficient point TransFormer with Dynamic Token Aggregating (DTA-Former) for point cloud representation and processing.
It achieves SOTA performance while being up to 30x faster than prior point Transformers on the ModelNet40, ShapeNet, and airborne MultiSpectral LiDAR (MS-LiDAR) datasets.
arXiv Detail & Related papers (2024-05-23T20:50:50Z) - Pushdown Layers: Encoding Recursive Structure in Transformer Language Models [86.75729087623259]
Recursion is a prominent feature of human language, and fundamentally challenging for self-attention.
This work introduces Pushdown Layers, a new self-attention layer.
Transformers equipped with Pushdown Layers achieve dramatically better and 3-5x more sample-efficient syntactic generalization.
arXiv Detail & Related papers (2023-10-29T17:27:18Z) - Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages [120.74406230847904]
TP-Transformer augments the traditional Transformer architecture to include an additional component to represent structure.
The second method imbues structure at the data level by segmenting the data with morphological tokenization.
We find that each of these two approaches allows the network to achieve better performance, but this improvement is dependent on the size of the dataset.
arXiv Detail & Related papers (2022-08-11T22:42:24Z) - Source Code Summarization with Structural Relative Position Guided Transformer [19.828300746504148]
Source code summarization aims at generating concise and clear natural language descriptions for programming languages.
Recent efforts focus on incorporating the syntax structure of code into neural networks such as Transformer.
We propose a Structural Relative Position guided Transformer, named SCRIPT.
arXiv Detail & Related papers (2022-02-14T07:34:33Z) - AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z) - GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures [57.46093180685175]
We demonstrate a set of modifications to the structure of a Transformer layer, producing a more efficient architecture.
We add a convolutional module to complement the self-attention module, decoupling the learning of local and global interactions.
We apply the resulting architecture to language representation learning and demonstrate its superior performance compared to BERT models of different scales.
arXiv Detail & Related papers (2021-06-10T15:41:53Z) - Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence [29.442579683405913]
The Transformer-Kernel (TK) model has demonstrated strong reranking performance on the TREC Deep Learning benchmark.
A variant of the TK model -- called TKL -- has been developed that incorporates local self-attention to efficiently process longer input sequences.
In this work, we propose a novel Conformer layer as an alternative approach to scale TK to longer input sequences.
arXiv Detail & Related papers (2021-04-19T15:32:34Z) - Tree-structured Attention with Hierarchical Accumulation [103.47584968330325]
"Hierarchical Accumulation" encodes parse tree structures into self-attention at constant time complexity.
Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT'14 English-German translation task.
arXiv Detail & Related papers (2020-02-19T08:17:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.