Graph Conditioned Sparse-Attention for Improved Source Code
Understanding
- URL: http://arxiv.org/abs/2112.00663v2
- Date: Fri, 3 Dec 2021 17:28:40 GMT
- Title: Graph Conditioned Sparse-Attention for Improved Source Code
Understanding
- Authors: Junyan Cheng, Iordanis Fostiropoulos and Barry Boehm
- Abstract summary: We propose the conditioning of a source code snippet with its graph modality by using the graph adjacency matrix as an attention mask for a sparse self-attention mechanism.
Our model reaches state-of-the-art results in BLEU, METEOR, and ROUGE-L metrics for the code summarization task and near state-of-the-art accuracy in the variable misuse task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformer architectures have been successfully used in learning source code
representations. The fusion between a graph representation like Abstract Syntax
Tree (AST) and a source code sequence makes the use of current approaches
computationally intractable for large input sequence lengths. Source code can
have long-range dependencies that require larger sequence lengths to model
effectively. Current approaches have a quadratic growth in computational and
memory costs with respect to the sequence length. Using such models in
practical scenarios is difficult. In this work, we propose the conditioning of
a source code snippet with its graph modality by using the graph adjacency
matrix as an attention mask for a sparse self-attention mechanism and the use
of a graph diffusion mechanism to model longer-range token dependencies. Our
model reaches state-of-the-art results in BLEU, METEOR, and ROUGE-L metrics for
the code summarization task and near state-of-the-art accuracy in the variable
misuse task. The memory use and inference time of our model have linear growth
with respect to the input sequence length as compared to the quadratic growth
of previous works.
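The mechanism described in the abstract can be illustrated with a short, self-contained sketch: the (diffused) AST adjacency matrix is used as a mask over the self-attention logits, so a token can only attend to tokens it is connected to in the graph. This is a minimal illustration, not the paper's implementation; the names `graph_diffusion`, `graph_masked_attention`, and `num_hops` are assumptions, the power-series diffusion is a stand-in for the paper's diffusion operator, and the dense mask shown here does not by itself yield the linear memory growth reported above, which would require an actual sparse-attention kernel.

```python
import torch
import torch.nn.functional as F

def graph_diffusion(adj: torch.Tensor, num_hops: int = 2) -> torch.Tensor:
    """Widen a 0/1 adjacency mask so tokens reachable within `num_hops`
    edges can attend to each other (a simple power-series stand-in for
    the paper's graph diffusion mechanism)."""
    reach = adj.clone()
    power = adj.clone()
    for _ in range(num_hops - 1):
        power = (power @ adj).clamp(max=1.0)   # nodes reachable in one more hop
        reach = (reach + power).clamp(max=1.0)
    return reach

def graph_masked_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                           adj: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention with the adjacency matrix as a mask:
    position i may only attend to position j when adj[i, j] == 1."""
    scores = (q @ k.transpose(-2, -1)) / q.size(-1) ** 0.5
    scores = scores.masked_fill(adj == 0, float("-inf"))  # block non-adjacent pairs
    return F.softmax(scores, dim=-1) @ v

# Toy usage: 5 tokens with a chain-shaped, AST-like adjacency (plus self-loops).
n, d = 5, 8
q = k = v = torch.randn(n, d)
adj = torch.eye(n)
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
out = graph_masked_attention(q, k, v, graph_diffusion(adj, num_hops=2))
print(out.shape)  # torch.Size([5, 8])
```

In a full model the mask would presumably be applied per attention head inside each Transformer layer, with the diffusion step controlling how far beyond immediate AST neighbours each token may attend.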
Related papers
- Learning Long Range Dependencies on Graphs via Random Walks [6.7864586321550595]
Message-passing graph neural networks (GNNs) excel at capturing local relationships but struggle with long-range dependencies in graphs.
Graph transformers (GTs) enable global information exchange but often oversimplify the graph structure by representing graphs as sets of fixed-length vectors.
This work introduces a novel architecture that overcomes the shortcomings of both approaches by combining the long-range information of random walks with local message passing.
arXiv Detail & Related papers (2024-06-05T15:36:57Z) - Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration [54.897493351694195]
We propose a novel parallel decoding approach, namely hidden transfer, which decodes multiple successive tokens simultaneously in a single forward pass.
In terms of acceleration metrics, we outperform all the single-model acceleration techniques, including Medusa and Self-Speculative decoding.
arXiv Detail & Related papers (2024-04-18T09:17:06Z) - LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory [63.41820940103348]
The self-attention mechanism's computational cost limits its practicality for long sequences.
We propose a new method called LongVQ to compress the global abstraction into a fixed-length codebook.
LongVQ effectively maintains dynamic global and local patterns, which helps to address the lack of long-range dependency modeling.
arXiv Detail & Related papers (2024-04-17T08:26:34Z) - Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective
State Spaces [4.928791850200171]
We introduce Graph-Mamba, the first attempt to enhance long-range context modeling in graph networks.
We formulate graph-centric node prioritization and permutation strategies to enhance context-aware reasoning.
Experiments on ten benchmark datasets demonstrate that Graph-Mamba outperforms state-of-the-art methods in long-range graph prediction tasks.
arXiv Detail & Related papers (2024-02-01T17:21:53Z) - DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z) - Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z) - Gransformer: Transformer-based Graph Generation [14.161975556325796]
Gransformer is a Transformer-based algorithm for generating graphs.
We modify the Transformer encoder to exploit the structural information of the given graph.
We also introduce a graph-based familiarity measure between node pairs.
arXiv Detail & Related papers (2022-03-25T14:05:12Z) - GN-Transformer: Fusing Sequence and Graph Representation for Improved
Code Summarization [0.0]
We propose a novel method, GN-Transformer, to learn end-to-end on a fused sequence and graph modality.
The proposed method achieves state-of-the-art performance on two code summarization datasets and across three automatic code summarization metrics.
arXiv Detail & Related papers (2021-11-17T02:51:37Z) - Auto-decoding Graphs [91.3755431537592]
The generative model is an auto-decoder that learns to synthesize graphs from latent codes.
Graphs are synthesized using self-attention modules that are trained to identify likely connectivity patterns.
arXiv Detail & Related papers (2020-06-04T14:23:01Z) - A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens.
We show that, despite its simplicity, the approach outperforms state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.