Related papers: GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization

URL: http://arxiv.org/abs/2111.08874v1
Date: Wed, 17 Nov 2021 02:51:37 GMT
Title: GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization
Authors: Junyan Cheng, Iordanis Fostiropoulos, and Barry Boehm
Abstract summary: We propose a novel method, GN-Transformer, to learn end-to-end on a fused sequence and graph modality. The proposed methods achieve state-of-the-art performance on two code summarization datasets and across three automatic code summarization metrics.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As opposed to natural languages, source code understanding is influenced by grammatical relationships between tokens regardless of their identifier name. Graph representations of source code, such as the Abstract Syntax Tree (AST), can capture relationships between tokens that are not obvious from the source code. We propose a novel method, GN-Transformer, to learn end-to-end on a fused sequence and graph modality we call Syntax-Code-Graph (SCG). GN-Transformer expands on the Graph Networks (GN) framework using a self-attention mechanism. SCG is the result of the early fusion between a source code snippet and the AST representation. We perform experiments on the structure of SCG, an ablation study on the model design, and the hyper-parameters to conclude that the performance advantage is from the fused representation. The proposed methods achieve state-of-the-art performance on two code summarization datasets and across three automatic code summarization metrics (BLEU, METEOR, ROUGE-L). We further evaluate the human-perceived quality of our model and previous work with an expert-user study. Our model outperforms the state-of-the-art in human-perceived quality and accuracy.
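
Illustration: the abstract describes SCG as an early fusion of a code snippet (a token sequence) with its AST. The sketch below is not the authors' construction; it is a minimal toy example of the general idea, assuming Python source as input and using only the standard library. The function name build_fused_graph and the edge labels next_token, ast_child, and ast_to_token are hypothetical names chosen for this example.

```python
import ast
import io
import tokenize

# AST helper classes that CPython may share between parents (Load, Add, ...).
# They are skipped so that every graph node has a single, well-defined parent.
_SKIPPED = (ast.expr_context, ast.operator, ast.unaryop, ast.boolop, ast.cmpop)
_TOKEN_TYPES = (tokenize.NAME, tokenize.NUMBER, tokenize.STRING, tokenize.OP)


def build_fused_graph(source):
    """Return (nodes, edges) for a toy fused token-sequence + AST graph.

    nodes: list of (kind, label), where kind is "token" or "ast".
    edges: list of (src_index, dst_index, edge_type).
    Assumes ASCII source, so AST byte offsets equal token column offsets.
    """
    nodes, edges = [], []

    # 1. Sequence modality: one node per lexical token, chained in order.
    token_at = {}  # (line, column) -> node index
    previous = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type not in _TOKEN_TYPES:
            continue  # drop NEWLINE, INDENT, DEDENT, ENDMARKER, comments
        index = len(nodes)
        nodes.append(("token", tok.string))
        token_at[tok.start] = index
        if previous is not None:
            edges.append((previous, index, "next_token"))
        previous = index

    # 2. Graph modality: one node per AST node, with parent -> child edges.
    tree = ast.parse(source)
    ast_nodes = [n for n in ast.walk(tree) if not isinstance(n, _SKIPPED)]
    index_of = {}
    for node in ast_nodes:
        index_of[id(node)] = len(nodes)
        nodes.append(("ast", type(node).__name__))
    for node in ast_nodes:
        for child in ast.iter_child_nodes(node):
            if id(child) in index_of:
                edges.append((index_of[id(node)], index_of[id(child)], "ast_child"))

    # 3. Fusion: link each AST node to the token starting at its position.
    for node in ast_nodes:
        position = (getattr(node, "lineno", None), getattr(node, "col_offset", None))
        if position in token_at:
            edges.append((index_of[id(node)], token_at[position], "ast_to_token"))

    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_fused_graph("def add(a, b):\n    return a + b\n")
    for src, dst, kind in edges:
        print(kind, nodes[src], "->", nodes[dst])
```

The result is a single graph in which token order and syntactic structure are both available to a downstream model. How the paper itself connects tokens and AST nodes, and which edge types it uses, is not stated in the abstract.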

Related papers

AST-Enhanced or AST-Overloaded? The Surprising Impact of Hybrid Graph Representations on Code Clone Detection [0.0]
Code clones significantly increase software maintenance costs and heighten vulnerability risks. ASTs dominate deep learning-based code clone detection due to their precise syntactic structure representation. Recent studies address this by enriching AST-based representations with semantic graphs.
arXiv Detail & Related papers (2025-06-17T12:35:17Z)
Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [75.9865035064794]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information. Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system. We propose Align-GRAG, a novel reasoning-guided dual alignment framework in the post-retrieval phase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z)
GNN-Coder: Boosting Semantic Code Retrieval with Combined GNNs and Transformer [15.991615273248804]
We introduce GNN-Coder, a novel framework based on Graph Neural Networks (GNNs) to utilize the Abstract Syntax Tree (AST). GNN-Coder significantly boosts retrieval performance, with a 1%-10% improvement in MRR on the CSN dataset, and a notable 20% gain in zero-shot performance on the CosQA dataset.
arXiv Detail & Related papers (2025-02-21T04:29:53Z)
Efficient Graph Similarity Computation with Alignment Regularization [7.143879014059894]
Graph similarity computation (GSC) is a learning-based prediction task using Graph Neural Networks (GNNs). We show that high-quality learning can be attained with a simple yet powerful regularization technique, which we call Alignment Regularization (AReg). In the inference stage, the graph-level representations learned by the GNN encoder are used directly to compute the similarity score, without applying AReg again, to speed up inference.
arXiv Detail & Related papers (2024-06-21T07:37:28Z)
CONCORD: Towards a DSL for Configurable Graph Code Representation [3.756550107432323]
We introduce CONCORD, a domain-specific language to build customizable graph representations. We demonstrate its effectiveness in code smell detection as an illustrative use case. CONCORD will help researchers create and experiment with customizable graph-based code representations.
arXiv Detail & Related papers (2024-01-31T16:16:48Z)
Neural Machine Translation with Dynamic Graph Convolutional Decoder [32.462919670070654]
We propose an end-to-end translation architecture from the (graph & sequence) structural inputs to the (graph & sequence) outputs, where the target translation and its corresponding syntactic graph are jointly modeled and generated. We conduct extensive experiments on five widely acknowledged translation benchmarks, verifying our proposal achieves consistent improvements over baselines and other syntax-aware variants.
arXiv Detail & Related papers (2023-05-28T11:58:07Z)
Transforming Visual Scene Graphs to Image Captions [69.13204024990672]
We propose to transform Scene Graphs (TSG) into more descriptive captions. In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs. In TSG, each expert is built on MHA, for discriminating the graph embeddings to generate different kinds of words.
arXiv Detail & Related papers (2023-05-03T15:18:37Z)
Training Free Graph Neural Networks for Graph Matching [103.45755859119035]
TFGM is a framework to boost the performance of Graph Neural Network (GNN)-based graph matching without training. Applying TFGM on various GNNs shows promising improvements over baselines.
arXiv Detail & Related papers (2022-01-14T09:04:46Z)
MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs [55.66953093401889]
We present a masked graph autoencoder (MGAE) framework to perform effective learning on graph-structured data. Taking insights from self-supervised learning, we randomly mask a large proportion of edges and try to reconstruct these missing edges during training.
arXiv Detail & Related papers (2022-01-07T16:48:07Z)
Joint Graph Learning and Matching for Semantic Feature Correspondence [69.71998282148762]
We propose a joint graph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching. The proposed method is evaluated on three popular visual matching benchmarks (Pascal VOC, Willow Object and SPair-71k). It outperforms previous state-of-the-art graph matching methods by significant margins on all benchmarks.
arXiv Detail & Related papers (2021-09-01T08:24:02Z)
deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search [15.19181807445119]
We propose a learnable deep Graph for Code Search (called deGraphCS) to transfer source code into variable-based flow graphs. We collect a large-scale dataset from GitHub containing 41,152 code snippets written in the C language.
arXiv Detail & Related papers (2021-03-24T06:57:44Z)
A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens. We show that although the approach is simple, it outperforms the state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)
Improved Code Summarization via a Graph Neural Network [96.03715569092523]
In general, source code summarization techniques use the source code as input and output a natural language description. We present an approach that uses a graph-based neural architecture that better matches the default structure of the AST to generate these summaries.
arXiv Detail & Related papers (2020-04-06T17:36:42Z)
Self-Constructing Graph Convolutional Networks for Semantic Labeling [23.623276007011373]
We propose a novel architecture called the Self-Constructing Graph (SCG), which makes use of learnable latent variables to generate embeddings. SCG can automatically obtain optimized non-local context graphs from complex-shaped objects in aerial imagery. We demonstrate the effectiveness and flexibility of the proposed SCG on the publicly available ISPRS Vaihingen dataset.
arXiv Detail & Related papers (2020-03-15T21:55:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.