MAGNET: A Multi-Graph Attentional Network for Code Clone Detection
- URL: http://arxiv.org/abs/2510.24241v1
- Date: Tue, 28 Oct 2025 09:48:06 GMT
- Title: MAGNET: A Multi-Graph Attentional Network for Code Clone Detection
- Authors: Zixian Zhang, Takfarinas Saber
- Abstract summary: Code clone detection is a fundamental task in software engineering that underpins vulnerability detection and plagiarism analysis. Existing methods often rely on singular representations such as abstract syntax trees (ASTs), control flow graphs (CFGs), and data flow graphs (DFGs). In this study, we propose MAGNET, a multi-graph attentional framework that jointly leverages AST, CFG, and DFG representations to capture syntactic and semantic features of source code.
- Score: 0.14323566945483496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Code clone detection is a fundamental task in software engineering that underpins refactoring, debugging, plagiarism detection, and vulnerability analysis. Existing methods often rely on singular representations such as abstract syntax trees (ASTs), control flow graphs (CFGs), and data flow graphs (DFGs), which capture only partial aspects of code semantics. Hybrid approaches have emerged, but their fusion strategies are typically handcrafted and ineffective. In this study, we propose MAGNET, a multi-graph attentional framework that jointly leverages AST, CFG, and DFG representations to capture syntactic and semantic features of source code. MAGNET integrates residual graph neural networks with node-level self-attention to learn both local and long-range dependencies, introduces a gated cross-attention mechanism for fine-grained inter-graph interactions, and employs Set2Set pooling to fuse multi-graph embeddings into unified program-level representations. Extensive experiments on BigCloneBench and Google Code Jam demonstrate that MAGNET achieves state-of-the-art performance, with overall F1 scores of 96.5% and 99.2% on the two datasets, respectively. Ablation studies confirm the critical contributions of multi-graph fusion and each attentional component. Our code is available at https://github.com/ZixianReid/Multigraph_match
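The abstract names a gated cross-attention mechanism for inter-graph interaction but does not give its equations. The sketch below is only a generic illustration of that idea, not the authors' implementation. It assumes scaled dot-product attention from the nodes of one graph (for example, the AST) to the nodes of another (for example, the CFG), followed by a sigmoid gate that blends the attended context with the original node embedding. All function names, shapes, and the random (untrained) weights are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def gated_cross_attention(h_a, h_b, w_q, w_k, w_v, w_g):
    """Update node embeddings of graph A with context attended from graph B.

    Assumed shapes: h_a is n_a x d, h_b is n_b x d,
    w_q, w_k, w_v are d x d, and w_g is 2d x d.
    """
    d = len(w_q[0])
    q = matmul(h_a, w_q)  # queries come from graph A
    k = matmul(h_b, w_k)  # keys come from graph B
    v = matmul(h_b, w_v)  # values come from graph B
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # n_a x n_b similarity scores
    attn = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    context = matmul(attn, v)  # n_a x d context gathered from graph B
    # Assumption: the gate is a sigmoid over the concatenation [h_a ; context].
    concat = [h_row + c_row for h_row, c_row in zip(h_a, context)]
    gate = [[sigmoid(x) for x in row] for row in matmul(concat, w_g)]
    # Element-wise blend: gate * context + (1 - gate) * original embedding.
    out = [
        [g * c + (1.0 - g) * h for g, c, h in zip(g_row, c_row, h_row)]
        for g_row, c_row, h_row in zip(gate, context, h_a)
    ]
    return out, attn


rng = random.Random(0)
d = 4
ast_nodes = random_matrix(3, d, rng)  # 3 hypothetical AST node embeddings
cfg_nodes = random_matrix(5, d, rng)  # 5 hypothetical CFG node embeddings
w_q, w_k, w_v = (random_matrix(d, d, rng) for _ in range(3))
w_g = random_matrix(2 * d, d, rng)
updated, attn = gated_cross_attention(ast_nodes, cfg_nodes, w_q, w_k, w_v, w_g)
print(len(updated), len(updated[0]))
```

Because the gate lies strictly between 0 and 1, each updated embedding is a per-dimension interpolation between the node's own features and what it gathered from the other graph, which is one common way to let a model decide how much cross-graph information to admit.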
Related papers
- AST-Enhanced or AST-Overloaded? The Surprising Impact of Hybrid Graph Representations on Code Clone Detection [0.0]
Code clones significantly increase software maintenance costs and heighten vulnerability risks. ASTs dominate deep learning-based code clone detection due to their precise syntactic structure representation. Recent studies address this by enriching AST-based representations with semantic graphs.
arXiv Detail & Related papers (2025-06-17T12:35:17Z)
- Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [79.75818239774952]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information. Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system. We propose Align-GRAG, a novel reasoning-guided dual alignment framework in the post-retrieval phase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z)
- Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations with question-driven semantic graph partitioning and collaborative subgraph retrieval. The framework first creates a Semantic Partitioning of Linked Information, then uses the Type-Specialized knowledge base to achieve Multi-Agent RAG. The attribute-aware graph segmentation divides knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types. A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verifications.
arXiv Detail & Related papers (2025-05-20T06:44:34Z)
- DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut [55.21950038225407]
Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks. In this paper, we use a diffusion UNet encoder as a foundation vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method. Our work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks.
arXiv Detail & Related papers (2024-06-05T01:32:31Z)
- Meta-Path Based Attentional Graph Learning Model for Vulnerability Detection [21.10614864296154]
We propose a Meta-path based Attentional Graph learning model for code vulNErability deTection, called MAGNET.
A meta-path based hierarchical attentional graph neural network is also proposed to capture the relations between distant nodes in the graph.
We evaluate MAGNET on three public datasets and the results show that MAGNET outperforms the best baseline method in terms of F1 score by 6.32%, 21.50%, and 25.40%, respectively.
arXiv Detail & Related papers (2022-12-29T11:51:47Z)
- RU-Net: Regularized Unrolling Network for Scene Graph Generation [92.95032610978511]
Scene graph generation (SGG) aims to detect objects and predict the relationships between each pair of objects.
Existing SGG methods usually suffer from several issues, including 1) ambiguous object representations, and 2) low diversity in relationship predictions.
We propose a regularized unrolling network (RU-Net) to address both problems.
arXiv Detail & Related papers (2022-05-03T04:21:15Z)
- MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs [55.66953093401889]
We introduce a masked graph autoencoder (MGAE) framework to perform effective learning on graph-structured data.
Taking insights from self-supervised learning, we randomly mask a large proportion of edges and try to reconstruct these missing edges during training.
arXiv Detail & Related papers (2022-01-07T16:48:07Z)
- GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization [0.0]
We propose a novel method, GN-Transformer, to learn end-to-end on a fused sequence and graph modality.
The proposed methods achieve state-of-the-art performance in two code summarization datasets and across three automatic code summarization metrics.
arXiv Detail & Related papers (2021-11-17T02:51:37Z)
- Learning Multi-Granular Hypergraphs for Video-Based Person Re-Identification [110.52328716130022]
Video-based person re-identification (re-ID) is an important research topic in computer vision.
We propose a novel graph-based framework, namely Multi-Granular Hypergraph (MGH), to achieve better representational capabilities.
MGH achieves 90.0% top-1 accuracy on MARS, outperforming state-of-the-art schemes.
arXiv Detail & Related papers (2021-04-30T11:20:02Z)
- Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree [30.484662671342935]
We build a graph representation of programs called flow-augmented abstract syntax tree (FA-AST).
We apply two different types of graph neural networks on FA-AST to measure the similarity of code pairs.
Our approach outperforms the state-of-the-art approaches on both Google Code Jam and BigCloneBench tasks.
arXiv Detail & Related papers (2020-02-20T10:18:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.