Related papers: Towards Better Graph Neural Network-based Fault Localization Through Enhanced Code Representation

Towards Better Graph Neural Network-based Fault Localization Through Enhanced Code Representation

URL: http://arxiv.org/abs/2404.04496v5
Date: Tue, 30 Apr 2024 13:42:44 GMT
Title: Towards Better Graph Neural Network-based Fault Localization Through Enhanced Code Representation
Authors: Md Nakhla Rafi, Dong Jae Kim, An Ran Chen, Tse-Hsun Chen, Shaowei Wang,
Abstract summary: We propose a new graph representation, DepGraph, that reduces the complexity of the graph representation by 70% in nodes and edges. We evaluate DepGraph using Defects4j 2.0.0, and it outperforms Grace by locating 20% more faults in Top-1 and improving the Mean First Rank (MFR) and the Mean Average Rank (MAR) by over 50%.
Score: 8.647406441990396
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Automatic software fault localization plays an important role in software quality assurance by pinpointing faulty locations for easier debugging. Coverage-based fault localization, a widely used technique, employs statistics on coverage spectra to rank code based on suspiciousness scores. However, the rigidity of statistical approaches calls for learning-based techniques. Amongst all, Grace, a graph-neural network (GNN) based technique has achieved state-of-the-art due to its capacity to preserve coverage spectra, i.e., test-to-source coverage relationships, as precise abstract syntax-enhanced graph representation, mitigating the limitation of other learning-based technique which compresses the feature representation. However, such representation struggles with scalability due to the increasing complexity of software and associated coverage spectra and AST graphs. In this work, we proposed a new graph representation, DepGraph, that reduces the complexity of the graph representation by 70% in nodes and edges by integrating interprocedural call graph in the graph representation of the code. Moreover, we integrate additional features such as code change information in the graph as attributes so the model can leverage rich historical project data. We evaluate DepGraph using Defects4j 2.0.0, and it outperforms Grace by locating 20% more faults in Top-1 and improving the Mean First Rank (MFR) and the Mean Average Rank (MAR) by over 50% while decreasing GPU memory usage by 44% and training/inference time by 85%. Additionally, in cross-project settings, DepGraph surpasses the state-of-the-art baseline with a 42% higher Top-1 accuracy, and 68% and 65% improvement in MFR and MAR, respectively. Our study demonstrates DepGraph's robustness, achieving state-of-the-art accuracy and scalability for future extension and adoption.

Related papers

GRAIN: Exact Graph Reconstruction from Gradients [5.697251900862886]
Federated learning claims to enable collaborative model training among multiple clients with data privacy. Recent studies have shown the client privacy is still at risk due to the, so called, gradient inversion attacks. We present GRAIN, the first exact gradient inversion attack on graph data in the honest-but-curious setting.
arXiv Detail & Related papers (2025-03-03T18:58:12Z)
Keep It Simple: Towards Accurate Vulnerability Detection for Large Code Graphs [6.236203127696138]
We propose a novel vulnerability detection method, ANGLE, which embodies the hierarchical graph refinement and context-aware graph representation learning. Our method significantly outperforms several other baselines in terms of the accuracy and F1 score. In large code graphs, ANGLE achieves an improvement in accuracy of 34.27%-161.93% compared to the state-of-the-art method, AMPLE.
arXiv Detail & Related papers (2024-12-13T14:27:51Z)
GALA: Graph Diffusion-based Alignment with Jigsaw for Source-free Domain Adaptation [13.317620250521124]
Source-free domain adaptation is a crucial machine learning topic, as it contains numerous applications in the real world. Recent graph neural network (GNN) approaches can suffer from serious performance decline due to domain shift and label scarcity. We propose a novel method named Graph Diffusion-based Alignment with Jigsaw (GALA), tailored for source-free graph domain adaptation.
arXiv Detail & Related papers (2024-10-22T01:32:46Z)
Chasing Fairness in Graphs: A GNN Architecture Perspective [73.43111851492593]
We propose textsfFair textsfMessage textsfPassing (FMP) designed within a unified optimization framework for graph neural networks (GNNs) In FMP, the aggregation is first adopted to utilize neighbors' information and then the bias mitigation step explicitly pushes demographic group node presentation centers together. Experiments on node classification tasks demonstrate that the proposed FMP outperforms several baselines in terms of fairness and accuracy on three real-world datasets.
arXiv Detail & Related papers (2023-12-19T18:00:15Z)
Feature propagation as self-supervision signals on graphs [0.0]
Regularized Graph Infomax (RGI) is a simple yet effective framework for node level self-supervised learning. We show that RGI can achieve state-of-the-art performance regardless of its simplicity.
arXiv Detail & Related papers (2023-03-15T14:20:06Z)
Features Based Adaptive Augmentation for Graph Contrastive Learning [0.0]
Self-Supervised learning aims to eliminate the need for expensive annotation in graph representation learning. We introduce a Feature Based Adaptive Augmentation (FebAA) approach, which identifies and preserves potentially influential features. We successfully improved the accuracy of GRACE and BGRL on eight graph representation learning's benchmark datasets.
arXiv Detail & Related papers (2022-07-05T03:41:20Z)
Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure. We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
GraphCoCo: Graph Complementary Contrastive Learning [65.89743197355722]
Graph Contrastive Learning (GCL) has shown promising performance in graph representation learning (GRL) without the supervision of manual annotations. This paper proposes an effective graph complementary contrastive learning approach named GraphCoCo to tackle the above issue.
arXiv Detail & Related papers (2022-03-24T02:58:36Z)
Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT) GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information. We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
GraphMI: Extracting Private Graph Data from Graph Neural Networks [59.05178231559796]
We present textbfGraph textbfModel textbfInversion attack (GraphMI), which aims to extract private graph data of the training graph by inverting GNN. Specifically, we propose a projected gradient module to tackle the discreteness of graph edges while preserving the sparsity and smoothness of graph features. We design a graph auto-encoder module to efficiently exploit graph topology, node attributes, and target model parameters for edge inference.
arXiv Detail & Related papers (2021-06-05T07:07:52Z)
Hierarchical Adaptive Pooling by Capturing High-order Dependency for Graph Representation Learning [18.423192209359158]
Graph neural networks (GNN) have been proven to be mature enough for handling graph-structured data on node-level graph representation learning tasks. This paper proposes a hierarchical graph-level representation learning framework, which is adaptively sensitive to graph structures.
arXiv Detail & Related papers (2021-04-13T06:22:24Z)
Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning [21.0019144298605]
Existing graph neural networks fed with the complete graph data are not scalable due to limited computation and memory costs. textscSubg-Con is proposed by utilizing the strong correlation between central nodes and their sampled subgraphs to capture regional structure information. Compared with existing graph representation learning approaches, textscSubg-Con has prominent performance advantages in weaker supervision requirements, model learning scalability, and parallelization.
arXiv Detail & Related papers (2020-09-22T01:58:19Z)
Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph. In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points, as well as consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.