IssueCourier: Multi-Relational Heterogeneous Temporal Graph Neural Network for Open-Source Issue Assignment
- URL: http://arxiv.org/abs/2505.11205v3
- Date: Tue, 10 Jun 2025 07:57:26 GMT
- Title: IssueCourier: Multi-Relational Heterogeneous Temporal Graph Neural Network for Open-Source Issue Assignment
- Authors: Chunying Zhou, Xiaoyuan Xie, Gong Chen, Peng He, Bing Li
- Abstract summary: Issue assignment plays a critical role in open-source software (OSS) maintenance. We propose IssueCourier, a novel Multi-Relational Heterogeneous Temporal Graph Neural Network approach for issue assignment. We show that IssueCourier can improve over the best baseline by up to 45.49% in top-1 and 31.97% in MRR.
- Score: 5.1987901165589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Issue assignment plays a critical role in open-source software (OSS) maintenance, which involves recommending the most suitable developers to address the reported issues. Given the high volume of issue reports in large-scale projects, manually assigning issues is tedious and costly. Previous studies have proposed automated issue assignment approaches that primarily focus on modeling issue report textual information, developers' expertise, or interactions between issues and developers based on historical issue-fixing records. However, these approaches often suffer from performance limitations due to the presence of incorrect and missing labels in OSS datasets, as well as the long tail of developer contributions and the changes of developer activity as the project evolves. To address these challenges, we propose IssueCourier, a novel Multi-Relational Heterogeneous Temporal Graph Neural Network approach for issue assignment. Specifically, we formalize five key relationships among issues, developers, and source code files to construct a heterogeneous graph. Then, we further adopt a temporal slicing technique that partitions the graph into a sequence of time-based subgraphs to learn stage-specific patterns. Furthermore, we provide a benchmark dataset with relabeled ground truth to address the problem of incorrect and missing labels in existing OSS datasets. Finally, to evaluate the performance of IssueCourier, we conduct extensive experiments on our benchmark dataset. The results show that IssueCourier can improve over the best baseline by up to 45.49% in top-1 and 31.97% in MRR.
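The temporal slicing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how slice boundaries are chosen or name the five relationships, so the equal-width time windows, the `Edge` type, the `slice_by_time` helper, and the relation labels (`reports`, `fixes`, `touches`, `comments`) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    src: str        # source node id, e.g. "dev:alice"
    relation: str   # hypothetical relation label, not taken from the paper
    dst: str        # destination node id, e.g. "issue:1"
    timestamp: int  # event time, e.g. days since project start


def slice_by_time(edges, num_slices):
    """Partition timestamped edges into equal-width time windows.

    Assumption: windows have equal width; the paper may slice differently.
    Returns a list of num_slices mappings: relation -> list of (src, dst).
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    slices = [defaultdict(list) for _ in range(num_slices)]
    if not edges:
        return slices
    t_min = min(e.timestamp for e in edges)
    t_max = max(e.timestamp for e in edges)
    # Fall back to width 1 when all edges share one timestamp.
    width = (t_max - t_min) / num_slices or 1
    for e in edges:
        # Clamp so the latest edge lands in the last slice.
        idx = min(int((e.timestamp - t_min) / width), num_slices - 1)
        slices[idx][e.relation].append((e.src, e.dst))
    return slices


# Toy heterogeneous edge list over developers, issues, and source files.
edges = [
    Edge("dev:alice", "reports", "issue:1", 0),
    Edge("dev:bob", "fixes", "issue:1", 10),
    Edge("issue:1", "touches", "file:a.py", 12),
    Edge("dev:alice", "fixes", "issue:2", 55),
    Edge("dev:bob", "comments", "issue:2", 90),
]
slices = slice_by_time(edges, 3)
for i, sub in enumerate(slices):
    print(i, dict(sub))
```

Each slice keeps its edges grouped by relation type, so a per-slice heterogeneous subgraph could be built from it and fed to a model that learns stage-specific patterns.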
Related papers
- On Measuring Long-Range Interactions in Graph Neural Networks [24.974333602585368]
Long-range graph tasks are an open problem in graph neural network research. We introduce a range measure for operators on graphs, and validate it with synthetic experiments.
arXiv Detail & Related papers (2025-06-06T10:48:30Z)
- Towards an Interpretable Analysis for Estimating the Resolution Time of Software Issues [1.4039240369201997]
We build an issue monitoring system that extracts the actual effort required to fix issues on a per-project basis. Our approach employs topic modeling to capture issue semantics and leverages metadata for interpretable resolution time analysis.
arXiv Detail & Related papers (2025-05-02T08:38:59Z)
- Automated Bug Report Prioritization in Large Open-Source Projects [3.9134031118910264]
We propose a novel approach to automated bug prioritization based on the natural language text of the bug reports. We conduct topic modeling using a variant of LDA called TopicMiner-MTM and text classification with the BERT large language model. Experimental results using an existing reference dataset containing 85,156 bug reports of the Eclipse Platform project indicate that we outperform existing approaches in terms of Accuracy, Precision, Recall, and F1-measure of the bug report priority prediction.
arXiv Detail & Related papers (2025-04-22T13:57:48Z)
- Graph-based Approaches and Functionalities in Retrieval-Augmented Generation: A Comprehensive Survey [15.60128530639056]
Large language models (LLMs) struggle with factual errors during inference due to the lack of sufficient training data and up-to-date knowledge. Retrieval-Augmented Generation (RAG) has gained attention as a promising solution to address this limitation of LLMs. This survey offers a novel perspective on the functionality of graphs within RAG and their impact on enhancing performance.
arXiv Detail & Related papers (2025-04-08T03:52:05Z)
- Instance-Aware Graph Prompt Learning [71.26108600288308]
We introduce Instance-Aware Graph Prompt Learning (IA-GPL) in this paper.
The process involves generating intermediate prompts for each instance using a lightweight architecture.
Experiments conducted on multiple datasets and settings showcase the superior performance of IA-GPL compared to state-of-the-art baselines.
arXiv Detail & Related papers (2024-11-26T18:38:38Z)
- Federated Neural Graph Databases [53.03085605769093]
We propose Federated Neural Graph Database (FedNGDB), a novel framework that enables reasoning over multi-source graph-based data while preserving privacy.
Unlike existing methods, FedNGDB can handle complex graph structures and relationships, making it suitable for various downstream tasks.
arXiv Detail & Related papers (2024-02-22T14:57:44Z)
- Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present code that successfully replicates results from six popular and recent graph recommendation models.
We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations.
By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z)
- Supporting the Task-driven Skill Identification in Open Source Project Issue Tracking Systems [0.0]
We investigate automatic labeling of open issues as a strategy to help contributors pick a task to contribute to.
We claim that, once the required skills are identified, contributor candidates can pick a more suitable task.
We applied quantitative studies to analyze the relevance of the labels in an experiment and compare the strategies' relative importance.
arXiv Detail & Related papers (2022-11-02T14:17:22Z)
- GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning [172.36214872466707]
We focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge.
We propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs.
arXiv Detail & Related papers (2021-05-30T12:34:17Z)
- Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)