Context-Augmented Code Generation Using Programming Knowledge Graphs
- URL: http://arxiv.org/abs/2601.20810v1
- Date: Wed, 28 Jan 2026 17:58:30 GMT
- Title: Context-Augmented Code Generation Using Programming Knowledge Graphs
- Authors: Shahd Seddik, Fahd Seddik, Iman Saberi, Fatemeh Fard, Minh Hieu Huynh, Patanamon Thongtanunam,
- Abstract summary: Large Language Models (LLMs) excel at code generation but struggle with complex problems.<n>Retrieval-Augmented Generation (RAG) mitigates this issue by integrating external knowledge.<n>We propose Programming Knowledge Graph (PKG) for semantic representation and fine-grained retrieval of code and text.
- Score: 1.4367226581254677
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) excel at code generation but struggle with complex problems. Retrieval-Augmented Generation (RAG) mitigates this issue by integrating external knowledge, yet retrieval models often miss relevant context, and generation models hallucinate with irrelevant data. We propose Programming Knowledge Graph (PKG) for semantic representation and fine-grained retrieval of code and text. Our approach enhances retrieval precision through tree pruning and mitigates hallucinations via a re-ranking mechanism that integrates non-RAG solutions. Structuring external data into finer-grained nodes improves retrieval granularity. Evaluations on HumanEval and MBPP show up to 20% pass@1 accuracy gains and a 34% improvement over baselines on MBPP. Our findings demonstrate that our proposed PKG approach along with re-ranker effectively address complex problems while maintaining minimal negative impact on solutions that are already correct without RAG. The replication package is published at https://github.com/iamshahd/ProgrammingKnowledgeGraph
Related papers
- Bringing Reasoning to Generative Recommendation Through the Lens of Cascaded Ranking [107.09842504618369]
Generative Recommendation (GR) has become a promising end-to-end approach with high FLOPS utilization for resource-efficient recommendation.<n>We show that current GR models suffer from a critical textbfbias amplification issue, where token-level bias escalates as token generation progresses.<n>To combat the bias amplification issue, it is crucial for GR to 1) incorporate more heterogeneous information, and 2) allocate greater computational resources at each token generation step.
arXiv Detail & Related papers (2026-02-03T16:10:54Z) - Graph-Anchored Knowledge Indexing for Retrieval-Augmented Generation [53.42323544075114]
We propose GraphAnchor, a novel Graph-Anchored Knowledge Indexing approach.<n> Experiments on four multi-hop question answering benchmarks demonstrate the effectiveness of GraphAnchor.
arXiv Detail & Related papers (2026-01-23T05:41:05Z) - TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework [62.66056331998838]
TeaRAG is a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps.<n>Our reward function evaluates the knowledge sufficiency by a knowledge matching mechanism, while penalizing excessive reasoning steps.
arXiv Detail & Related papers (2025-11-07T16:08:34Z) - Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [79.75818239774952]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information.<n>Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system.<n>We propose Align-GRAG, a novel reasoning-guided dual alignment framework in post-retrieval phrase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z) - Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations with question-driven semantic graph partitioning and collaborative subgraph retrieval.<n>The innovative framework first create Semantic Partitioning of Linked Information, then use the Type-Specialized knowledge base to achieve Multi-Agent RAG.<n>The attribute-aware graph segmentation manages to divide knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types.<n>A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verifications.
arXiv Detail & Related papers (2025-05-20T06:44:34Z) - Pseudo-Knowledge Graph: Meta-Path Guided Retrieval and In-Graph Text for RAG-Equipped LLM [8.941718961724984]
Pseudo-Knowledge Graph (PKG) framework integrates Meta-path Retrieval, In-graph Text and Vector Retrieval into Large Language Models.<n> PKG offers a richer knowledge representation and improves accuracy in information retrieval.
arXiv Detail & Related papers (2025-03-01T02:39:37Z) - Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation [73.9145653659403]
We show that Generative Error Correction models struggle to generalize beyond the specific types of errors encountered during training.
We propose DARAG, a novel approach designed to improve GEC for ASR in in-domain (ID) and OOD scenarios.
Our approach is simple, scalable, and both domain- and language-agnostic.
arXiv Detail & Related papers (2024-10-17T04:00:29Z) - Context-Augmented Code Generation Using Programming Knowledge Graphs [0.0]
Large Language Models (LLMs) and Code-LLMs (CLLMs) frequently face difficulties when dealing with challenging and complex problems.<n>We present a novel framework that leverages a Programming Knowledge Graph (PKG) to semantically represent and retrieve code.
arXiv Detail & Related papers (2024-10-09T16:35:41Z) - Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering [0.0]
Large Language Models (LLM) and Knowledge Graphs (KG) are combined to improve the accuracy and reliability of question-answering systems.
Our method incorporates a query checker that ensures the syntactical and semantic validity of LLM-generated queries.
To make this approach accessible, a user-friendly web-based interface has been developed.
arXiv Detail & Related papers (2024-09-06T10:49:46Z) - Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation [7.071677694758966]
We introduce Mix-of-Granularity (MoG), a method that determines the optimal granularity of a knowledge source based on input queries using a router.<n>We extend MoG to MoG-Graph (MoGG), where reference documents are pre-processed as graphs, enabling the retrieval of distantly situated snippets.<n>Experiments demonstrate that MoG and MoGG effectively predict optimal granularity levels, significantly enhancing the performance of the RAG system in downstream tasks.
arXiv Detail & Related papers (2024-06-01T14:45:03Z) - Corrective Retrieval Augmented Generation [36.04062963574603]
Retrieval-augmented generation (RAG) relies heavily on relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong.
We propose the Corrective Retrieval Augmented Generation (CRAG) to improve the robustness of generation.
CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches.
arXiv Detail & Related papers (2024-01-29T04:36:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.