PennyLang: Pioneering LLM-Based Quantum Code Generation with a Novel PennyLane-Centric Dataset
- URL: http://arxiv.org/abs/2503.02497v2
- Date: Fri, 18 Apr 2025 07:46:25 GMT
- Title: PennyLang: Pioneering LLM-Based Quantum Code Generation with a Novel PennyLane-Centric Dataset
- Authors: Abdul Basit, Nouhaila Innan, Haider Asif, Minghao Shao, Muhammad Kashif, Alberto Marchisio, Muhammad Shafique
- Abstract summary: Large Language Models (LLMs) offer remarkable capabilities in code generation, natural language processing, and domain-specific reasoning. We introduce a novel, high-quality dataset comprising 3,347 PennyLane-specific quantum code samples and contextual descriptions.
- Score: 4.826802034066811
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) offer remarkable capabilities in code generation, natural language processing, and domain-specific reasoning. However, their application in quantum software development remains underexplored, particularly for PennyLane, a leading framework for hybrid quantum-classical computing. To address this gap, we introduce a novel, high-quality dataset comprising 3,347 PennyLane-specific quantum code samples and contextual descriptions, specifically curated to support LLM training and fine-tuning for quantum code assistance. Our contributions are threefold: (1) the automatic construction and open-source release of a comprehensive PennyLane dataset derived from textbooks, official documentation, and open-source repositories; (2) a structured methodology for data curation, annotation, and formatting to enhance LLM usability and relevance; and (3) a rigorous evaluation of code generation capabilities using both baseline Retrieval-Augmented Generation (RAG) and a GraphRAG-enhanced pipeline. Using the PennyLang framework, we demonstrate that GraphRAG, when applied to a GPT-4o Mini model, substantially outperforms standard prompting and baseline RAG. Accuracy improves from 20.5% (without RAG) to 58.2% with GraphRAG, showcasing its effectiveness in reducing hallucinations and improving code correctness in quantum programming tasks. Compared to prior efforts focused largely on Qiskit, our work expands LLM-based assistance to the PennyLane ecosystem, contributing practical tools and reproducible methodologies for advancing AI-assisted quantum software development.
Related papers
- Evaluating and Achieving Controllable Code Completion in Code LLM [89.64782747840225]
We present the first instruction-guided code completion benchmark, Controllable Code Completion Benchmark (C3-Bench). We reveal substantial gaps in instruction-following capabilities between open-source and advanced proprietary models during code completion tasks. The resulting model, Qwen2.5-Coder-C3, achieves state-of-the-art performance on C3-Bench.
arXiv Detail & Related papers (2026-01-22T11:40:04Z)
- QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code [52.66657751895655]
Large Language Models (LLMs) offer a compelling new paradigm: Neural Compilation. This paper introduces NeuComBack, a novel benchmark dataset specifically designed for IR-to-assembly compilation. We propose a self-evolving prompt optimization method that enables LLMs to evolve their internal prompt strategies.
arXiv Detail & Related papers (2025-11-03T03:20:26Z)
- PennyCoder: Efficient Domain-Specific LLMs for PennyLane-Based Quantum Code Generation [4.826802034066811]
PennyCoder is a novel framework for quantum code generation designed for local and embedded deployment. Our approach emphasizes device-native operability while maintaining high model efficacy. We rigorously evaluated PennyCoder over a comprehensive quantum programming dataset, achieving 44.3% accuracy.
arXiv Detail & Related papers (2025-07-25T12:02:49Z)
- LIFT: LLM-Based Pragma Insertion for HLS via GNN Supervised Fine-Tuning [38.679497621876926]
LIFT is a large language model (LLM)-based coding assistant for HLS that automatically generates performance-critical pragmas.
We fine-tune the LLM by tightly integrating and supervising the training process with a graph neural network (GNN).
arXiv Detail & Related papers (2025-04-29T21:42:59Z)
- Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't [0.0]
Our study investigates the potential of reinforcement learning to improve reasoning in small LLMs. Training on 4 NVIDIA A40 GPUs (48 GB VRAM each) within 24 hours resulted in rapid reasoning gains. These findings highlight the efficacy of RL-based fine-tuning for small LLMs, offering a cost-effective alternative to large-scale approaches.
arXiv Detail & Related papers (2025-03-20T15:13:23Z)
- Quantizing Large Language Models for Code Generation: A Differentiated Replication [51.85505914274633]
Large Language Models (LLMs) have shown an impressive capability in code generation and, specifically, to automatically implement requirements described in natural language.
LLMs pose significant challenges related to their memory (and, consequently, carbon) footprint.
The new frontier for LLM quantization is 4-bit precision, which yields an average memory footprint reduction of 70%.
arXiv Detail & Related papers (2025-03-10T09:26:08Z)
- SnipGen: A Mining Repository Framework for Evaluating LLMs for Code [51.07471575337676]
Large Language Models (LLMs) are trained on extensive datasets that include code repositories.
Evaluating their effectiveness poses significant challenges due to the potential overlap between the datasets used for training and those employed for evaluation.
We introduce SnipGen, a comprehensive repository mining framework designed to leverage prompt engineering across various downstream tasks for code generation.
arXiv Detail & Related papers (2025-02-10T21:28:15Z)
- Resource-Efficient & Effective Code Summarization [3.512140256677132]
GreenAI techniques, such as QLoRA, offer a promising path for dealing with large models' sustainability.
Our study evaluates two state-of-the-art CLMs across two programming languages: Python and Java.
Results show that QLoRA enables efficient fine-tuning of CLMs for code summarization.
arXiv Detail & Related papers (2025-02-05T21:06:30Z)
- Correctness Assessment of Code Generated by Large Language Models Using Internal Representations [4.32362000083889]
We introduce OPENIA, a novel framework to assess the correctness of code generated by Large Language Models (LLMs).
Our empirical analysis reveals that these internal representations encode latent information, which strongly correlates with the correctness of the generated code.
OPENIA consistently outperforms baseline models, achieving higher accuracy, precision, recall, and F1-Scores with up to a 2X improvement in standalone code generation.
arXiv Detail & Related papers (2025-01-22T15:04:13Z)
- Open or Closed LLM for Lesser-Resourced Languages? Lessons from Greek [2.3499129784547663]
First, we evaluate the performance of open-source (Llama-70b) and closed-source (GPT-4o mini) large language models on seven core NLP tasks with available datasets. Second, we expand the scope of Greek NLP by reframing Authorship Attribution as a tool to assess potential data usage by LLMs in pre-training. Third, we showcase a legal NLP case study, where a Summarize, Translate, and Embed (STE) methodology outperforms the traditional TF-IDF approach for clustering long legal texts.
arXiv Detail & Related papers (2025-01-22T12:06:16Z)
- OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [76.59316249991657]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems.
While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited.
We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z)
- Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation [9.844598565914055]
Large Language Models (LLMs) demonstrate strong reasoning abilities but face limitations such as hallucinations and outdated knowledge.
We introduce SubgraphRAG, which extends the Knowledge Graph (KG)-based Retrieval-Augmented Generation (RAG) framework by retrieving subgraphs.
Our approach innovatively integrates a lightweight multilayer perceptron with a parallel triple-scoring mechanism for efficient and flexible subgraph retrieval.
arXiv Detail & Related papers (2024-10-28T04:39:32Z)
- GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input. GIVE guides the LLM agent to select the most pertinent expert data (observe), engage in query-specific divergent thinking (reflect), and then synthesize this information to produce the final output (speak).
arXiv Detail & Related papers (2024-10-11T03:05:06Z)
- Large Language Models as Code Executors: An Exploratory Study [29.545321608864295]
This paper pioneers the exploration of Large Language Models (LLMs) as code executors.
We are the first to examine this feasibility across various LLMs, including OpenAI's o1, GPT-4o, GPT-3.5, DeepSeek, and Qwen-Coder.
We introduce an Iterative Instruction Prompting (IIP) technique that processes code snippets line by line, enhancing the accuracy of weaker models by an average of 7.22%.
arXiv Detail & Related papers (2024-10-09T08:23:22Z)
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
- Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models [1.8213213818713139]
We introduce and use the Qiskit HumanEval dataset to benchmark the ability of Large Language Models to produce quantum code.
This dataset consists of more than 100 quantum computing tasks, each accompanied by a prompt, a canonical solution, and a difficulty scale to evaluate the correctness of the generated solutions.
arXiv Detail & Related papers (2024-06-20T20:14:22Z)
- Qiskit Code Assistant: Training LLMs for generating Quantum Computing Code [2.0108122340549985]
This paper focuses on training Code LLMs to specialize in the field of quantum computing.
A Code LLM specializing in quantum computing requires a foundational understanding of quantum computing and quantum information theory.
We discuss our work on training Code LLMs to produce high-quality quantum code using the Qiskit library.
arXiv Detail & Related papers (2024-05-29T20:21:00Z)
- An empirical study of LLaMA3 quantization: from LLMs to MLLMs [54.91212829143966]
The LLaMA family is one of the most powerful families of open-source large language models (LLMs). LLaMA3 models have achieved impressive performance in various domains with super-large scale pre-training on over 15T tokens of data. We evaluate the 10 existing post-training quantization and LoRA fine-tuning (LoRA-FT) methods of LLaMA3 on 1-8 bits and various datasets to reveal the low-bit quantization performance of LLaMA3.
arXiv Detail & Related papers (2024-04-22T10:03:03Z)
- Quantum Computing Enhanced Service Ecosystem for Simulation in Manufacturing [56.61654656648898]
We propose a framework for a quantum computing-enhanced service ecosystem for simulation in manufacturing.
We analyse two high-value use cases with the aim of a quantitative evaluation of these new computing paradigms for industrially-relevant settings.
arXiv Detail & Related papers (2024-01-19T11:04:14Z)
- LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system.
We build a novel data-cleaning pipeline that uses these principles to transform existing programs.
We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z)
- Large Language Model-Aware In-Context Learning for Code Generation [75.68709482932903]
Large language models (LLMs) have shown impressive in-context learning (ICL) ability in code generation.
We propose a novel learning-based selection approach named LAIL (LLM-Aware In-context Learning) for code generation.
arXiv Detail & Related papers (2023-10-15T06:12:58Z)
- ShadowNet for Data-Centric Quantum System Learning [188.683909185536]
We propose a data-centric learning paradigm combining the strength of neural-network protocols and classical shadows.
Capitalizing on the generalization power of neural networks, this paradigm can be trained offline and excel at predicting previously unseen systems.
We present the instantiation of our paradigm in quantum state tomography and direct fidelity estimation tasks and conduct numerical analysis up to 60 qubits.
arXiv Detail & Related papers (2023-08-22T09:11:53Z)
- Semi-definite programming and quantum information [0.0]
This paper presents a comprehensive exploration of semi-definite programming (SDP) techniques within the context of quantum information.
It examines the mathematical foundations of convex optimization, duality, and SDP formulations.
By leveraging these tools, researchers and practitioners can characterize classical and quantum correlations, optimize quantum states, and design efficient quantum algorithms and protocols.
arXiv Detail & Related papers (2023-06-28T21:02:06Z)
- On exploring the potential of quantum auto-encoder for learning quantum systems [60.909817434753315]
We devise three effective QAE-based learning protocols to address three classically computational hard learning problems.
Our work sheds new light on developing advanced quantum learning algorithms to accomplish hard quantum physics and quantum information processing tasks.
arXiv Detail & Related papers (2021-06-29T14:01:40Z)
- Quantum Federated Learning with Quantum Data [87.49715898878858]
Quantum machine learning (QML) has emerged as a promising field that leans on the developments in quantum computing to explore large complex machine learning problems.
This paper proposes the first fully quantum federated learning framework that can operate over quantum data and, thus, share the learning of quantum circuit parameters in a decentralized manner.
arXiv Detail & Related papers (2021-05-30T12:19:27Z)
- Quantum Annealing for Semi-Supervised Learning [5.714334716737985]
Semi-supervised learning is a machine learning technique that makes use of both labeled and unlabeled data for training.
We propose and theoretically analyze a graph-based semi-supervised learning method with the aid of the quantum annealing technique.
We illustrate two classification examples, suggesting the feasibility of this method even when only a small portion (20%) of the data is labeled.
arXiv Detail & Related papers (2020-03-27T15:09:44Z)
- Benchmarking Graph Neural Networks [75.42159546060509]
Graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs.
For any successful field to become mainstream and reliable, benchmarks must be developed to quantify progress.
The GitHub repository has reached 1,800 stars and 339 forks, which demonstrates the utility of the proposed open-source framework.
arXiv Detail & Related papers (2020-03-02T15:58:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.