Related papers: Tutoring LLM into a Better CUDA Optimizer

Tutoring LLM into a Better CUDA Optimizer

URL: http://arxiv.org/abs/2510.16933v1
Date: Sun, 19 Oct 2025 17:09:15 GMT
Title: Tutoring LLM into a Better CUDA Optimizer
Authors: Matyáš Brabec, Jiří Klepl, Michal Töpfer, Martin Kruliš,
Abstract summary: We focus on the capabilities of the most recent reasoning models to generate optimized code for predefined, well-known tasks.<n>Our objective is to determine which types of code optimizations and parallel patterns the LLMs can perform by themselves and whether they can be improved by tutoring.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent leaps in large language models (LLMs) caused a revolution in programming tools (like GitHub Copilot) that can help with code generation, debugging, and even performance optimization. In this paper, we focus on the capabilities of the most recent reasoning models to generate optimized CUDA code for predefined, well-known tasks. Our objective is to determine which types of code optimizations and parallel patterns the LLMs can perform by themselves and whether they can be improved by tutoring (providing more detailed hints and guidelines in the prompt). The generated solutions were evaluated both automatically (for correctness and speedup) and manually (code reviews) to provide a more detailed perspective. We also tried an interactive approach where the LLM can fix its previous mistakes within a session. The results indicate that LLMs are quite skilled coders; however, they require tutoring to reach optimized solutions provided by parallel computing experts.

Related papers

From Large to Small: Transferring CUDA Optimization Expertise via Reasoning Graph [12.73098983668479]
Large language models (LLMs) show strong potential in generating optimized code from sequential code.<n>However, using LLMs in practice faces two major challenges: cloud-based APIs pose risks of code leakage, and local deployment is often computationally expensive and inefficient.<n>These drawbacks have spurred interest in small language models (SLMs), which are more lightweight and privacy-friendly.
arXiv Detail & Related papers (2025-10-22T08:33:44Z)
An Experimental Study of Real-Life LLM-Proposed Performance Improvements [2.503024366864326]
Large Language Models (LLMs) can generate code, but can they generate fast code?<n>We study this question using a dataset of 65 real-world tasks mined from open-source Java programs.
arXiv Detail & Related papers (2025-10-17T10:06:52Z)
Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces [9.880183350366792]
A key challenge in improving parallel program performance is efficiently mapping tasks to processors and data to memory.<n>We introduce a framework that automates mapper development with generative optimization.<n>Our approach finds mappers that surpass expert-written mappers by up to 1.34X speedup across nine benchmarks.
arXiv Detail & Related papers (2024-10-21T04:08:37Z)
Search-Based LLMs for Code Optimization [16.843870288512363]
Code written by developers usually suffers from efficiency problems and contain various performance bugs. Recent work regards the task as a sequence generation problem, and resorts to deep learning (DL) techniques such as large language models (LLMs) We propose a search-based LLMs framework named SBLLM that enables iterative refinement and discovery of improved optimization methods.
arXiv Detail & Related papers (2024-08-22T06:59:46Z)
Should AI Optimize Your Code? A Comparative Study of Classical Optimizing Compilers Versus Current Large Language Models [0.0]
Large Language Models (LLMs) raise intriguing questions about the potential of these AI approaches to revolutionize code optimization.<n>This work aims to answer an essential question for the compiler community: "Can AI-driven models revolutionize the way we approach code optimization?"<n>We present a comparative analysis between three classical optimizing compilers and two recent large language models.
arXiv Detail & Related papers (2024-06-17T23:26:41Z)
A Problem-Oriented Perspective and Anchor Verification for Code Optimization [43.28045750932116]
Large language models (LLMs) have shown remarkable capabilities in solving various programming tasks.<n>This paper investigates the capabilities of LLMs in optimizing code for minimal execution time.
arXiv Detail & Related papers (2024-06-17T16:10:10Z)
LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning [69.95292905263393]
We show that gradient-based and high-level LLMs can effectively collaborate a combined optimization framework.<n>In this paper, we show that these complementary to each other and can effectively collaborate a combined optimization framework.
arXiv Detail & Related papers (2024-05-30T06:24:14Z)
Are Large Language Models Good Prompt Optimizers? [65.48910201816223]
We conduct a study to uncover the actual mechanism of LLM-based Prompt Optimization. Our findings reveal that the LLMs struggle to identify the true causes of errors during reflection, tending to be biased by their own prior knowledge. We introduce a new "Automatic Behavior Optimization" paradigm, which directly optimize the target model's behavior in a more controllable manner.
arXiv Detail & Related papers (2024-02-03T09:48:54Z)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components. CCCS addresses the exploration challenge by breaking the long sequences code generation task into a Curriculum of Code Completion Subtasks. FGO only optimize the model by masking the unexecuted code segments to provide Fine-Grained Optimization. Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
LangProp: A code optimization framework using Large Language Models applied to driving [17.581983909703283]
LangProp is a framework for iteratively optimizing code generated by large language models (LLMs) We show how LangProp can generate interpretable and transparent policies that can be verified and improved in a metric- and data-driven way.
arXiv Detail & Related papers (2024-01-18T18:52:06Z)
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs) It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks. Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
Large Language Models as Optimizers [106.52386531624532]
We propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as prompts. In each optimization step, the LLM generates new solutions from the prompt that contains previously generated solutions with their values. We demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.
arXiv Detail & Related papers (2023-09-07T00:07:15Z)
Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods. This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.