Related papers: SysLLMatic: Large Language Models are Software System Optimizers

URL: http://arxiv.org/abs/2506.01249v1
Date: Mon, 02 Jun 2025 01:57:21 GMT
Title: SysLLMatic: Large Language Models are Software System Optimizers
Authors: Huiyun Peng, Arjun Gupte, Ryan Hasler, Nicholas John Eliopoulos, Chien-Chou Ho, Rishi Mantri, Leo Deng, Konstantin Läufer, George K. Thiruvathukal, James C. Davis
Abstract summary: We present SysLLMatic, a system that integrates Large Language Models with profiling-guided feedback and system performance insights. We evaluate it on three benchmark suites: HumanEval_CPP (competitive programming in C++), SciMark2 (scientific kernels in Java), and DaCapoBench (large-scale software systems in Java).
Score: 2.4416377721219145
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automatic software system optimization can improve software speed, reduce operating costs, and save energy. Traditional approaches to optimization rely on manual tuning and compiler heuristics, limiting their ability to generalize across diverse codebases and system contexts. Recent methods using Large Language Models (LLMs) offer automation to address these limitations, but often fail to scale to the complexity of real-world software systems and applications. We present SysLLMatic, a system that integrates LLMs with profiling-guided feedback and system performance insights to automatically optimize software code. We evaluate it on three benchmark suites: HumanEval_CPP (competitive programming in C++), SciMark2 (scientific kernels in Java), and DaCapoBench (large-scale software systems in Java). Results show that SysLLMatic can improve system performance, including latency, throughput, energy efficiency, memory usage, and CPU utilization. It consistently outperforms state-of-the-art LLM baselines on microbenchmarks. On large-scale application codes, it surpasses traditional compiler optimizations, achieving average relative improvements of 1.85x in latency and 2.24x in throughput. Our findings demonstrate that LLMs, guided by principled systems thinking and appropriate performance diagnostics, can serve as viable software system optimizers. We further identify limitations of our approach and the challenges involved in handling complex applications. This work provides a foundation for generating optimized code across various languages, benchmarks, and program sizes in a principled manner.

Related papers

Do Large Language Models Understand Performance Optimization? [0.9320657506524149]
Large Language Models (LLMs) have emerged as powerful tools for software development tasks such as code completion, translation, and optimization. This paper presents a benchmark suite encompassing multiple critical HPC computational motifs to evaluate the performance of code optimized by state-of-the-art LLMs.
arXiv Detail & Related papers (2025-03-17T23:30:23Z)
Autellix: An Efficient Serving Engine for LLM Agents as General Programs [59.673243129044465]
Large language model (LLM) applications are evolving beyond simple chatbots into dynamic, general-purpose agentic programs. Existing LLM serving systems ignore dependencies between programs and calls, missing significant opportunities for optimization. We introduce Autellix, an LLM serving system that treats programs as first-class citizens to minimize their end-to-end latencies.
arXiv Detail & Related papers (2025-02-19T18:59:30Z)
Tackling the Dynamicity in a Production LLM Serving System with SOTA Optimizations via Hybrid Prefill/Decode/Verify Scheduling on Efficient Meta-kernels [12.77187564450236]
We introduce XY-Serve, a versatile, Ascend native, end-to-end production large language model (LLM) serving system. The core idea is an abstraction mechanism that smooths out the workload variability by decomposing computations into fine-grained meta primitives. For GEMM, we introduce a virtual padding scheme that adapts to dynamic shape changes while using highly efficient GEMM primitives with assorted fixed tile sizes.
arXiv Detail & Related papers (2024-12-24T02:27:44Z)
Training of Scaffolded Language Models with Language Supervision: A Survey [62.59629932720519]
This survey organizes the literature on the design and optimization of emerging structures around post-trained LMs. We refer to this overarching structure as scaffolded LMs and focus on LMs that are integrated into multi-step processes with tools.
arXiv Detail & Related papers (2024-10-21T18:06:25Z)
Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces [9.880183350366792]
A key challenge in improving parallel program performance is efficiently mapping tasks to processors and data to memory. We introduce a framework that automates mapper development with generative optimization. Our approach finds mappers that surpass expert-written mappers by up to 1.34X speedup across nine benchmarks.
arXiv Detail & Related papers (2024-10-21T04:08:37Z)
Should AI Optimize Your Code? A Comparative Study of Classical Optimizing Compilers Versus Current Large Language Models [0.0]
Large Language Models (LLMs) raise intriguing questions about the potential of these AI approaches to revolutionize code optimization. This work aims to answer an essential question for the compiler community: "Can AI-driven models revolutionize the way we approach code optimization?" We present a comparative analysis between three classical optimizing compilers and two recent large language models.
arXiv Detail & Related papers (2024-06-17T23:26:41Z)
A Problem-Oriented Perspective and Anchor Verification for Code Optimization [43.28045750932116]
Large language models (LLMs) have shown remarkable capabilities in solving various programming tasks. This paper investigates the capabilities of LLMs in optimizing code for minimal execution time.
arXiv Detail & Related papers (2024-06-17T16:10:10Z)
CompilerDream: Learning a Compiler World Model for General Code Optimization [58.87557583347996]
We introduce CompilerDream, a model-based reinforcement learning approach to general code optimization. It comprises a compiler world model that accurately simulates the intrinsic properties of optimization passes and an agent trained on this model to produce effective optimization strategies. It excels across diverse datasets, surpassing LLVM's built-in optimizations and other state-of-the-art methods in both settings of value prediction and end-to-end code optimization.
arXiv Detail & Related papers (2024-04-24T09:20:33Z)
Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models [26.2566707495948]
Large Language Models (LLMs) have seen great advances in both academia and industry. We benchmark the end-to-end performance of pre-training, fine-tuning, and serving LLMs of different sizes. Then, we dive deeper to provide a detailed runtime analysis of the sub-modules, including computing and communication operators in LLMs.
arXiv Detail & Related papers (2023-11-07T03:25:56Z)
Large Language Models as Optimizers [106.52386531624532]
We propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers. In each optimization step, the LLM generates new solutions from a prompt that contains previously generated solutions with their values. We demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.
arXiv Detail & Related papers (2023-09-07T00:07:15Z)
Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization. First, we curate a dataset of over 77,000 pairs of competitive C++ programming submissions that capture performance-improving edits made by human programmers. For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we propose performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z)
Learning to Superoptimize Real-world Programs [79.4140991035247]
We propose a framework to learn to superoptimize real-world programs by using neural sequence-to-sequence models. We introduce the Big Assembly benchmark, a dataset consisting of over 25K real-world functions mined from open-source projects in x86-64 assembly.
arXiv Detail & Related papers (2021-09-28T05:33:21Z)
Enabling Retargetable Optimizing Compilers for Quantum Accelerators via a Multi-Level Intermediate Representation [78.8942067357231]
We present a multi-level quantum-classical intermediate representation (IR) that enables an optimizing, retargetable, ahead-of-time compiler. We support the entire gate-based OpenQASM 3 language and provide custom extensions for common quantum programming patterns and improved syntax. Our work results in compile times that are 1000x faster than standard Pythonic approaches, and 5-10x faster than comparative standalone quantum language compilers.
arXiv Detail & Related papers (2021-09-01T17:29:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.