Related papers: Leveraging Reinforcement Learning and Large Language Models for Code Optimization

Related papers

Machine Learning Pipeline for Software Engineering: A Systematic Literature Review [0.0]
This systematic literature review examines state-of-the-art Machine Learning pipelines designed for software engineering (SE)<n>Our findings show that robust preprocessing, such as SMOTE for data balancing, improves model reliability.<n> Ensemble methods like Random Forest and Gradient Boosting dominate performance across tasks.<n>New metrics like Best Arithmetic Mean (BAM) are emerging in niche applications.
arXiv Detail & Related papers (2025-07-31T15:37:30Z)
TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree [52.44403214958304]
In this paper, we introduce TreeLoRA, a novel approach that constructs layer-wise adapters by leveraging hierarchical gradient similarity.<n>To reduce the computational burden of task similarity estimation, we employ bandit techniques to develop an algorithm based on lower confidence bounds.<n> experiments on both vision transformers (ViTs) and large language models (LLMs) demonstrate the effectiveness and efficiency of our approach.
arXiv Detail & Related papers (2025-06-12T05:25:35Z)
Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling [3.253908111652627]
Large Language Models (LLMs) often struggle to generate formally correct and usable models against hallucinations.<n>We present a novel framework that significantly improves the authenticity of LLMs for optimization modeling using Reinforcement Learning with Verifiable Reward.
arXiv Detail & Related papers (2025-05-17T02:32:03Z)
A Systematic Literature Review of Parameter-Efficient Fine-Tuning for Large Code Models [2.171120568435925]
Large Language Models (LLMs) for code require significant computational resources for training and fine-tuning. To address this, the research community has increasingly turned to Efficient Fine-Tuning (PEFT) PEFT enables the adaptation of large models by updating only a small subset of parameters, rather than the entire model. Our study synthesizes findings from 27 peer-reviewed papers, identifying patterns in configuration strategies and adaptation trade-offs.
arXiv Detail & Related papers (2025-04-29T16:19:25Z)
Escaping Collapse: The Strength of Weak Data for Large Language Model Training [15.77316232527746]
We develop a theoretical framework to investigate how much curation is needed in order to ensure that LLM performance continually improves. We describe a training procedure that converges to an optimal LLM even if almost all of the non-synthetic training data is of poor quality.
arXiv Detail & Related papers (2025-02-13T03:20:37Z)
Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs [58.18140409409302]
Large Language Models (LLMs) have made substantial strides in structured tasks through Reinforcement Learning (RL) Applying RL in broader domains like chatbots and content generation presents unique challenges. We show a case study of reproducing existing reward model ensemble research using embedding-based reward models.
arXiv Detail & Related papers (2025-02-04T19:37:35Z)
What Makes Large Language Models Reason in (Multi-Turn) Code Generation? [28.614888506962988]
Chain-of-thought has established itself as a popular vehicle for improving the outputs of large language models (LLMs) We investigate the effects of a wide range of prompting strategies with a focus on automatic re-prompting over multiple turns and computational requirements. Our study reveals strategies that consistently improve performance across all models with small and large sampling budgets.
arXiv Detail & Related papers (2024-10-10T16:53:10Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and. Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting. LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
Achieving Peak Performance for Large Language Models: A Systematic Review [0.0]
Large language models (LLMs) have achieved remarkable success in natural language processing (NLP) As models grow into the trillion- parameter range, computational and memory costs increase significantly. This makes it difficult for many researchers to access the resources needed to train or apply these models.
arXiv Detail & Related papers (2024-09-07T13:57:41Z)
LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement [93.38736019287224]
"LLMs-as-Instructors" framework autonomously enhances the training of smaller target models. Inspired by the theory of "Learning from Errors", this framework employs an instructor LLM to meticulously analyze the specific errors within a target model. Within this framework, we implement two strategies: "Learning from Error," which focuses solely on incorrect responses to tailor training data, and "Learning from Error by Contrast", which uses contrastive learning to analyze both correct and incorrect responses for a deeper understanding of errors.
arXiv Detail & Related papers (2024-06-29T17:16:04Z)
Efficiency optimization of large-scale language models based on deep learning in natural language processing tasks [6.596361762662328]
Internal structure and operation mechanism of large-scale language models are analyzed theoretically. We evaluate the contribution of adaptive optimization algorithms (such as AdamW), massively parallel computing techniques, and mixed precision training strategies.
arXiv Detail & Related papers (2024-05-20T00:10:00Z)
Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver. We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problem. We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
Multiplicative update rules for accelerating deep learning training and increasing robustness [69.90473612073767]
We propose an optimization framework that fits to a wide range of machine learning algorithms and enables one to apply alternative update rules. We claim that the proposed framework accelerates training, while leading to more robust models in contrast to traditionally used additive update rule.
arXiv Detail & Related papers (2023-07-14T06:44:43Z)
Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization. First, we curate a dataset of performance-improving edits made by human programmers of over 77,000 competitive C++ programming submission pairs. For prompting, we propose retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z)
VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatiles. We train an ingest for deep learning which is itself a small neural network that ingests and outputs parameter updates. We open source our learned, meta-training code, the associated train test data, and an extensive benchmark suite with baselines at velo-code.io.
arXiv Detail & Related papers (2022-11-17T18:39:07Z)
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning. During inference, we introduce a new generation procedure with a critical sampling strategy. For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
RLFlow: Optimising Neural Network Subgraph Transformation with World Models [0.0]
We propose a model-based agent which learns to optimise the architecture of neural networks by performing a sequence of subgraph transformations to reduce model runtime. We show our approach can match the performance of state of the art on common convolutional networks and outperform those by up to 5% on transformer-style architectures.
arXiv Detail & Related papers (2022-05-03T11:52:54Z)
Precise Learning of Source Code Contextual Semantics via Hierarchical Dependence Structure and Graph Attention Networks [28.212889828892664]
We propose a novel source code model embedded with hierarchical dependencies. We introduce the syntactic structural of the basic block, i.e., its corresponding AST, in source code model to provide sufficient information. The results show that our model reduces the scale of parameters by 50% and achieves 4% improvement on accuracy on program classification task.
arXiv Detail & Related papers (2021-11-20T04:03:42Z)
Few-shot Quality-Diversity Optimization [50.337225556491774]
Quality-Diversity (QD) optimization has been shown to be effective tools in dealing with deceptive minima and sparse rewards in Reinforcement Learning. We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population, which when used to initialize QD methods in unseen environments, allows for few-shot adaptation. Experiments carried in both sparse and dense reward settings using robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations that are required for QD optimization in these environments.
arXiv Detail & Related papers (2021-09-14T17:12:20Z)
Learning to Optimize: A Primer and A Benchmark [94.29436694770953]
Learning to optimize (L2O) is an emerging approach that leverages machine learning to develop optimization methods. This article is poised to be the first comprehensive survey and benchmark of L2O for continuous optimization.
arXiv Detail & Related papers (2021-03-23T20:46:20Z)
MLComp: A Methodology for Machine Learning-based Performance Estimation and Adaptive Selection of Pareto-Optimal Compiler Optimization Sequences [10.200899224740871]
We propose a novel Reinforcement Learning-based policy methodology for embedded software optimization. We show that different Machine Learning models are automatically tested to choose the best-fitting one. We also show that our framework can be trained efficiently for any target platform and application domain.
arXiv Detail & Related papers (2020-12-09T19:13:39Z)
A Learned Performance Model for Tensor Processing Units [5.733911161090224]
We demonstrate a method of learning performance models from a corpus of graph programs for Processing Unit (TPU) instances. We show that our learned model outperforms a heavily-optimized analytical performance model on two tasks. It helps an autotuner discover faster programs in a setting where access to TPUs is limited or expensive.
arXiv Detail & Related papers (2020-08-03T17:24:52Z)
Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL. We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.