OS-R1: Agentic Operating System Kernel Tuning with Reinforcement Learning
- URL: http://arxiv.org/abs/2508.12551v1
- Date: Mon, 18 Aug 2025 01:09:57 GMT
- Title: OS-R1: Agentic Operating System Kernel Tuning with Reinforcement Learning
- Authors: Hongyu Lin, Yuchen Li, Haoran Luo, Kaichun Yao, Libo Zhang, Mingjie Xing, Yanjun Wu,
- Abstract summary: This paper introduces OS-R1, an agentic Linux kernel tuning framework powered by rule-based reinforcement learning (RL)<n>By abstracting the kernel configuration space as an RL environment, OS-R1 facilitates efficient exploration by large language models (LLMs) and ensures accurate configuration modifications.<n> Experimental results show that OS-R1 significantly outperforms existing baseline methods, achieving up to 5.6% performance improvement over tuning and maintaining high data efficiency.
- Score: 32.81416809245337
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Linux kernel tuning is essential for optimizing operating system (OS) performance. However, existing methods often face challenges in terms of efficiency, scalability, and generalization. This paper introduces OS-R1, an agentic Linux kernel tuning framework powered by rule-based reinforcement learning (RL). By abstracting the kernel configuration space as an RL environment, OS-R1 facilitates efficient exploration by large language models (LLMs) and ensures accurate configuration modifications. Additionally, custom reward functions are designed to enhance reasoning standardization, configuration modification accuracy, and system performance awareness of the LLMs. Furthermore, we propose a two-phase training process that accelerates convergence and minimizes retraining across diverse tuning scenarios. Experimental results show that OS-R1 significantly outperforms existing baseline methods, achieving up to 5.6% performance improvement over heuristic tuning and maintaining high data efficiency. Notably, OS-R1 is adaptable across various real-world applications, demonstrating its potential for practical deployment in diverse environments. Our dataset and code are publicly available at https://github.com/LHY-24/OS-R1.
Related papers
- Real-Time Lane Detection via Efficient Feature Alignment and Covariance Optimization for Low-Power Embedded Systems [22.603468261037975]
Real-time lane detection in embedded systems faces significant challenges due to subtle and sparse visual signals in RGB images.<n>We propose an innovative Covariance Distribution Optimization (CDO) module specifically designed for efficient, real-time applications.<n>CDO module aligns lane feature distributions closely with ground-truth labels, significantly enhancing detection accuracy without increasing computational complexity.
arXiv Detail & Related papers (2026-01-05T00:06:06Z) - STARK: Strategic Team of Agents for Refining Kernels [23.717055490630596]
We introduce an agentic framework for GPU kernel optimization that explores the design space through multi-agent collaboration.<n>This framework mimics the workflow of expert engineers, enabling LLMs to reason about hardware trade-offs, incorporate profiling feedback, and refine kernels iteratively.<n>We evaluate our approach on KernelBench, a benchmark for LLM-based kernel optimization, and demonstrate substantial improvements over baseline agents.
arXiv Detail & Related papers (2025-10-19T20:41:46Z) - Fun-ASR Technical Report [89.84148151617022]
We present Fun-ASR, a large-scale, LLM-based ASR system that combines massive data, large model capacity, LLM integration, and reinforcement learning.<n>Fun-ASR is specifically optimized for practical deployment, with enhancements in streaming capability, noise robustness, code-switching, hotword customization, and satisfying other real-world application requirements.<n>Thanks to production-oriented optimizations, Fun-ASR achieves state-of-the-art performance on real application datasets, demonstrating its effectiveness and robustness in practical settings.
arXiv Detail & Related papers (2025-09-15T23:19:36Z) - LoRA-PAR: A Flexible Dual-System LoRA Partitioning Approach to Efficient LLM Fine-Tuning [4.105967217565736]
We propose a dual-system LoRA framework that partitions both data and parameters by System 1 or System 2 demands.<n>Specifically, we classify task data via multi-model role-playing and voting, and partition parameters based on importance scoring.<n>Experiments show that the two-stage fine-tuning strategy, SFT and RL, lowers active parameter usage while matching or surpassing SOTA PEFT baselines.
arXiv Detail & Related papers (2025-07-28T17:11:26Z) - Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs [51.21041884010009]
Ring-lite is a Mixture-of-Experts (MoE)-based large language model optimized via reinforcement learning (RL)<n>Our approach matches the performance of state-of-the-art (SOTA) small-scale reasoning models on challenging benchmarks.
arXiv Detail & Related papers (2025-06-17T17:12:34Z) - KARE-RAG: Knowledge-Aware Refinement and Enhancement for RAG [63.82127103851471]
Retrieval-Augmented Generation (RAG) enables large language models to access broader knowledge sources.<n>We demonstrate that enhancing generative models' capacity to process noisy content is equally critical for robust performance.<n>We present KARE-RAG, which improves knowledge utilization through three key innovations.
arXiv Detail & Related papers (2025-06-03T06:31:17Z) - Effective Inference-Free Retrieval for Learned Sparse Representations [19.54810957623511]
Learned Sparse Retrieval (LSR) is an effective IR approach that exploits pre-trained language models for encoding text into a learned bag of words.<n>Recently, new efficient -- inverted index-based -- retrieval engines have been proposed, leading to a natural question: has the role of regularization changed in training LSR models?<n>We show that regularization can be relaxed to produce more effective LSR encoders.
arXiv Detail & Related papers (2025-04-30T09:10:46Z) - Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning? [45.58422897857411]
This work explores the use of large language models (LLMs) for hyperparameter optimization by fine-tuning a parameter-efficient version of Code Llama using LoRA.<n>Our approach achieves competitive or superior Root Mean Square Error (RMSE) while substantially reducing computational overhead.<n>Results demonstrate that LLM-based optimization not only rivals established Bayesian methods like Tree-structured Parzen Estimators (TPE) but also accelerates tuning for real-world applications requiring perceptual quality and low-latency processing.
arXiv Detail & Related papers (2025-04-08T13:15:47Z) - SortingEnv: An Extendable RL-Environment for an Industrial Sorting Process [0.0]
We present a novel reinforcement learning (RL) environment designed to both optimize industrial sorting systems and study agent behavior in evolving spaces.<n>In simulating material flow within a sorting process our environment follows the idea of a digital twin, with operational parameters like belt speed and occupancy level.<n>It thus includes two variants: a basic version focusing on discrete belt speed adjustments and an advanced version introducing multiple sorting modes and enhanced material composition observations.
arXiv Detail & Related papers (2025-03-13T15:38:25Z) - BYOS: Knowledge-driven Large Language Models Bring Your Own Operating System More Excellent [32.81416809245337]
kernel tuning involves systematically adjusting kernel configurations to optimize system performance.<n>Despite recent advancements in large language models (LLMs), kernel tuning remains a critical challenge.<n>We propose BYOS, a framework that automates a LLM-powered framework for kernel tuning.
arXiv Detail & Related papers (2025-03-12T15:50:16Z) - Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation [46.5310645609264]
We propose a Meta-learning and Markov Chain Monte Carlo based SISR approach to learn kernel priors from organized randomness.
A lightweight network is adopted as kernel generator, and is optimized via learning from the MCMC simulation on random Gaussian distributions.
A meta-learning-based alternating optimization procedure is proposed to optimize the kernel generator and image restorer.
arXiv Detail & Related papers (2024-06-13T07:50:15Z) - RA-DIT: Retrieval-Augmented Dual Instruction Tuning [90.98423540361946]
Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge from external data stores.
Existing approaches require either expensive retrieval-specific modifications to LM pre-training or use post-hoc integration of the data store that leads to suboptimal performance.
We introduce Retrieval-Augmented Dual Instruction Tuning (RA-DIT), a lightweight fine-tuning methodology that provides a third option.
arXiv Detail & Related papers (2023-10-02T17:16:26Z) - Integrate Lattice-Free MMI into End-to-End Speech Recognition [87.01137882072322]
In automatic speech recognition (ASR) research, discriminative criteria have achieved superior performance in DNN-HMM systems.
With this motivation, the adoption of discriminative criteria is promising to boost the performance of end-to-end (E2E) ASR systems.
Previous works have introduced the minimum Bayesian risk (MBR, one of the discriminative criteria) into E2E ASR systems.
In this work, novel algorithms are proposed in this work to integrate another widely used discriminative criterion, lattice-free maximum mutual information (LF-MMI) into E2E
arXiv Detail & Related papers (2022-03-29T14:32:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.