Related papers: Intent-Driven Storage Systems: From Low-Level Tuning to High-Level Understanding

Intent-Driven Storage Systems: From Low-Level Tuning to High-Level Understanding

URL: http://arxiv.org/abs/2510.15917v1
Date: Mon, 29 Sep 2025 12:07:40 GMT
Title: Intent-Driven Storage Systems: From Low-Level Tuning to High-Level Understanding
Authors: Shai Bergman, Won Wook Song, Lukas Cavigelli, Konstantin Berestizshevsky, Ke Zhou, Ji Zhang,
Abstract summary: Existing storage systems lack visibility into workload intent, limiting their ability to adapt to modern, large-scale applications.<n>We propose Intent-Driven Storage Systems (IDSS), a vision for a new paradigm where large language models (LLMs) infer workload and system intent from unstructured signals.<n>IDSS provides holistic reasoning for competing demands, synthesizing safe and efficient decisions within policy guardrails.
Score: 9.203282718014021
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Existing storage systems lack visibility into workload intent, limiting their ability to adapt to the semantics of modern, large-scale data-intensive applications. This disconnect leads to brittle heuristics and fragmented, siloed optimizations. To address these limitations, we propose Intent-Driven Storage Systems (IDSS), a vision for a new paradigm where large language models (LLMs) infer workload and system intent from unstructured signals to guide adaptive and cross-layer parameter reconfiguration. IDSS provides holistic reasoning for competing demands, synthesizing safe and efficient decisions within policy guardrails. We present four design principles for integrating LLMs into storage control loops and propose a corresponding system architecture. Initial results on FileBench workloads show that IDSS can improve IOPS by up to 2.45X by interpreting intent and generating actionable configurations for storage components such as caching and prefetching. These findings suggest that, when constrained by guardrails and embedded within structured workflows, LLMs can function as high-level semantic optimizers, bridging the gap between application goals and low-level system control. IDSS points toward a future in which storage systems are increasingly adaptive, autonomous, and aligned with dynamic workload demands.

Related papers

Choosing How to Remember: Adaptive Memory Structures for LLM Agents [43.27579458682491]
Memory is critical for enabling large language model (LLM) based agents to maintain coherent behavior over long-horizon interactions.<n>We propose a unified framework, FluxMem, that enables adaptive memory organization for LLM agents.<n> Experiments on two long-horizon benchmarks, PERSONAMEM and LoCoMo, demonstrate that our method achieves average improvements of 9.18% and 6.14%.
arXiv Detail & Related papers (2026-02-15T07:56:24Z)
ChunkWise LoRA: Adaptive Sequence Partitioning for Memory-Efficient Low-Rank Adaptation and Accelerated LLM Inference [0.21064685964744576]
ChunkWise LoRA partitions sequences into variable-length chunks based on token complexity and assigns each chunk a tailored low-rank configuration.<n>Experiments on benchmark datasets such as Wikitext-103 and SQuAD demonstrate that ChunkWise LoRA achieves up to 34% lower latency and 38% memory reduction.
arXiv Detail & Related papers (2026-01-28T22:58:28Z)
Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents [57.38404718635204]
Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows.<n>Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components.<n>We propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy.
arXiv Detail & Related papers (2026-01-05T08:24:16Z)
MemLoRA: Distilling Expert Adapters for On-Device Memory Systems [71.32550994522738]
Memory-augmented Large Language Models (LLMs) have demonstrated remarkable consistency during dialogues.<n>MemLoRA is a novel memory system that integrates small Vision-Language Models.<n>VLM-integrated MemLoRA-V shows massive improvements in caption-based approaches.
arXiv Detail & Related papers (2025-12-04T12:56:30Z)
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension [55.29309306566238]
Current Large Language Models (LLMs) are confronted with overwhelming information volume when comprehending long-form documents.<n>This challenge raises the imperative of a cohesive memory module, which can elevate vanilla LLMs into autonomous reading agents.<n>We draw inspiration from Jean Piaget's Constructivist Theory, illuminating three traits of the agentic memory -- structured schemata, flexible assimilation, and dynamic accommodation.
arXiv Detail & Related papers (2025-10-07T02:16:30Z)
MemOS: A Memory OS for AI System [116.87568350346537]
Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI)<n>Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods.<n>MemOS is a memory operating system that treats memory as a manageable system resource.
arXiv Detail & Related papers (2025-07-04T17:21:46Z)
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z)
Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving [22.66354939370058]
Apt-Serve is a framework designed to enhance effective throughput in large language model (LLM) inference serving systems.<n>A new hybrid cache scheme combines KV cache with a memory-efficient hidden cache for reusable input hidden state vectors, allowing large batch sizes and improving request.<n>We show that Apt-Serve achieves up to 8.8x improvement in effective throughput compared to the state-of-the-art inference serving systems.
arXiv Detail & Related papers (2025-04-10T06:51:23Z)
AdaServe: Accelerating Multi-SLO LLM Serving with SLO-Customized Speculative Decoding [12.106234303559571]
We present AdaServe, the first serving system designed to support efficient multi-SLO serving through SLO-customized speculative decoding.<n>AdaServe formulates multi-SLO serving as a constrained optimization problem and introduces a hardware-aware algorithm.<n>It features a speculate-select-verify pipeline that enables fine-grained control over decoding speed while maximizing system throughput.
arXiv Detail & Related papers (2025-01-21T14:15:01Z)
Dynamic Optimization of Storage Systems Using Reinforcement Learning Techniques [32.74305371783441]
This paper introduces RL-Storage, a novel reinforcement learning (RL)-based framework designed to dynamically optimize storage system configurations.<n>By autonomously adapting to workload variations in real time, RL-Storage provides a robust and scalable solution for optimizing storage performance.
arXiv Detail & Related papers (2024-12-29T17:41:40Z)
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design [59.00758127310582]
We propose a novel framework Read-ME that transforms pre-trained dense LLMs into smaller MoE models. Our approach employs activation sparsity to extract experts. Read-ME outperforms other popular open-source dense models of similar scales.
arXiv Detail & Related papers (2024-10-24T19:48:51Z)
LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation [52.55639178180821]
The study on multi-scenario recommendation (MSR) has attracted much attention, which uses the data from all scenarios to simultaneously improve their recommendation performance.<n>Existing methods tend to integrate insufficient scenario knowledge and neglect learning personalized cross-scenario preferences, thus leading to sub-optimal performance.<n>We propose a large language model (LLM)-enhanced paradigm LLM4MSR to fill these gaps.
arXiv Detail & Related papers (2024-06-18T11:59:36Z)
KML: Using Machine Learning to Improve Storage Systems [0.2810625954925814]
Machine learning techniques promise to learn patterns, generalize from them, and enable optimal solutions. We develop a prototype KML architecture and apply it to two problems: optimal read and read-size values. Experiments show that KML consumes little OS resources, adds negligible latency, and yet can learn patterns that can improve I/O throughput by as much as 2.3x or 15x.
arXiv Detail & Related papers (2021-11-22T21:59:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.