Dynamic Adaptation in Data Storage: Real-Time Machine Learning for Enhanced Prefetching
- URL: http://arxiv.org/abs/2501.14771v2
- Date: Tue, 28 Jan 2025 20:33:43 GMT
- Title: Dynamic Adaptation in Data Storage: Real-Time Machine Learning for Enhanced Prefetching
- Authors: Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao
- Abstract summary: This study explores the application of streaming machine learning to revolutionize data prefetching within multi-tiered storage systems.
Unlike traditional batch-trained models, streaming machine learning offers adaptability, real-time insights, and computational efficiency.
- Score: 40.13303683102544
- Abstract: The exponential growth of data storage demands has necessitated the evolution of hierarchical storage management strategies. This study explores the application of streaming machine learning to revolutionize data prefetching within multi-tiered storage systems. Unlike traditional batch-trained models, streaming machine learning offers adaptability, real-time insights, and computational efficiency, responding dynamically to workload variations. This work designs and validates an innovative framework that integrates streaming classification models for predicting file access patterns, specifically the next file offset. Leveraging comprehensive feature engineering and real-time evaluation over extensive production traces, the proposed methodology achieves substantial improvements in prediction accuracy, memory efficiency, and system adaptability. The results underscore the potential of streaming models in real-time storage management, setting a precedent for advanced caching and tiering strategies.
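To make the core mechanism concrete, here is a minimal sketch of streaming next-offset classification with prequential (test-then-train) evaluation. It is not the authors' implementation: it assumes the open-source `river` library, a synthetic mostly-sequential access trace, and a hypothetical 4 KiB offset bucketing; the feature names are stand-ins for the paper's feature engineering.

```python
# Minimal sketch of streaming next-offset prediction (not the paper's code).
# Assumes: `pip install river`; a synthetic trace; hypothetical features.
import random

from river import metrics, tree

BUCKET = 4096  # hypothetical bucketing granularity: one class per 4 KiB offset bucket

model = tree.HoeffdingTreeClassifier()  # incremental, single-pass learner
accuracy = metrics.Accuracy()

def synthetic_trace(n=10_000):
    """Mostly-sequential reads with occasional random jumps."""
    offset = 0
    for _ in range(n):
        if random.random() < 0.8:
            offset += BUCKET
        else:
            offset = random.randrange(0, 1 << 20, BUCKET)
        yield offset

prev, delta = 0, 0
for offset in synthetic_trace():
    x = {"last_bucket": prev // BUCKET, "last_delta": delta}
    y = offset // BUCKET                  # label: which bucket is accessed next
    y_pred = model.predict_one(x)         # predict first (prequential evaluation) ...
    if y_pred is not None:
        accuracy.update(y, y_pred)
    model.learn_one(x, y)                 # ... then update on the true label
    delta, prev = offset - prev, offset

print(f"prequential accuracy: {accuracy.get():.3f}")
```

The test-then-train loop mirrors the real-time setting described in the abstract: each access is predicted before it is used for a model update, so accuracy is always measured on data the model has not yet seen.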
Related papers
- Dynamic Optimization of Storage Systems Using Reinforcement Learning Techniques [40.13303683102544]
This paper introduces RL-Storage, a reinforcement learning-based framework designed to dynamically optimize storage system configurations.
RL-Storage learns from real-time I/O patterns and predicts optimal storage parameters, such as cache size, queue depths, and readahead settings.
It achieves throughput gains of up to 2.6x and latency reductions of 43% compared to baselines.
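As a loose illustration of the idea (not RL-Storage itself), a tabular Q-learning agent can tune one such parameter, the readahead size, from a hit-rate-style reward; the state space, action space, and reward function below are all hypothetical.

```python
# Hedged sketch: tabular Q-learning tuning a readahead setting (illustrative only;
# RL-Storage's actual state, action, and reward design is described in the paper).
import random
from collections import defaultdict

READAHEAD_KB = [64, 128, 256, 512]       # hypothetical action space
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1    # standard Q-learning hyperparameters

q = defaultdict(float)                   # Q[(state, action)] -> value estimate

def observe_hit_rate(state, action):
    """Stand-in for a real I/O measurement: sequential phases favor large readahead."""
    best = 512 if state == "sequential" else 64
    return 1.0 - abs(READAHEAD_KB[action] - best) / 512

state = "sequential"
for _ in range(5_000):
    # epsilon-greedy action selection over readahead sizes
    if random.random() < EPSILON:
        action = random.randrange(len(READAHEAD_KB))
    else:
        action = max(range(len(READAHEAD_KB)), key=lambda a: q[(state, a)])
    reward = observe_hit_rate(state, action)               # e.g. cache hit rate
    next_state = random.choice(["sequential", "random"])   # workload phase changes
    best_next = max(q[(next_state, a)] for a in range(len(READAHEAD_KB)))
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
    state = next_state

for s in ("sequential", "random"):
    a = max(range(len(READAHEAD_KB)), key=lambda x: q[(s, x)])
    print(s, "->", READAHEAD_KB[a], "KB readahead")
```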
arXiv Detail & Related papers (2024-12-29T17:41:40Z) - Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-horizon.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z) - Adjusting Pretrained Backbones for Performativity [34.390793811659556]
We propose a novel technique to adjust pretrained backbones for performativity in a modular way.
We show how it leads to smaller loss along the retraining trajectory and enables us to effectively select among candidate models to anticipate performance degradations.
arXiv Detail & Related papers (2024-10-06T14:41:13Z) - Reprogramming Foundational Large Language Models(LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning [0.0]
Spatio-temporal forecasting plays a crucial role in various sectors such as transportation systems, logistics, and supply chain management.
We introduce a hybrid approach that combines the strengths of open-source large and small-scale language models (LLMs and LMs) with traditional forecasting methods.
arXiv Detail & Related papers (2024-08-26T16:11:53Z) - Bullion: A Column Store for Machine Learning [4.096087402737292]
This paper presents Bullion, a columnar storage system tailored for machine learning workloads.
Bullion addresses the complexities of data compliance, optimizes the encoding of long-sequence sparse features, efficiently manages wide-table projections, introduces feature quantization in storage, and provides a comprehensive cascading encoding framework.
Preliminary experimental results and theoretical analysis demonstrate Bullion's improved ability to deliver strong performance in the face of the unique demands of machine learning workloads.
arXiv Detail & Related papers (2024-04-13T05:01:54Z) - Enhanced LFTSformer: A Novel Long-Term Financial Time Series Prediction Model Using Advanced Feature Engineering and the DS Encoder Informer Architecture [0.8532753451809455]
This study presents a groundbreaking model for forecasting long-term financial time series, termed the Enhanced LFTSformer.
The model distinguishes itself through several significant innovations.
Systematic experimentation on a range of benchmark stock market datasets demonstrates that the Enhanced LFTSformer outperforms traditional machine learning models.
arXiv Detail & Related papers (2023-10-03T08:37:21Z) - TranDRL: A Transformer-Driven Deep Reinforcement Learning Enabled Prescriptive Maintenance Framework [58.474610046294856]
Industrial systems demand reliable predictive maintenance strategies to enhance operational efficiency and reduce downtime.
This paper introduces an integrated framework that leverages the capabilities of the Transformer model-based neural networks and deep reinforcement learning (DRL) algorithms to optimize system maintenance actions.
arXiv Detail & Related papers (2023-09-29T02:27:54Z) - A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
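A generic sketch of this exemplar-replay recipe is shown below. It is a deliberately simplified assumption (random exemplar selection rather than herding, and none of the paper's transformer-based retrieval):

```python
# Generic exemplar-replay sketch for class-incremental learning (illustrative;
# the Memory Transformer Network adds a learned retrieval mechanism on top).
import random
from collections import defaultdict

MEM_PER_CLASS = 20                     # hypothetical memory budget per class

memory = defaultdict(list)             # class label -> stored exemplars

def update_memory(task_data):
    """Keep a random subset of each new class (real methods often use herding)."""
    by_class = defaultdict(list)
    for x, y in task_data:
        by_class[y].append(x)
    for y, xs in by_class.items():
        memory[y] = random.sample(xs, min(MEM_PER_CLASS, len(xs)))

def replay_batch(task_batch):
    """Mix current-task samples with stored exemplars to curb forgetting."""
    replay = [(x, y) for y, xs in memory.items() for x in xs]
    return task_batch + random.sample(replay, min(len(task_batch), len(replay)))

# Toy usage: two sequential "tasks" with disjoint classes.
task1 = [(random.random(), 0) for _ in range(100)] + [(random.random(), 1) for _ in range(100)]
task2 = [(random.random(), 2) for _ in range(100)]
update_memory(task1)
batch = replay_batch(task2[:16])       # a task-2 batch augmented with task-1 exemplars
print(len(batch), "samples; classes present:", sorted({y for _, y in batch}))
```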
arXiv Detail & Related papers (2022-10-10T08:27:28Z) - Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
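This freeze-and-augment pattern is easy to express in a modern framework. The following PyTorch sketch is illustrative only, with made-up layer sizes, and is not the paper's bilinear architecture:

```python
# Hedged sketch of "freeze the pretrained network, train only augmented connections"
# (illustrative; the paper's augmented bilinear architecture differs in detail).
import torch
import torch.nn as nn

pretrained = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 8))
for p in pretrained.parameters():
    p.requires_grad_(False)            # prior knowledge stays fixed

augment = nn.Linear(32, 8)             # new connections for the new securities

def forward(x):
    return pretrained(x) + augment(x)  # frozen path plus trainable augmented path

opt = torch.optim.Adam(augment.parameters(), lr=1e-3)  # only new weights update
x, y = torch.randn(16, 32), torch.randint(0, 8, (16,))
opt.zero_grad()
loss = nn.functional.cross_entropy(forward(x), y)
loss.backward()
opt.step()
print("trainable params:", sum(p.numel() for p in augment.parameters()))
```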
arXiv Detail & Related papers (2022-07-23T18:54:10Z) - Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results, but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
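The drift-handling step can be sketched with an online change detector. The example below assumes `river`'s ADWIN detector and its `drift_detected` flag (present in recent releases), fed with a synthetic per-example error stream:

```python
# Minimal concept-drift detection sketch (assumes `pip install river`;
# recent river releases expose the `drift_detected` flag used below).
import random

from river import drift

detector = drift.ADWIN()

# Synthetic stream of per-example model errors: the error rate jumps at i=1000,
# emulating concept drift that an AutoML pipeline would need to react to.
for i in range(2000):
    error = 1 if random.random() < (0.1 if i < 1000 else 0.5) else 0
    detector.update(error)
    if detector.drift_detected:
        print(f"drift detected at example {i}: retrain or swap the model")
```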
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.