ML-Powered Index Tuning: An Overview of Recent Progress and Open
Challenges
- URL: http://arxiv.org/abs/2308.13641v1
- Date: Fri, 25 Aug 2023 19:20:28 GMT
- Title: ML-Powered Index Tuning: An Overview of Recent Progress and Open
Challenges
- Authors: Tarique Siddiqui, Wentao Wu
- Abstract summary: The scale and complexity of workloads in modern cloud services have brought into sharper focus a critical challenge in automated index tuning.
This paper directs attention to these challenges within automated index tuning and explores ways in which machine learning (ML) techniques provide new opportunities in their mitigation.
- Score: 5.675806178685878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The scale and complexity of workloads in modern cloud services have brought
into sharper focus a critical challenge in automated index tuning -- the need
to recommend high-quality indexes while maintaining index tuning scalability.
This challenge is further compounded by the requirement for automated index
implementations to introduce minimal query performance regressions in
production deployments, representing a significant barrier to achieving
scalability and full automation. This paper directs attention to these
challenges within automated index tuning and explores ways in which machine
learning (ML) techniques provide new opportunities in their mitigation. In
particular, we reflect on recent efforts in developing ML techniques for
workload selection, candidate index filtering, speeding up index configuration
search, reducing the amount of query optimizer calls, and lowering the chances
of performance regressions. We highlight the key takeaways from these efforts
and underline the gaps that need to be closed for their effective functioning
within the traditional index tuning framework. Additionally, we present a
preliminary cross-platform design aimed at democratizing index tuning across
multiple SQL-like systems -- an imperative in today's continuously expanding
data system landscape. We believe our findings will help provide context and
impetus to the research and development efforts in automated index tuning.
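To make the search problem concrete, the sketch below shows the classical greedy index-configuration search that the surveyed ML techniques aim to accelerate: each candidate configuration is scored by summing estimated query costs, which in a real tuner come from "what-if" query optimizer calls. The workload, index names, and toy cost model here are hypothetical stand-ins, not the paper's implementation.

```python
# Hypothetical sketch of greedy index-configuration search.
# `cost(query, config)` stands in for a "what-if" optimizer call that
# estimates a query's cost under a given index configuration.

def greedy_index_search(workload, candidates, cost, budget):
    """Greedily pick up to `budget` indexes that minimize total workload cost."""
    config = frozenset()
    best_total = sum(cost(q, config) for q in workload)
    for _ in range(budget):
        best_choice = None
        for idx in candidates - config:
            trial = config | {idx}
            # One simulated optimizer call per (query, configuration) pair --
            # the expense that ML-based filtering and caching try to reduce.
            total = sum(cost(q, trial) for q in workload)
            if total < best_total:
                best_total, best_choice = total, trial
        if best_choice is None:  # no remaining index helps; stop early
            break
        config = best_choice
    return config, best_total


# Toy workload: each query lists the (hypothetical) indexes that benefit it.
WORKLOAD = [
    {"speedups": {"ix_orders_date"}, "base": 10.0},
    {"speedups": {"ix_orders_date", "ix_cust_region"}, "base": 8.0},
    {"speedups": {"ix_items_sku"}, "base": 5.0},
]
CANDIDATES = {"ix_orders_date", "ix_cust_region", "ix_items_sku"}

def toy_cost(query, config):
    # Each relevant index present in the configuration halves the query cost.
    c = query["base"]
    for ix in query["speedups"] & config:
        c /= 2
    return c

config, total = greedy_index_search(WORKLOAD, CANDIDATES, toy_cost, budget=2)
print(sorted(config), total)  # → ['ix_items_sku', 'ix_orders_date'] 11.5
```

Note that the greedy loop issues one cost estimate per candidate per query per round; this quadratic blow-up in optimizer calls is exactly where the ML techniques surveyed above (workload selection, candidate filtering, optimizer-call reduction) intervene.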
Related papers
- A New Paradigm in Tuning Learned Indexes: A Reinforcement Learning Enhanced Approach [6.454589614577438]
This paper introduces LITune, a novel framework for end-to-end automatic tuning of Learned Index Structures.
LITune employs an adaptive training pipeline equipped with a tailor-made Deep Reinforcement Learning (DRL) approach to ensure stable and efficient tuning.
Our experimental results demonstrate that LITune achieves up to a 98% reduction in runtime and a 17-fold increase in throughput.
arXiv Detail & Related papers (2025-02-07T15:22:15Z)
- Real-time Indexing for Large-scale Recommendation by Streaming Vector Quantization Retriever [17.156348053402766]
Streaming Vector Quantization model is a new generation of retrieval paradigm.
Streaming VQ attaches indexes to items in real time, granting it immediacy.
As a lightweight and implementation-friendly architecture, streaming VQ has been deployed and replaced all major retrievers in Douyin and Douyin Lite.
arXiv Detail & Related papers (2025-01-15T10:09:15Z)
- Unsupervised Query Routing for Retrieval Augmented Generation [64.47987041500966]
We introduce a novel unsupervised method that constructs the "upper-bound" response to evaluate the quality of retrieval-augmented responses.
This evaluation enables selecting the most suitable search engine for a given query.
By eliminating manual annotations, our approach can automatically process large-scale real user queries and create training data.
arXiv Detail & Related papers (2025-01-14T02:27:06Z)
- UpLIF: An Updatable Self-Tuning Learned Index Framework [4.077820670802213]
UpLIF is an adaptive self-tuning learned index that adjusts the model to accommodate incoming updates.
We also introduce the concept of balanced model adjustment, which determines the model's inherent properties.
arXiv Detail & Related papers (2024-08-07T22:30:43Z)
- State-Space Modeling in Long Sequence Processing: A Survey on Recurrence in the Transformer Era [59.279784235147254]
This survey provides an in-depth summary of the latest approaches that are based on recurrent models for sequential data processing.
The emerging picture suggests that there is room for thinking of novel routes, constituted by learning algorithms which depart from the standard Backpropagation Through Time.
arXiv Detail & Related papers (2024-06-13T12:51:22Z)
- Efficient Architecture Search via Bi-level Data Pruning [70.29970746807882]
This work presents the first exploration of the critical role of dataset characteristics in DARTS bi-level optimization.
We introduce a new progressive data pruning strategy that utilizes supernet prediction dynamics as the metric.
Comprehensive evaluations on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like search space validate that BDP reduces search costs by over 50%.
arXiv Detail & Related papers (2023-12-21T02:48:44Z)
- AutoML for Large Capacity Modeling of Meta's Ranking Systems [29.717756064694278]
We present a sampling-based AutoML method for building large capacity models.
We show that our method achieves outstanding Return on Investment (ROI) versus human tuned baselines.
The proposed AutoML method has already made real-world impact: a discovered Instagram CTR model with up to -0.36% NE gain was selected for a large-scale online A/B test and showed a statistically significant gain.
arXiv Detail & Related papers (2023-11-14T03:00:50Z)
- Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data [10.740391800262685]
Feature preprocessing is a crucial step to ensure good model quality.
Due to the large search space, a brute-force solution is prohibitively expensive.
We extend a variety of HPO and NAS algorithms to solve the Auto-FP problem.
arXiv Detail & Related papers (2023-10-04T02:46:44Z)
- Towards General and Efficient Online Tuning for Spark [55.30868031221838]
We present a general and efficient Spark tuning framework that can deal with the three issues simultaneously.
We have implemented this framework as an independent cloud service, and applied it to the data platform in Tencent.
arXiv Detail & Related papers (2023-09-05T02:16:45Z)
- How Does Generative Retrieval Scale to Millions of Passages? [68.98628807288972]
We conduct the first empirical study of generative retrieval techniques across various corpus scales.
We scale generative retrieval to millions of passages with a corpus of 8.8M passages, evaluating model sizes up to 11B parameters.
While generative retrieval is competitive with state-of-the-art dual encoders on small corpora, scaling to millions of passages remains an important and unsolved challenge.
arXiv Detail & Related papers (2023-05-19T17:33:38Z)
- Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results, but in the presence of concept drift, detection or adaptation techniques must be applied to maintain predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.