A Unified Transferable Model for ML-Enhanced DBMS
- URL: http://arxiv.org/abs/2105.02418v1
- Date: Thu, 6 May 2021 03:31:32 GMT
- Title: A Unified Transferable Model for ML-Enhanced DBMS
- Authors: Ziniu Wu, Peilun Yang, Pei Yu, Rong Zhu, Yuxing Han, Yaliang Li, Defu Lian, Kai Zeng, Jingren Zhou
- Abstract summary: We propose a unified model, MTMLF, that uses a multi-task training procedure to capture the transferable knowledge across tasks and a pretrain-finetune procedure to distill the meta knowledge across DBs.
We believe this paradigm is more suitable for cloud DB services and has the potential to revolutionize how ML is used in the future.
- Score: 53.46830627879208
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, the database management system (DBMS) community has witnessed the
power of machine learning (ML) solutions for DBMS tasks. Despite their
promising performance, these existing solutions can hardly be considered
satisfactory. First, these ML-based methods in DBMS are not effective enough
because they are optimized on each specific task, and cannot explore or
understand the intrinsic connections between tasks. Second, the training
process has serious limitations that hinder their practicality, because they
need to retrain the entire model from scratch for a new DB. Moreover, for each
retraining, they require an excessive amount of training data, which is very
expensive to acquire and unavailable for a new DB. We propose to explore the
transferabilities of the ML methods both across tasks and across DBs to tackle
these fundamental drawbacks.
In this paper, we propose a unified model, MTMLF, that uses a multi-task
training procedure to capture the transferable knowledge across tasks and a
pretrain-finetune procedure to distill the transferable meta knowledge across
DBs. We believe this paradigm is more suitable for cloud DB services and has
the potential to revolutionize how ML is used in DBMS. Furthermore, to
demonstrate the predicting power and viability of MTMLF, we provide a concrete
and very promising case study on query optimization tasks. Last but not least,
we discuss several concrete research opportunities along this line of work.
Related papers
- Is Large Language Model Good at Database Knob Tuning? A Comprehensive Experimental Evaluation [28.753219581544617]
This study harnesses large language models (LLMs) as experienced DBAs for knob-tuning tasks with carefully designed prompts.
We conduct experiments to compare LLM-driven approaches against traditional methods across the subtasks.
Our findings reveal that LLMs not only match or surpass traditional methods but also exhibit notable interpretability.
arXiv Detail & Related papers (2024-08-05T03:26:01Z)
- Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z)
- UniDM: A Unified Framework for Data Manipulation with Large Language Models [66.61466011795798]
Large Language Models (LLMs) resolve multiple data manipulation tasks.
LLMs show clear performance benefits but still require customized designs to fit each specific task.
We propose UniDM, a unified framework which establishes a new paradigm to process data manipulation tasks.
arXiv Detail & Related papers (2024-05-10T14:44:04Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- A Unified and Efficient Coordinating Framework for Autonomous DBMS Tuning [34.85351481228439]
We propose a unified coordinating framework to efficiently utilize existing ML-based agents.
We show that it can effectively utilize different ML-based agents and find better configurations with 1.4~14.1X speedups on the workload execution time.
arXiv Detail & Related papers (2023-03-10T05:27:23Z)
- Improving Multi-task Learning via Seeking Task-based Flat Regions [38.28600737969538]
Multi-Task Learning (MTL) is a powerful learning paradigm for training deep neural networks that allows learning more than one objective by a single backbone.
There is an emerging line of work in MTL that focuses on manipulating the task gradient to derive an ultimate gradient descent direction.
We propose to leverage a recently introduced training method, named Sharpness-aware Minimization, which can enhance model generalization ability on single-task learning.
arXiv Detail & Related papers (2022-11-24T17:19:30Z)
- Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation [87.98063273826702]
We propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation.
A theoretical analysis is provided to prove the effectiveness of our method.
arXiv Detail & Related papers (2022-03-22T12:41:55Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)