Seamless Data Migration between Database Schemas with DAMI-Framework: An Empirical Study on Developer Experience
- URL: http://arxiv.org/abs/2504.17662v1
- Date: Thu, 24 Apr 2025 15:30:28 GMT
- Title: Seamless Data Migration between Database Schemas with DAMI-Framework: An Empirical Study on Developer Experience
- Authors: Delfina Ramos-Vidal, Alejandro Cortiñas, Miguel R. Luaces, Oscar Pedreira, Ángeles Saavedra Places, Wesley K. G. Assunção
- Abstract summary: Many businesses depend on legacy systems, which often use outdated technology that complicates maintenance and updates. Data migration between different database schemas is an error-prone and cognitively demanding task. Our objective is to alleviate developers' workloads through our DAMI-Framework.
- Score: 38.860468003121404
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Many businesses depend on legacy systems, which often use outdated technology that complicates maintenance and updates. Therefore, software modernization is essential, particularly data migration between different database schemas. Established methodologies, like model transformation and ETL tools, facilitate this migration, but they require deep knowledge of database languages and of both the source and target schemas. This necessity renders data migration an error-prone and cognitively demanding task. Our objective is to alleviate developers' workloads during schema evolution through our DAMI-Framework. This framework incorporates a domain-specific language (DSL) and a parser to facilitate data migration between database schemas. DAMI-DSL simplifies schema mapping while the parser automates SQL script generation. We conduct an empirical evaluation with 21 developers to compare their experiences using our DSL versus traditional SQL. The study allows us to measure their perceptions of the DSL properties and user experience. The participants praised DAMI-DSL for its readability and ease of use. The findings indicate that our DSL reduces data migration efforts compared to SQL scripts.
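The abstract describes a pipeline in which a declarative schema mapping is turned into a SQL migration script. The following is a minimal Python sketch of that general idea only. It does not reproduce DAMI-DSL syntax or the DAMI parser, neither of which is shown on this page. The names `ColumnMapping`, `generate_migration_sql`, and `demo`, as well as the `legacy_customer` and `customer` tables, are hypothetical and chosen purely for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One hypothetical mapping rule: a source expression feeds a target column."""
    source: str  # column name or SQL expression over the source table
    target: str  # column name in the target table


def generate_migration_sql(source_table, target_table, mappings):
    """Build a single INSERT ... SELECT statement from declarative mappings.

    Identifiers are interpolated as-is, so this sketch assumes the mapping
    comes from a trusted author, not from end-user input.
    """
    if not mappings:
        raise ValueError("at least one column mapping is required")
    target_cols = ", ".join(m.target for m in mappings)
    source_exprs = ", ".join(m.source for m in mappings)
    return (
        f"INSERT INTO {target_table} ({target_cols})\n"
        f"SELECT {source_exprs}\n"
        f"FROM {source_table};"
    )


def demo():
    """Run the generated script against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE legacy_customer (cust_name TEXT, country TEXT)")
    conn.execute("CREATE TABLE customer (full_name TEXT, country_code TEXT)")
    conn.execute("INSERT INTO legacy_customer VALUES ('Ada Lovelace', 'uk')")
    sql = generate_migration_sql(
        "legacy_customer",
        "customer",
        [
            ColumnMapping("cust_name", "full_name"),
            ColumnMapping("UPPER(country)", "country_code"),
        ],
    )
    conn.execute(sql)
    rows = conn.execute("SELECT full_name, country_code FROM customer").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

In this sketch the developer states only which source expression feeds which target column, and the SQL text is derived mechanically. That is the kind of effort reduction the abstract attributes to a mapping DSL, though the real framework presumably covers far more than single-table copies.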
Related papers
- SchemaAgent: A Multi-Agents Framework for Generating Relational Database Schema [35.57815867567431]
Existing efforts are mostly based on customized rules or conventional deep learning models, often producing relational schema.
We propose a unified LLM-based multi-agent framework for the automated generation of high-quality database schema.
We incorporate dedicated roles for reflection and inspection, alongside an innovative error detection and correction mechanism to identify and rectify issues across various phases.
arXiv Detail & Related papers (2025-03-31T09:39:19Z) - Towards Human-Guided, Data-Centric LLM Co-Pilots [53.35493881390917]
CliMB-DC is a human-guided, data-centric framework for machine learning co-pilots.
It combines advanced data-centric tools with LLM-driven reasoning to enable robust, context-aware data processing.
We show how CliMB-DC can transform uncurated datasets into ML-ready formats.
arXiv Detail & Related papers (2025-01-17T17:51:22Z) - Relational Database Augmented Large Language Model [59.38841050766026]
Large language models (LLMs) excel in many natural language processing (NLP) tasks.
They can only incorporate new knowledge through training or supervised fine-tuning processes.
This precise, up-to-date, and private information is typically stored in relational databases.
arXiv Detail & Related papers (2024-07-21T06:19:10Z) - Example-Based Automatic Migration of Continuous Integration Systems [2.2836654317217326]
Continuous Integration (CI) is a widely adopted practice for faster code change integration and testing.
Developers often migrate between CI systems in pursuit of features like matrix building or better logging.
This migration is effort intensive and error-prone owing to limited knowledge of the new CI system and its syntax.
We propose a novel approach for CI system's automatic migration: CIMig.
arXiv Detail & Related papers (2024-07-02T20:19:21Z) - DBCopilot: Natural Language Querying over Massive Databases via Schema Routing [47.009638761948466]
We present DBCopilot, a framework that addresses challenges by employing a compact and flexible copilot model for routing over massive databases.
This framework utilizes a single lightweight differentiable search index to construct semantic mappings for massive database schemata, and navigates natural language questions to their target databases and tables in a relation joint retrieval manner.
arXiv Detail & Related papers (2023-12-06T12:37:28Z) - Serving Deep Learning Model in Relational Databases [70.53282490832189]
Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains.
We highlight three pivotal paradigms: The state-of-the-art DL-centric architecture offloads DL computations to dedicated DL frameworks.
The potential UDF-centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the relational database management system (RDBMS).
arXiv Detail & Related papers (2023-10-07T06:01:35Z) - SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces the SQL-PaLM framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deeply into understanding the critical paradigms that influence the performance of tuned LLMs.
arXiv Detail & Related papers (2023-05-26T21:39:05Z) - AskYourDB: An end-to-end system for querying and visualizing relational databases using natural language [0.0]
We propose a semantic parsing approach to address the challenge of converting complex natural language into SQL.
We modified state-of-the-art models with various pre- and post-processing steps, which play a significant part when a model is deployed in production.
To make the product serviceable to businesses we added an automatic visualization framework over the queried results.
arXiv Detail & Related papers (2022-10-16T13:31:32Z) - A domain-specific language for describing machine learning dataset [3.9576015470370893]
This DSL describes datasets in terms of their structure, data provenance, and social concerns.
It is implemented as a Visual Studio Code plugin, and it has been published under an open source license.
arXiv Detail & Related papers (2022-07-05T14:00:01Z) - A Unified Transferable Model for ML-Enhanced DBMS [53.46830627879208]
We propose a unified model, MTMLF, that uses a multi-task training procedure to capture the transferable knowledge across tasks and a pretrain-finetune procedure to distill the meta knowledge across DBs.
We believe this paradigm is more suitable for cloud DB services and has the potential to revolutionize the way ML is used in the future.
arXiv Detail & Related papers (2021-05-06T03:31:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.