Exploring Data Management Challenges and Solutions in Agile Software Development: A Literature Review and Practitioner Survey
- URL: http://arxiv.org/abs/2402.00462v2
- Date: Fri, 12 Jul 2024 15:33:59 GMT
- Title: Exploring Data Management Challenges and Solutions in Agile Software Development: A Literature Review and Practitioner Survey
- Authors: Ahmed Fawzy, Amjed Tahir, Matthias Galster, Peng Liang
- Abstract summary: Managing data related to a software product and its development poses significant challenges for software projects and agile development teams.
Challenges include integrating data from diverse sources and ensuring data quality in light of continuous change and adaptation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Managing data related to a software product and its development poses significant challenges for software projects and agile development teams. Challenges include integrating data from diverse sources and ensuring data quality in light of continuous change and adaptation. To this end, we aimed to systematically explore data management challenges and potential solutions in agile projects. We employed a mixed-methods approach, utilizing a systematic literature review (SLR) to understand the state-of-research followed by a survey with practitioners to reflect on the state-of-practice. In the SLR, we reviewed 45 studies in which we identified and categorized data management aspects and the associated challenges and solutions. In the practitioner survey, we captured practical experiences and solutions from 32 industry experts to complement the findings from the SLR. Our findings reveal major data management challenges reported in both the SLR and practitioner survey, such as managing data integration processes, capturing diverse data, automating data collection, and meeting real-time analysis requirements. Based on our findings, we present implications for practitioners and researchers, which include the necessity of developing clear data management policies, training on data management tools, and adopting new data management strategies that enhance agility, improve product quality, and facilitate better project outcomes.
Related papers
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- CMDBench: A Benchmark for Coarse-to-fine Multimodal Data Discovery in Compound AI Systems [10.71630696651595]
Compound AI systems (CASs) that employ LLMs as agents to accomplish knowledge-intensive tasks have garnered significant interest within database and AI communities.
Silos of multimodal data sources make it difficult to identify appropriate data sources for accomplishing the task at hand.
We propose CMDBench, a benchmark modeling the complexity of enterprise data platforms.
arXiv Detail & Related papers (2024-06-02T01:10:41Z)
- AIOps Solutions for Incident Management: Technical Guidelines and A Comprehensive Literature Review [0.29998889086656577]
This study proposes an AIOps terminology and taxonomy, establishing a structured incident management procedure and providing guidelines for constructing an AIOps framework.
The goal is to provide a comprehensive review of technical and research aspects in AIOps for incident management, aiming to structure knowledge, identify gaps, and establish a foundation for future developments in the field.
arXiv Detail & Related papers (2024-04-01T17:32:22Z)
- An Empirical Study of Challenges in Machine Learning Asset Management [15.07444988262748]
Despite existing research, a significant knowledge gap remains in operational challenges like model versioning, data traceability, and collaboration.
Our study aims to address this gap by analyzing 15,065 posts from developer forums and platforms.
We uncover 133 topics related to asset management challenges, grouped into 16 macro-topics, with software dependency, model deployment, and model training being the most discussed.
arXiv Detail & Related papers (2024-02-25T05:05:52Z)
- Collaborative business intelligence virtual assistant [1.9953434933575993]
This study focuses on the applications of data mining within distributed virtual teams through the interaction of users and a CBI Virtual Assistant.
The proposed virtual assistant for CBI endeavors to enhance data exploration accessibility for a wider range of users and streamline the time and effort required for data analysis.
arXiv Detail & Related papers (2023-12-20T05:34:12Z)
- Data Management For Large Language Models: A Survey [66.59562797566163]
Data plays a fundamental role in the training of Large Language Models (LLMs).
This survey provides a comprehensive overview of current research in data management within both the pretraining and supervised fine-tuning stages of LLMs.
arXiv Detail & Related papers (2023-12-04T07:42:16Z)
- Data Acquisition: A New Frontier in Data-centric AI [65.90972015426274]
We first present an investigation of current data marketplaces, revealing a lack of platforms offering detailed information about datasets.
We then introduce the DAM challenge, a benchmark to model the interaction between the data providers and acquirers.
Our evaluation of the submitted strategies underlines the need for effective data acquisition strategies in Machine Learning.
arXiv Detail & Related papers (2023-11-22T22:15:17Z)
- Instruction Tuning for Large Language Models: A Survey [52.86322823501338]
We make a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications.
We also review the potential pitfalls of IT and the criticism against it, along with efforts pointing out current deficiencies of existing strategies, and suggest some avenues for fruitful research.
arXiv Detail & Related papers (2023-08-21T15:35:16Z)
- Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming [77.38174112525168]
We present Nemo, an end-to-end interactive weak supervision (WS) system that improves the overall productivity of the WS learning pipeline by an average of 20% (and up to 47% in one task) compared to the prevailing WS approach.
arXiv Detail & Related papers (2022-03-02T19:57:32Z)
- Scaling up Search Engine Audits: Practical Insights for Algorithm Auditing [68.8204255655161]
We set up experiments for eight search engines with hundreds of virtual agents placed in different regions.
We demonstrate the successful performance of our research infrastructure across multiple data collections.
We conclude that virtual agents are a promising venue for monitoring the performance of algorithms across long periods of time.
arXiv Detail & Related papers (2021-06-10T15:49:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.