Related papers: Exploring Data Management Challenges and Solutions in Agile Software Development: A Literature Review and Practitioner Survey

URL: http://arxiv.org/abs/2402.00462v3
Date: Mon, 09 Dec 2024 07:02:20 GMT
Title: Exploring Data Management Challenges and Solutions in Agile Software Development: A Literature Review and Practitioner Survey
Authors: Ahmed Fawzy, Amjed Tahir, Matthias Galster, Peng Liang
Abstract summary: Managing data related to a software product and its development poses significant challenges for software projects and agile development teams. These include integrating data from diverse sources and ensuring data quality amidst continuous change and adaptation. The paper systematically explores data management challenges and potential solutions in agile projects.
Score: 4.45543024542181
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Context: Managing data related to a software product and its development poses significant challenges for software projects and agile development teams. These include integrating data from diverse sources and ensuring data quality amidst continuous change and adaptation. Objective: The paper systematically explores data management challenges and potential solutions in agile projects, aiming to provide insights into data management challenges and solutions for both researchers and practitioners. Method: We employed a mixed-methods approach, including a systematic literature review (SLR) to understand the state-of-research followed by a survey with practitioners to reflect on the state-of-practice. The SLR reviewed 45 studies, identifying and categorizing data management aspects along with their associated challenges and solutions. The practitioner survey captured practical experiences and solutions from 32 industry practitioners who were significantly involved in data management to complement the findings from the SLR. Results: Our findings identified major data management challenges in practice, such as managing data integration processes, capturing diverse data, automating data collection, and meeting real-time analysis requirements. To address these challenges, solutions such as automation tools, decentralized data management practices, and ontology-based approaches have been identified. These solutions enhance data integration, improve data quality, and enable real-time decision-making by providing flexible frameworks tailored to agile project needs. Conclusion: The study pinpointed significant challenges and actionable solutions in data management for agile development. Our findings provide practical implications for practitioners and researchers, emphasizing the development of effective data management practices and tools to address those challenges and improve project success.

Related papers

LLM-Powered Knowledge Graphs for Enterprise Intelligence and Analytics [4.968761545765129]
This paper introduces a framework that uses large language models (LLMs) to unify various data sources into a comprehensive, activity-centric knowledge graph. The framework automates tasks such as entity extraction, relationship inference, and semantic enrichment. It supports applications such as contextual search, task prioritization, expertise discovery, personalized recommendations, and advanced analytics.
arXiv Detail & Related papers (2025-03-11T02:50:45Z)
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications. The dataset includes a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning varying difficulty levels from easy to hard. We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
arXiv Detail & Related papers (2025-01-16T16:19:53Z)
Sustainable Digitalization of Business with Multi-Agent RAG and LLM [1.6385815610837167]
This research aims to explore the integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG). We propose a sustainable business solution using pre-existing LLMs that can work with diverse datasets.
arXiv Detail & Related papers (2025-01-06T08:14:23Z)
Deep Learning, Machine Learning, Advancing Big Data Analytics and Management [26.911181864764117]
Advances in artificial intelligence, machine learning, and deep learning have catalyzed the transformation of big data analytics and management. This work explores the theoretical foundations, methodological advancements, and practical implementations of these technologies. It equips researchers, practitioners, and data enthusiasts with the tools to navigate the complexities of modern data analytics.
arXiv Detail & Related papers (2024-12-03T05:59:34Z)
Deploying Large Language Models With Retrieval Augmented Generation [0.21485350418225244]
Retrieval Augmented Generation has emerged as a key approach for integrating knowledge from data sources outside of the large language model's training set. We present insights from the development and field-testing of a pilot project that integrates LLMs with RAG for information retrieval.
arXiv Detail & Related papers (2024-11-07T22:11:51Z)
A Systematic Review of NeurIPS Dataset Management Practices [7.974245534539289]
We present a systematic review of datasets published at the NeurIPS track, focusing on four key aspects: provenance, distribution, ethical disclosure, and licensing. Our findings reveal that dataset provenance is often unclear due to ambiguous filtering and curation processes. These inconsistencies underscore the urgent need for standardized data infrastructures for the publication and management of datasets.
arXiv Detail & Related papers (2024-10-31T23:55:41Z)
Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs). We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs. We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
AIOps Solutions for Incident Management: Technical Guidelines and A Comprehensive Literature Review [0.29998889086656577]
This study proposes an AIOps terminology and taxonomy, establishing a structured incident management procedure and providing guidelines for constructing an AIOps framework. The goal is to provide a comprehensive review of technical and research aspects in AIOps for incident management, aiming to structure knowledge, identify gaps, and establish a foundation for future developments in the field.
arXiv Detail & Related papers (2024-04-01T17:32:22Z)
An Empirical Study of Challenges in Machine Learning Asset Management [15.07444988262748]
Despite existing research, a significant knowledge gap remains in operational challenges like model versioning, data traceability, and collaboration. Our study aims to address this gap by analyzing 15,065 posts from developer forums and platforms. We uncover 133 topics related to asset management challenges, grouped into 16 macro-topics, with software dependency, model deployment, and model training being the most discussed.
arXiv Detail & Related papers (2024-02-25T05:05:52Z)
Data Management For Training Large Language Models: A Survey [64.18200694790787]
Data plays a fundamental role in training Large Language Models (LLMs). This survey aims to provide a comprehensive overview of current research in data management within both the pretraining and supervised fine-tuning stages of LLMs.
arXiv Detail & Related papers (2023-12-04T07:42:16Z)
Data Acquisition: A New Frontier in Data-centric AI [65.90972015426274]
We first present an investigation of current data marketplaces, revealing a lack of platforms offering detailed information about datasets. We then introduce the DAM challenge, a benchmark to model the interaction between data providers and acquirers. Our evaluation of the submitted strategies underlines the need for effective data acquisition strategies in machine learning.
arXiv Detail & Related papers (2023-11-22T22:15:17Z)
Nemo: Guiding and Contextualizing Weak Supervision for Interactive Data Programming [77.38174112525168]
We present Nemo, an end-to-end interactive weak supervision (WS) system that improves the overall productivity of the WS learning pipeline by an average of 20% (and up to 47% in one task) compared to the prevailing WS approach.
arXiv Detail & Related papers (2022-03-02T19:57:32Z)
A Field Guide to Federated Optimization [161.3779046812383]
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data. This paper provides recommendations and guidelines on formulating, designing, evaluating and analyzing federated optimization algorithms.
arXiv Detail & Related papers (2021-07-14T18:09:08Z)
Scaling up Search Engine Audits: Practical Insights for Algorithm Auditing [68.8204255655161]
We set up experiments for eight search engines with hundreds of virtual agents placed in different regions. We demonstrate the successful performance of our research infrastructure across multiple data collections. We conclude that virtual agents are a promising venue for monitoring the performance of algorithms across long periods of time.
arXiv Detail & Related papers (2021-06-10T15:49:58Z)
