How Do Communities of ML-Enabled Systems Smell? A Cross-Sectional Study on the Prevalence of Community Smells
- URL: http://arxiv.org/abs/2504.17419v1
- Date: Thu, 24 Apr 2025 10:23:37 GMT
- Title: How Do Communities of ML-Enabled Systems Smell? A Cross-Sectional Study on the Prevalence of Community Smells
- Authors: Giusy Annunziata, Stefano Lambiase, Fabio Palomba, Gemma Catolino, Filomena Ferrucci,
- Abstract summary: We conducted an empirical study on 188 repositories from the NICHE dataset using the CADOCS tool to identify and analyze community smells.<n>We found that certain smells, such as Prima Donna Effects and Sharing Villainy, are more prevalent and fluctuate over time compared to others like Radio Silence or Organizational Skirmish.
- Score: 13.840177755312665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective software development relies on managing both collaboration and technology, but sociotechnical challenges can harm team dynamics and increase technical debt. Although teams working on ML enabled systems are interdisciplinary, research has largely focused on technical issues, leaving their socio-technical dynamics underexplored. This study aims to address this gap by examining the prevalence, evolution, and interrelations of community smells, in open-source ML projects. We conducted an empirical study on 188 repositories from the NICHE dataset using the CADOCS tool to identify and analyze community smells. Our analysis focused on their prevalence, interrelations, and temporal variations. We found that certain smells, such as Prima Donna Effects and Sharing Villainy, are more prevalent and fluctuate over time compared to others like Radio Silence or Organizational Skirmish. These insights might provide valuable support for ML project managers in addressing socio-technical issues and improving team coordination.
Related papers
- I Felt Pressured to Give 100% All the Time: How Are Neurodivergent Professionals Being Included in Software Development Teams? [0.46873264197900916]
This study seeks to understand the work experiences of neurodivergent professionals acting in different software development roles.<n>We applied the Sociotechnical Theory (STS) to investigate how the social structures of organizations and their respective work technologies influence the inclusion of these professionals.
arXiv Detail & Related papers (2025-03-12T02:28:59Z) - Towards Human-Guided, Data-Centric LLM Co-Pilots [53.35493881390917]
CliMB-DC is a human-guided, data-centric framework for machine learning co-pilots.<n>It combines advanced data-centric tools with LLM-driven reasoning to enable robust, context-aware data processing.<n>We show how CliMB-DC can transform uncurated datasets into ML-ready formats.
arXiv Detail & Related papers (2025-01-17T17:51:22Z) - Private Knowledge Sharing in Distributed Learning: A Survey [50.51431815732716]
The rise of Artificial Intelligence has revolutionized numerous industries and transformed the way society operates.
It is crucial to utilize information in learning processes that are either distributed or owned by different entities.
Modern data-driven services have been developed to integrate distributed knowledge entities into their outcomes.
arXiv Detail & Related papers (2024-02-08T07:18:23Z) - On the Interaction between Software Engineers and Data Scientists when
building Machine Learning-Enabled Systems [1.2184324428571227]
Machine Learning (ML) components have been increasingly integrated into the core systems of organizations.
One of the key challenges is the effective interaction between actors with different backgrounds who need to work closely together.
This paper presents an exploratory case study to understand the current interaction and collaboration dynamics between these roles in ML projects.
arXiv Detail & Related papers (2024-02-08T00:27:56Z) - Locating Community Smells in Software Development Processes Using
Higher-Order Network Centralities [38.72139150402261]
Community smells are negative patterns in software development teams' interactions that impede their ability to create software.
Current approaches aim to detect community smells by analysing static network representations of software teams' interaction structures.
We show that higher-order network models provide a robust means of revealing such hidden patterns and complex relationships.
arXiv Detail & Related papers (2023-09-14T06:48:15Z) - Building Cooperative Embodied Agents Modularly with Large Language
Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework.
Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z) - Machine Learning Practices Outside Big Tech: How Resource Constraints
Challenge Responsible Development [1.8275108630751844]
Machine learning practitioners from diverse occupations and backgrounds are increasingly using machine learning (ML) methods.
Past research often excludes the broader, lesser-resourced ML community.
These practitioners share many of the same ML development difficulties and ethical conundrums as their Big Tech counterparts.
arXiv Detail & Related papers (2021-10-06T17:25:21Z) - Understanding the Usability Challenges of Machine Learning In
High-Stakes Decision Making [67.72855777115772]
Machine learning (ML) is being applied to a diverse and ever-growing set of domains.
In many cases, domain experts -- who often have no expertise in ML or data science -- are asked to use ML predictions to make high-stakes decisions.
We investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners.
arXiv Detail & Related papers (2021-03-02T22:50:45Z) - On managing vulnerabilities in AI/ML systems [1.6058099298620423]
This paper explores how the current paradigm of vulnerability management might adapt to include machine learning systems.
The hypothetical scenario is structured around exploring the changes to the six areas of vulnerability management: discovery, report intake, analysis, coordination, disclosure, and response.
arXiv Detail & Related papers (2021-01-22T21:59:44Z) - Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper [50.25428141435537]
Artificial Intelligence for IT Operations (AIOps) is an emerging interdisciplinary field arising in the intersection between machine learning, big data, streaming analytics, and the management of IT operations.
Main aim of the AIOPS workshop is to bring together researchers from both academia and industry to present their experiences, results, and work in progress in this field.
arXiv Detail & Related papers (2021-01-15T10:43:10Z) - Learnings from Frontier Development Lab and SpaceML -- AI Accelerators
for NASA and ESA [57.06643156253045]
Research with AI and ML technologies lives in a variety of settings with often asynchronous goals and timelines.
We perform a case study of the Frontier Development Lab (FDL), an AI accelerator under a public-private partnership from NASA and ESA.
FDL research follows principled practices that are grounded in responsible development, conduct, and dissemination of AI research.
arXiv Detail & Related papers (2020-11-09T21:23:03Z) - A game-theoretic analysis of networked system control for common-pool
resource management using multi-agent reinforcement learning [54.55119659523629]
Multi-agent reinforcement learning has recently shown great promise as an approach to networked system control.
Common-pool resources include arable land, fresh water, wetlands, wildlife, fish stock, forests and the atmosphere.
arXiv Detail & Related papers (2020-10-15T14:12:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.