Sources of Underproduction in Open Source Software
- URL: http://arxiv.org/abs/2401.11281v1
- Date: Sat, 20 Jan 2024 17:21:24 GMT
- Title: Sources of Underproduction in Open Source Software
- Authors: Kaylea Champion and Benjamin Mako Hill
- Abstract summary: Open source software relies on individuals who select their own tasks.
We examine the social and technical factors associated with underproduction.
Having higher numbers of contributors is associated with higher underproduction risk.
- Score: 7.168628921229442
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Because open source software relies on individuals who select their own
tasks, it is often underproduced -- a term used by software engineering
researchers to describe when a piece of software's relative quality is lower
than its relative importance. We examine the social and technical factors
associated with underproduction through a comparison of software packaged by
the Debian GNU/Linux community. We test a series of hypotheses developed from a
reading of prior research in software engineering. Although we find that
software age and programming language age offer a partial explanation for
variation in underproduction, we were surprised to find that the association
between underproduction and package age is weaker at high levels of programming
language age. With respect to maintenance efforts, we find that additional
resources are not always tied to better outcomes. In particular, having higher
numbers of contributors is associated with higher underproduction risk. Also,
contrary to our expectations, maintainer turnover and maintenance by a declared
team are not associated with lower rates of underproduction. Finally, we find
that the people working on bugs in underproduced packages tend to be those who
are more central to the community's collaboration network structure, although
contributors' betweenness centrality (often associated with brokerage in social
networks) is not associated with underproduction.
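The abstract's definition of underproduction (relative quality lower than relative importance) can be illustrated with a small rank comparison. This is a simplified sketch and not the authors' statistical method. The package names, the quality and importance scores, and the helpers `rank_fraction` and `underproduction_gap` are all hypothetical and serve only as illustration.

```python
from typing import Dict


def rank_fraction(scores: Dict[str, float]) -> Dict[str, float]:
    """Map each package to a relative rank in (0, 1]; the highest score gets 1.0.

    Ties are broken by insertion order, which is enough for a sketch.
    """
    ordered = sorted(scores, key=scores.get)  # ascending by score
    n = len(ordered)
    return {name: (i + 1) / n for i, name in enumerate(ordered)}


def underproduction_gap(
    quality: Dict[str, float], importance: Dict[str, float]
) -> Dict[str, float]:
    """Relative importance minus relative quality, per package.

    A positive gap means the package ranks higher on importance than on
    quality, i.e. it looks underproduced under this toy definition.
    """
    q = rank_fraction(quality)
    imp = rank_fraction(importance)
    return {name: imp[name] - q[name] for name in quality}


# Hypothetical inputs: higher quality score = better, higher importance = more used.
quality = {"pkg_a": 0.9, "pkg_b": 0.2, "pkg_c": 0.5}
importance = {"pkg_a": 100, "pkg_b": 5000, "pkg_c": 800}

gaps = underproduction_gap(quality, importance)
most_underproduced = max(gaps, key=gaps.get)
```

In this toy data, `pkg_b` is the most heavily used package but has the lowest quality score, so it has the largest positive gap. A package whose two ranks agree, such as `pkg_c`, has a gap of zero.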
Related papers
- The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot [4.8256226973915455]
We investigate the role of GitHub Copilot, a generative AI pair programmer, in software development in the open-source community.
We find that Copilot significantly enhances project-level productivity by 6.5%.
We conclude that AI pair programmers help developers automate and augment their code, but human developers' knowledge of software projects can enhance the benefits.
arXiv Detail & Related papers (2024-10-02T23:26:10Z)
- Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents [109.8507367518992]
We introduce OpenHands, a platform for the development of AI agents that interact with the world in similar ways to a human developer.
We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
arXiv Detail & Related papers (2024-07-23T17:50:43Z)
- Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data [49.1574468325115]
ChatGPT is an AI tool that enhances software production efficiency.
We estimate ChatGPT's effects on the number of git pushes, repositories, and unique developers per 100,000 people.
These results suggest that AI tools like ChatGPT can substantially boost developer productivity, though further analysis is needed to address potential downsides such as low quality code and privacy concerns.
arXiv Detail & Related papers (2024-06-16T19:11:15Z)
- Towards a Structural Equation Model of Open Source Blockchain Software Health [0.0]
This work uses exploratory factor analysis to identify latent constructs that are representative of general public interest or popularity in software.
We find that interest is a combination of stars, forks, and text mentions in the GitHub repository, while a second factor for robustness is composed of a criticality score.
A structural model of software health is proposed such that general interest positively influences developer engagement, which, in turn, positively predicts software robustness.
arXiv Detail & Related papers (2023-10-31T08:47:41Z)
- Embedded Software Development with Digital Twins: Specific Requirements for Small and Medium-Sized Enterprises [55.57032418885258]
Digital twins have the potential for cost-effective software development and maintenance strategies.
We interviewed SMEs about their current development processes.
First results show that real-time requirements prevent, to date, a Software-in-the-Loop development approach.
arXiv Detail & Related papers (2023-09-17T08:56:36Z)
- Collaborative, Code-Proximal Dynamic Software Visualization within Code Editors [55.57032418885258]
This paper introduces the design and proof-of-concept implementation for a software visualization approach that can be embedded into code editors.
Our contribution differs from related work in that we use dynamic analysis of a software system's runtime behavior.
Our visualization approach enhances common remote pair programming tools and is collaboratively usable by employing shared code cities.
arXiv Detail & Related papers (2023-08-30T06:35:40Z)
- Comparing Software Developers with ChatGPT: An Empirical Investigation [0.0]
This paper conducts an empirical investigation, contrasting the performance of software engineers and AI systems, like ChatGPT, across different evaluation metrics.
The paper posits that a comprehensive comparison of software engineers and AI-based solutions, considering various evaluation criteria, is pivotal in fostering human-machine collaboration.
arXiv Detail & Related papers (2023-05-19T17:25:54Z)
- The GitHub Development Workflow Automation Ecosystems [47.818229204130596]
Large-scale software development has become a highly collaborative endeavour.
This chapter explores the ecosystems of development bots and GitHub Actions.
It provides an extensive survey of the state-of-the-art in this domain.
arXiv Detail & Related papers (2023-05-08T15:24:23Z)
- Big Data = Big Insights? Operationalising Brooks' Law in a Massive GitHub Data Set [1.1470070927586014]
We study challenges that can explain the disagreement between recent studies of developer productivity in massive repository data.
We provide, to the best of our knowledge, the largest, curated corpus of GitHub projects tailored to investigate the influence of team size and collaboration patterns on individual and collective productivity.
arXiv Detail & Related papers (2022-01-12T17:25:30Z)
- Underproduction: An Approach for Measuring Risk in Open Source Software [9.701036831490766]
'Underproduction' occurs when the supply of software engineering labor becomes out of alignment with the demand of people who rely on the software produced.
We present a conceptual framework for identifying relative underproduction in software as well as a statistical method for applying our framework to a comprehensive dataset.
arXiv Detail & Related papers (2021-02-27T23:18:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.