Related papers: Organizational Artifacts of Code Development

URL: http://arxiv.org/abs/2105.14637v1
Date: Sun, 30 May 2021 22:04:09 GMT
Title: Organizational Artifacts of Code Development
Authors: Parisa Kaghazgaran, Nichola Lubold, Fred Morstatter
Abstract summary: We study social effects of country by measuring differences in software repositories associated with different countries. We propose a novel approach of modeling repositories based on their sequence of development activities as a sequence embedding task. We conduct a case study on repos from well-known corporations and find that country can describe the differences in development better than the company affiliation itself.
Score: 10.863006516392831
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Software is the outcome of active and effective communication between members of an organization. This has been noted with Conway's law, which states that "organizations design systems that mirror their own communication structure." However, software developers are often members of multiple organizational groups (e.g., corporate, regional), and it is unclear how association with groups beyond one's company influences the development process. In this paper, we study social effects of country by measuring differences in software repositories associated with different countries. Using a novel dataset we obtain from GitHub, we identify key properties that differentiate software repositories based upon the country of the developers. We propose a novel approach of modeling repositories based on their sequence of development activities as a sequence embedding task and, coupled with repo profile features, we achieve 79.2% accuracy in identifying the country of a repository. Finally, we conduct a case study on repos from well-known corporations and find that country can describe the differences in development better than the company affiliation itself. These results have larger implications for software development and indicate the importance of considering the multiple groups developers are associated with when considering the formation and structure of teams.

Related papers

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories. PaperCoder operates in three stages; in the planning stage, it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files. We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents [59.825725526176655]
Large Language Models (LLMs) have shown remarkable capabilities as autonomous agents. Existing benchmarks either focus on single-agent tasks or are confined to narrow domains, failing to capture the dynamics of multi-agent coordination and competition. We introduce MultiAgentBench, a benchmark designed to evaluate LLM-based multi-agent systems across diverse, interactive scenarios.
arXiv Detail & Related papers (2025-03-03T05:18:50Z)
Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework. Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
Code Ownership: The Principles, Differences, and Their Associations with Software Quality [6.123324869194196]
We investigate the differences in the commonly used ownership approximations in terms of the set of developers, the approximated code ownership values, and the expertise level. We find that commit-based and line-based ownership approximations produce different sets of developers, different code ownership values, and different sets of major developers.
arXiv Detail & Related papers (2024-08-23T03:01:59Z)
Multi-Agent Software Development through Cross-Team Collaboration [30.88149502999973]
We introduce Cross-Team Collaboration (CTC), a scalable multi-team framework for software development. CTC enables orchestrated teams to jointly propose various decisions and communicate their insights. Results show a notable increase in quality compared to state-of-the-art baselines.
arXiv Detail & Related papers (2024-06-13T10:18:36Z)
How to Understand Whole Software Repository? [64.19431011897515]
An excellent understanding of the whole repository will be the critical path to Automatic Software Engineering (ASE). We develop a novel method named RepoUnderstander by guiding agents to comprehensively understand whole repositories. To better utilize the repository-level knowledge, we guide the agents to summarize, analyze, and plan.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
Governing the Commons: Code Ownership and Code-Clones in Large-Scale Software Development [6.249768559720122]
In software development organizations employing weak or collective ownership, different teams are allowed and expected to autonomously perform changes in various components. Our objective is to understand how and why different teams introduce technical debt in the form of code clones as they change different components.
arXiv Detail & Related papers (2024-05-24T18:23:51Z)
MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing. As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework. This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
The GitHub Development Workflow Automation Ecosystems [47.818229204130596]
Large-scale software development has become a highly collaborative endeavour. This chapter explores the ecosystems of development bots and GitHub Actions. It provides an extensive survey of the state-of-the-art in this domain.
arXiv Detail & Related papers (2023-05-08T15:24:23Z)
Detecting and Optimising Team Interactions in Software Development [58.720142291102135]
This paper presents a data-driven approach to detect the functional interaction structure for software development teams. Our approach considers differences in the activity levels of team members and uses a block-constrained configuration model. We show how our approach enables teams to compare their functional interaction structure against synthetically created benchmark scenarios.
arXiv Detail & Related papers (2023-02-28T14:53:29Z)
Big Data = Big Insights? Operationalising Brooks' Law in a Massive GitHub Data Set [1.1470070927586014]
We study challenges that can explain the disagreement between recent studies of developer productivity in massive repository data. We provide, to the best of our knowledge, the largest, curated corpus of GitHub projects tailored to investigate the influence of team size and collaboration patterns on individual and collective productivity.
arXiv Detail & Related papers (2022-01-12T17:25:30Z)
S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning. It is based on a biLSTM encoder and a fully-connected classifier to compute similarity. Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
arXiv Detail & Related papers (2021-03-18T21:10:41Z)
ConE: A Concurrent Edit Detection Tool for Large Scale Software Development [16.11297015618479]
ConE proactively detects concurrent edits to help mitigate the problems caused by them. We present the results of ConE's deployment through early intervention techniques such as pull request notifications.
arXiv Detail & Related papers (2021-01-16T22:55:44Z)