Trusting code in the wild: A social network-based centrality rating for
developers in the Rust ecosystem
- URL: http://arxiv.org/abs/2306.00240v1
- Date: Wed, 31 May 2023 23:24:03 GMT
- Authors: Nasif Imtiaz, Preya Shabrina, Laurie Williams
- Abstract summary: This study builds a social network of 6,949 developers from the collaboration activity in 1,644 Rust packages.
We evaluate whether code from a developer with a higher centrality rating is likely to be accepted with less scrutiny by downstream projects.
- Score: 1.3581810800092387
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As modern software extensively uses open source packages, developers
regularly pull in new upstream code through frequent updates. While a manual
review of all upstream changes may not be practical, developers may rely on the
authors' and reviewers' identities, among other factors, to decide what level
of review the new code may require. The goal of this study is to help
downstream project developers prioritize review efforts for upstream code by
providing a social network-based centrality rating for the authors and
reviewers of that code. To that end, we build a social network of 6,949
developers across the collaboration activity from 1,644 Rust packages. Further,
we survey the developers in the network to evaluate if code coming from a
developer with a higher centrality rating is likely to be accepted with less
scrutiny by the downstream projects and, therefore, is perceived to be more
trusted. Our results show that 97.7% of the developers from the studied
packages are interconnected via collaboration, with each developer separated
from another via only four other developers in the network. The interconnection
among developers from different Rust packages establishes the ground for
identifying the central developers in the ecosystem. Our survey responses
(N=206) show that the respondents are more likely to not differentiate
between developers in deciding how to review upstream changes (60.2% of the
time). However, when they do differentiate, our statistical analysis showed a
significant correlation between developers' centrality ratings and the level of
scrutiny their code might face from the downstream projects, as indicated by
the respondents.
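
To make the network idea concrete, here is a minimal sketch of how a collaboration graph and a centrality rating could be computed. The abstract does not say which centrality measure or tooling the authors used, so closeness centrality is an assumption chosen for illustration, and the developer names and author/reviewer pairs are hypothetical. The sketch also computes the two network-level figures the abstract reports in kind: the share of developers in the largest connected group and the mean distance between them.

```python
from collections import deque


def build_graph(collaborations):
    """Build an undirected graph; an edge links two developers who collaborated."""
    graph = {}
    for author, reviewer in collaborations:
        graph.setdefault(author, set())
        graph.setdefault(reviewer, set())
        if author != reviewer:
            graph[author].add(reviewer)
            graph[reviewer].add(author)
    return graph


def bfs_distances(graph, source):
    """Return shortest-path hop counts from source to every reachable developer."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def largest_component(graph):
    """Return the largest set of developers connected through collaboration."""
    seen, best = set(), set()
    for node in graph:
        if node not in seen:
            component = set(bfs_distances(graph, node))
            seen |= component
            if len(component) > len(best):
                best = component
    return best


def closeness_centrality(graph, node):
    """Closeness within the node's component (an assumed, illustrative rating)."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def mean_distance(graph, nodes):
    """Average shortest-path length over all ordered pairs in a connected set."""
    total = pairs = 0
    for node in nodes:
        for other, hops in bfs_distances(graph, node).items():
            if other != node:
                total += hops
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical (author, reviewer) pairs, not data from the study.
pairs = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("erin", "frank")]
graph = build_graph(pairs)
core = largest_component(graph)

print(f"share in largest component: {len(core) / len(graph):.3f}")
print(f"mean distance in component: {mean_distance(graph, core):.3f}")
for dev in sorted(core, key=lambda d: -closeness_centrality(graph, d)):
    print(dev, round(closeness_centrality(graph, dev), 3))
```

In this toy graph, the developers in the middle of the collaboration chain receive the higher ratings. A distance of k hops corresponds to k - 1 intermediate developers, which is the sense in which the abstract describes separation.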
Related papers
- RedCode: Risky Code Execution and Generation Benchmark for Code Agents [50.81206098588923]
RedCode is a benchmark for risky code execution and generation.
RedCode-Exec provides challenging prompts that could lead to risky code execution.
RedCode-Gen provides 160 prompts with function signatures and docstrings as input to assess whether code agents will follow instructions.
Date: 2024-11-12T13:30:06Z
- Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
Date: 2024-10-29T12:21:23Z
- Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
Date: 2024-10-02T09:11:10Z
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents [109.8507367518992]
We introduce OpenHands, a platform for the development of AI agents that interact with the world in similar ways to a human developer.
We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
Date: 2024-07-23T17:50:43Z
- Trusting code in the wild: Exploring contributor reputation measures to review dependencies in the Rust ecosystem [1.0310977366592338]
We use network centrality measures to proxy contributor reputation using collaboration activity.
We find that only 24% of respondents often review dependencies before adding or updating a package.
We recommend that ecosystems like GitHub, Rust, and npm implement a contributor reputation badge to aid developers in dependency reviews.
Date: 2024-06-14T16:13:58Z
- Multi-Agent Software Development through Cross-Team Collaboration [30.88149502999973]
We introduce Cross-Team Collaboration (CTC), a scalable multi-team framework for software development.
CTC enables orchestrated teams to jointly propose various decisions and communicate with their insights.
Results show a notable increase in quality compared to state-of-the-art baselines.
Date: 2024-06-13T10:18:36Z
- The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub [2.595302141947391]
We analyse development activity on the Hugging Face (HF) Hub, a popular platform for building, sharing, and demonstrating models.
Activity is imbalanced between repositories; for example, over 70% of models have 0 downloads, while 1% account for 99% of downloads.
We find that the community has a core-periphery structure, with a core of prolific developers and a majority of isolate developers.
Date: 2024-05-20T11:10:49Z
- SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents [50.82665351100067]
FlowGen is a code generation framework that emulates software process models based on multiple Large Language Model (LLM) agents.
We evaluate FlowGenScrum on four benchmarks: HumanEval, HumanEval-ET, MBPP, and MBPP-ET.
Date: 2024-03-23T14:04:48Z
- Who is the Real Hero? Measuring Developer Contribution via Multi-dimensional Data Integration [8.735393610868435]
We propose CValue, a multidimensional information fusion-based approach to measure developer contributions.
CValue extracts both syntax and semantic information from the source code changes in four dimensions.
It fuses the information to produce the contribution score for each of the commits in the projects.
Date: 2023-08-17T13:57:44Z
- The GitHub Development Workflow Automation Ecosystems [47.818229204130596]
Large-scale software development has become a highly collaborative endeavour.
This chapter explores the ecosystems of development bots and GitHub Actions.
It provides an extensive survey of the state-of-the-art in this domain.
Date: 2023-05-08T15:24:23Z
- Code Recommendation for Open Source Software Developers [32.181023933552694]
CODER is a novel graph-based code recommendation framework for open source software developers.
Our framework achieves superior performance under various experimental settings, including intra-project, cross-project, and cold-start recommendation.
Date: 2022-10-15T16:40:36Z