Self-Admitted GenAI Usage in Open-Source Software
- URL: http://arxiv.org/abs/2507.10422v2
- Date: Tue, 15 Jul 2025 07:34:48 GMT
- Title: Self-Admitted GenAI Usage in Open-Source Software
- Authors: Tao Xiao, Youmei Fan, Fabio Calefato, Christoph Treude, Raula Gaikovina Kula, Hideaki Hata, Sebastian Baltes
- Abstract summary: We introduce the concept of self-admitted GenAI usage, that is, developers explicitly referring to the use of GenAI tools for content creation in software artifacts. We analyze a curated sample of more than 250,000 GitHub repositories, identifying 1,292 such self-admissions across 156 repositories in commit messages, code comments, and project documentation. Our findings reveal that developers actively manage how GenAI is used in their projects, highlighting the need for project-level transparency.
- Score: 14.503048663131574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread adoption of generative AI (GenAI) tools such as GitHub Copilot and ChatGPT is transforming software development. Since generated source code is virtually impossible to distinguish from manually written code, their real-world usage and impact on open-source software development remain poorly understood. In this paper, we introduce the concept of self-admitted GenAI usage, that is, developers explicitly referring to the use of GenAI tools for content creation in software artifacts. Using this concept as a lens to study how GenAI tools are integrated into open-source software projects, we analyze a curated sample of more than 250,000 GitHub repositories, identifying 1,292 such self-admissions across 156 repositories in commit messages, code comments, and project documentation. Using a mixed methods approach, we derive a taxonomy of 32 tasks, 10 content types, and 11 purposes associated with GenAI usage based on 284 qualitatively coded mentions. We then analyze 13 documents with policies and usage guidelines for GenAI tools and conduct a developer survey to uncover the ethical, legal, and practical concerns behind them. Our findings reveal that developers actively manage how GenAI is used in their projects, highlighting the need for project-level transparency, attribution, and quality control practices in the new era of AI-assisted software development. Finally, we examine the impact of GenAI adoption on code churn in 151 repositories with self-admitted GenAI usage and find no general increase, contradicting popular narratives on the impact of GenAI on software development.
Related papers
- An Empirical Study of GenAI Adoption in Open-Source Game Development: Tools, Tasks, and Developer Challenges [1.4299470464639639]
Generative AI (GenAI) has begun to reshape how games are designed and developed, offering new tools for content creation, gameplay simulation, and design ideation. There is limited empirical understanding of how GenAI is adopted by developers in real-world contexts, especially within the open-source community. This study aims to explore how GenAI technologies are discussed, adopted, and integrated into open-source game development by analyzing issue discussions on GitHub.
arXiv Detail & Related papers (2025-07-24T02:03:12Z) - Survey of GenAI for Automotive Software Development: From Requirements to Executable Code [4.909409341455637]
Automotive software development is considered to be a significant area for GenAI adoption. Three GenAI-related technologies are covered within the state of the art: Large Language Models (LLMs), Retrieval Augmented Generation (RAG), and Vision Language Models (VLMs).
arXiv Detail & Related papers (2025-07-20T16:21:51Z) - The Impact of Generative AI on Code Expertise Models: An Exploratory Study [0.0]
We present an exploratory analysis of how a knowledge model and a Truck Factor algorithm can be affected by GenAI usage. Our findings suggest that as GenAI becomes more integrated into development, the reliability of such metrics may decrease.
arXiv Detail & Related papers (2025-07-10T20:43:08Z) - Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows [66.1850490474361]
We conduct the first academic study to explore developer interactions with coding agents. We evaluate two leading copilot and agentic coding assistants, GitHub Copilot and OpenHands. Our results show agents have the potential to assist developers in ways that surpass copilots.
arXiv Detail & Related papers (2025-07-10T20:12:54Z) - Using Generative AI in Software Design Education: An Experience Report [0.6827423171182154]
Students were required to use GenAI to help complete a team-based assignment. Students identified numerous ways ChatGPT helped them in their design process. We identified several key lessons for educators in how to deploy GenAI in a software design class effectively.
arXiv Detail & Related papers (2025-06-26T18:40:16Z) - From Recall to Reasoning: Automated Question Generation for Deeper Math Learning through Large Language Models [44.99833362998488]
We investigated the first steps for optimizing content creation for advanced math. We looked at the ability of GenAI to produce high-quality practice problems that are relevant to the course content.
arXiv Detail & Related papers (2025-05-17T08:30:10Z) - The Roles of Generative Artificial Intelligence in Internet of Electric Vehicles [65.14115295214636]
We specifically consider Internet of electric vehicles (IoEV) and we categorize GenAI for IoEV into four different layers.
We introduce various GenAI techniques used in each layer of IoEV applications.
Public datasets available for training the GenAI models are summarized.
arXiv Detail & Related papers (2024-09-24T05:12:10Z) - Ethics of Software Programming with Generative AI: Is Programming without Generative AI always radical? [0.32985979395737786]
The paper acknowledges the transformative power of GenAI in software code generation.
It posits that GenAI is not a replacement but a complementary tool for writing software code.
Ethical considerations are paramount with the paper advocating for stringent ethical guidelines.
arXiv Detail & Related papers (2024-08-20T05:35:39Z) - Genetic Auto-prompt Learning for Pre-trained Code Intelligence Language Models [54.58108387797138]
We investigate the effectiveness of prompt learning in code intelligence tasks.
Existing automatic prompt design methods are very limited to code intelligence tasks.
We propose Genetic Auto Prompt (GenAP) which utilizes an elaborate genetic algorithm to automatically design prompts.
arXiv Detail & Related papers (2024-03-20T13:37:00Z) - Generative Artificial Intelligence for Software Engineering -- A
Research Agenda [8.685607624226037]
We conducted a literature review and focus groups for a duration of five months to develop a research agenda on GenAI for Software Engineering.
Our results show that it is possible to explore the adoption of GenAI in partial automation and support decision-making in all software development activities.
Common considerations when implementing GenAI include industry-level assessment, dependability and accuracy, data accessibility, transparency, and sustainability aspects associated with the technology.
arXiv Detail & Related papers (2023-10-28T09:14:39Z) - Investigating Explainability of Generative AI for Code through
Scenario-based Design [44.44517254181818]
Generative AI (GenAI) technologies are maturing and being applied to application domains such as software engineering.
We conduct 9 workshops with 43 software engineers in which real examples from state-of-the-art generative AI models were used to elicit users' explainability needs.
Our work explores explainability needs for GenAI for code and demonstrates how human-centered approaches can drive the technical development of XAI in novel domains.
arXiv Detail & Related papers (2022-02-10T08:52:39Z) - GenNI: Human-AI Collaboration for Data-Backed Text Generation [102.08127062293111]
Table2Text systems generate textual output based on structured data utilizing machine learning.
GenNI (Generation Negotiation Interface) is an interactive visual system for high-level human-AI collaboration in producing descriptive text.
arXiv Detail & Related papers (2021-10-19T18:07:07Z) - AI Explainability 360: Impact and Design [120.95633114160688]
In 2019, we created AI Explainability 360 (Arya et al. 2020), an open source software toolkit featuring ten diverse and state-of-the-art explainability methods.
This paper examines the impact of the toolkit with several case studies, statistics, and community feedback.
The paper also describes the flexible design of the toolkit, examples of its use, and the significant educational material and documentation available to its users.
arXiv Detail & Related papers (2021-09-24T19:17:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.