The Impact of LLM-Assistants on Software Developer Productivity: A Systematic Literature Review
- URL: http://arxiv.org/abs/2507.03156v1
- Date: Thu, 03 Jul 2025 20:25:49 GMT
- Title: The Impact of LLM-Assistants on Software Developer Productivity: A Systematic Literature Review
- Authors: Amr Mohamed, Maram Assi, Mariam Guizani
- Abstract summary: Large language model assistants (LLM-assistants) present new opportunities to transform software development. Despite growing interest, there is no synthesis of how LLM-assistants affect software developer productivity. Our analysis reveals that LLM-assistants offer both considerable benefits and critical risks.
- Score: 4.503986781849658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language model assistants (LLM-assistants) present new opportunities to transform software development. Developers are increasingly adopting these tools across tasks, including coding, testing, debugging, documentation, and design. Yet, despite growing interest, there is no synthesis of how LLM-assistants affect software developer productivity. In this paper, we present a systematic literature review of 37 peer-reviewed studies published between January 2014 and December 2024 that examine this impact. Our analysis reveals that LLM-assistants offer both considerable benefits and critical risks. Commonly reported gains include minimized code search, accelerated development, and the automation of trivial and repetitive tasks. However, studies also highlight concerns around cognitive offloading, reduced team collaboration, and inconsistent effects on code quality. While the majority of studies (92%) adopt a multi-dimensional perspective by examining at least two SPACE dimensions, reflecting increased awareness of the complexity of developer productivity, only 14% extend beyond three dimensions, indicating substantial room for more integrated evaluations. Satisfaction, Performance, and Efficiency are the most frequently investigated dimensions, whereas Communication and Activity remain underexplored. Most studies are exploratory (64%) and methodologically diverse, but lack longitudinal and team-based evaluations. This review surfaces key research gaps and provides recommendations for future research and practice. All artifacts associated with this study are publicly available at https://zenodo.org/records/15788502.
Related papers
- A Systematic Literature Review on Detecting Software Vulnerabilities with Large Language Models [2.518519330408713]
Large Language Models (LLMs) in software engineering have sparked interest in their use for software vulnerability detection. The rapid development of this field has resulted in a fragmented research landscape. This fragmentation makes it difficult to obtain a clear overview of the state-of-the-art or compare and categorize studies meaningfully.
arXiv Detail & Related papers (2025-07-30T13:17:16Z)
- Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs [58.24692529185971]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
- AAAR-1.0: Assessing AI's Potential to Assist Research [34.88341605349765]
We introduce AAAR-1.0, a benchmark dataset designed to evaluate large language model (LLM) performance in three fundamental, expertise-intensive research tasks. AAAR-1.0 differs from prior benchmarks in two key ways: first, it is explicitly research-oriented, with tasks requiring deep domain expertise; second, it is researcher-oriented, mirroring the primary activities that researchers engage in on a daily basis.
arXiv Detail & Related papers (2024-10-29T17:58:29Z)
- The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead? [60.01746782465275]
Large Language Models (LLMs) have shown capabilities close to human performance in various analytical tasks.
This paper investigates the efficiency and accuracy of LLMs in specialized tasks through a structured user study focusing on Human-LLM partnership.
arXiv Detail & Related papers (2024-10-07T02:30:18Z)
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
- Automatically Analyzing Performance Issues in Android Apps: How Far Are We? [15.614257662319863]
We conduct a large-scale comparative study of Android performance issues in real-world applications and literature.
Our results show a substantial divergence exists in the primary performance concerns of researchers, developers, and users.
It is crucial for our community to intensify efforts to bridge these gaps and achieve comprehensive detection and resolution of performance issues.
arXiv Detail & Related papers (2024-07-06T14:43:40Z)
- Self-Admitted Technical Debt Detection Approaches: A Decade Systematic Review [5.670597842524448]
Technical debt (TD) represents the long-term costs associated with suboptimal design or code decisions in software development.
Self-Admitted Technical Debt (SATD) occurs when developers explicitly acknowledge these trade-offs.
Automated detection of SATD has become an increasingly important research area.
arXiv Detail & Related papers (2023-12-19T12:01:13Z)
- The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains.
This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
- Towards an Understanding of Large Language Models in Software Engineering Tasks [29.30433406449331]
Large Language Models (LLMs) have drawn widespread attention and research due to their astounding performance in text generation and reasoning tasks. The evaluation and optimization of LLMs in software engineering tasks, such as code generation, have become a research focus. This paper comprehensively investigates and collates the research and products combining LLMs with software engineering.
arXiv Detail & Related papers (2023-08-22T12:37:29Z)
- Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
- Open Source Software for Efficient and Transparent Reviews [0.11179881480027788]
ASReview is an open source machine learning-aided pipeline applying active learning.
We demonstrate by means of simulation studies that ASReview can yield far more efficient reviewing than manual reviewing.
arXiv Detail & Related papers (2020-06-22T11:57:10Z)