A Survey on Large Language Models for Software Engineering
- URL: http://arxiv.org/abs/2312.15223v1
- Date: Sat, 23 Dec 2023 11:09:40 GMT
- Title: A Survey on Large Language Models for Software Engineering
- Authors: Quanjun Zhang, Chunrong Fang, Yang Xie, Yaxin Zhang, Yun Yang, Weisong Sun, Shengcheng Yu, Zhenyu Chen
- Abstract summary: Large Language Models (LLMs) are used to automate a broad range of Software Engineering (SE) tasks.
We provide a systematic survey to summarize the current state-of-the-art research in the LLM-based SE community.
We present a detailed summarization of the recent SE studies for which LLMs are commonly utilized, including 155 studies for 43 specific code-related tasks.
- Score: 16.134715510164366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software Engineering (SE) is the systematic design, development, and
maintenance of software applications, underpinning the digital infrastructure
of our modern world. Very recently, the SE community has seen a rapidly
increasing number of techniques employing Large Language Models (LLMs) to
automate a broad range of SE tasks. Nevertheless, the applications, effects,
and possible limitations of LLMs within SE are still not well studied.
In this paper, we provide a systematic survey to summarize the current
state-of-the-art research in the LLM-based SE community. We summarize 30
representative LLMs of Source Code across three model architectures, 15
pre-training objectives across four categories, and 16 downstream tasks across
five categories. We then present a detailed summarization of the recent SE
studies for which LLMs are commonly utilized, including 155 studies for 43
specific code-related tasks across four crucial phases within the SE workflow.
Besides, we summarize existing attempts to empirically evaluate LLMs in SE,
such as benchmarks, empirical studies, and exploration of SE education. We also
discuss several critical aspects of optimization and applications of LLMs in
SE, such as security attacks, model tuning, and model compression. Finally, we
highlight several challenges and potential opportunities for applying LLMs in
future SE studies, such as exploring domain LLMs and constructing clean
evaluation datasets. Overall, our work can help researchers gain a
comprehensive understanding of the achievements of the existing LLM-based SE
studies and promote the practical application of these techniques. Our
artifacts are publicly available and will be continuously updated at the living
repository: https://github.com/iSEngLab/AwesomeLLM4SE.
Related papers
- Large Language Models as Software Components: A Taxonomy for LLM-Integrated Applications [0.0]
Large Language Models (LLMs) have become widely adopted recently. Research explores their use both as autonomous agents and as tools for software engineering.
LLM-integrated applications, on the other hand, are software systems that leverage an LLM to perform tasks that would otherwise be impossible or require significant coding effort.
This study provides a taxonomy for LLM-integrated applications, offering a framework for analyzing and describing these systems.
arXiv Detail & Related papers (2024-06-13T21:32:56Z)
- Analyzing LLM Usage in an Advanced Computing Class in India [4.580708389528142]
This study examines the use of large language models (LLMs) by undergraduate and graduate students for programming assignments in advanced computing classes.
We conducted a comprehensive analysis involving 411 students from a Distributed Systems class at an Indian university.
arXiv Detail & Related papers (2024-04-06T12:06:56Z)
- LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs' ability for general-purpose language understanding and generation is acquired by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity [61.54815512469125]
This survey addresses the crucial issue of factuality in Large Language Models (LLMs).
As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital.
arXiv Detail & Related papers (2023-10-11T14:18:03Z)
- Large Language Models as Data Preprocessors [10.914067455923847]
Large Language Models (LLMs), typified by OpenAI's GPT series and Meta's LLaMA variants, have marked a significant advancement in artificial intelligence.
This study expands on the applications of LLMs, exploring their potential in data preprocessing.
We propose an LLM-based framework for data preprocessing, which integrates cutting-edge prompt engineering techniques.
arXiv Detail & Related papers (2023-08-30T23:28:43Z)
- Towards an Understanding of Large Language Models in Software Engineering Tasks [32.09925582943177]
Large Language Models (LLMs) have drawn widespread attention and research due to their astounding performance in tasks such as text generation and reasoning.
This paper is the first to comprehensively investigate and collate the research and products combining LLMs with software engineering.
We have collected related literature extensively from seven mainstream databases and selected 123 papers for analysis.
arXiv Detail & Related papers (2023-08-22T12:37:29Z)
- Large Language Models for Software Engineering: A Systematic Literature Review [34.12458948051519]
Large Language Models (LLMs) have significantly impacted numerous domains, including Software Engineering (SE).
We select and analyze 395 research papers from January 2017 to January 2024 to answer four key research questions (RQs).
From the answers to these RQs, we discuss the current state-of-the-art and trends, identifying gaps in existing research, and flagging promising areas for future study.
arXiv Detail & Related papers (2023-08-21T10:37:49Z)
- A Comprehensive Overview of Large Language Models [68.22178313875618]
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks.
This article provides an overview of the existing literature on a broad range of LLM-related concepts.
arXiv Detail & Related papers (2023-07-12T20:01:52Z)
- A Survey on Evaluation of Large Language Models [87.60417393701331]
Large language models (LLMs) are gaining increasing popularity in both academia and industry.
This paper focuses on three key dimensions: what to evaluate, where to evaluate, and how to evaluate.
arXiv Detail & Related papers (2023-07-06T16:28:35Z)
- Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)