Large Language Model-Aware In-Context Learning for Code Generation
- URL: http://arxiv.org/abs/2310.09748v1
- Date: Sun, 15 Oct 2023 06:12:58 GMT
- Title: Large Language Model-Aware In-Context Learning for Code Generation
- Authors: Jia Li, Ge Li, Chongyang Tao, Jia Li, Huangzhao Zhang, Fang Liu, Zhi
Jin
- Abstract summary: Large language models (LLMs) have shown impressive in-context learning (ICL) ability in code generation.
We propose a novel learning-based selection approach named LAIL (LLM-Aware In-context Learning) for code generation.
- Score: 75.68709482932903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have shown impressive in-context learning (ICL)
ability in code generation. LLMs take a prompt consisting of requirement-code
examples and a new requirement as input, and output new programs. Existing
studies have found that ICL performance is heavily dominated by the chosen
examples, which has spurred research on example selection. However, existing
approaches either select examples randomly or rely only on the textual similarity
of requirements for retrieval, leading to sub-optimal performance. In this paper, we propose a novel
learning-based selection approach named LAIL (LLM-Aware In-context Learning)
for code generation. Given a candidate example, we exploit the LLM itself to
estimate the example's usefulness, based on the generation probability of the
ground-truth program given the requirement and that example. We then label candidate examples
as positive or negative according to this probability feedback. Based on the labeled
data, we introduce a contrastive learning objective to train an effective
retriever that captures the LLM's preferences in code generation. We apply
LAIL to three LLMs and evaluate it on three representative datasets (MBJP,
MBPP, and MBCPP). LAIL outperforms the state-of-the-art baselines by
11.58%, 6.89%, and 5.07% on CodeGen, and 4.38%, 2.85%, and 2.74% on GPT-3.5 in
terms of Pass@1, respectively.
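As a rough illustration of the approach described in the abstract, the sketch below labels candidate examples by the LLM's log-probability of the ground-truth program and trains the retriever with an InfoNCE-style contrastive loss. All names (`log_prob_of_target`, the top-k labeling rule, the prompt layout) are assumptions for illustration, not details from the paper.

```python
# Illustrative sketch of LAIL-style labeling and contrastive retriever training.
# `log_prob_of_target` and the top-k rule are assumptions, not the paper's code.
from typing import Callable, List, Tuple

import torch
import torch.nn.functional as F

def label_candidates(
    requirement: str,
    ground_truth: str,
    candidates: List[Tuple[str, str]],                # (example requirement, example code)
    log_prob_of_target: Callable[[str, str], float],  # assumed LLM scoring helper
    top_k: int = 8,
) -> Tuple[List[int], List[int]]:
    """Rank candidates by how much they help the LLM produce the gold program."""
    scores = []
    for ex_req, ex_code in candidates:
        # Prompt = candidate example + new requirement; score the gold program.
        prompt = f"# Requirement: {ex_req}\n{ex_code}\n# Requirement: {requirement}\n"
        scores.append(log_prob_of_target(prompt, ground_truth))
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k], ranked[-top_k:]            # positives, negatives

def info_nce_loss(query: torch.Tensor, pos: torch.Tensor, negs: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Pull the encoded requirement toward a positive example, away from negatives.
    query: [d], pos: [d], negs: [k, d]; all assumed L2-normalised."""
    pos_sim = (query * pos).sum().unsqueeze(0)  # [1]
    neg_sim = negs @ query                      # [k]
    logits = torch.cat([pos_sim, neg_sim]) / temperature
    target = torch.zeros(1, dtype=torch.long)   # the positive sits at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)
```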
Related papers
- The First Prompt Counts the Most! An Evaluation of Large Language Models on Iterative Example-based Code Generation [33.77058239791512]
This paper presents the first comprehensive study on example-based code generation using Large Language Models (LLMs).
To address the incorrectness caused by the incompleteness of I/O examples, we adopt an iterative evaluation framework.
We assess six state-of-the-art LLMs using a new benchmark of 168 diverse target functionalities.
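A minimal sketch of such an iterative loop, assuming a hypothetical `generate_code` LLM wrapper and an `oracle` for held-out inputs (both placeholders, not the benchmark's actual API):

```python
# Schematic iterative example-based generation loop; `generate_code` and
# `oracle` are hypothetical placeholders, not the paper's interface.
from typing import Callable, List, Tuple

def iterative_generation(
    shown: List[Tuple[tuple, object]],       # visible (input, output) examples
    held_out_inputs: List[tuple],            # extra inputs probing example completeness
    generate_code: Callable[[List[Tuple[tuple, object]]], Callable],
    oracle: Callable[..., object],           # reference behaviour
    max_rounds: int = 5,
) -> Callable:
    """Generate from I/O examples, probe on held-out inputs, and feed
    counterexamples back until no new failures appear."""
    program = generate_code(shown)
    for _ in range(max_rounds):
        failures = [(args, oracle(*args)) for args in held_out_inputs
                    if program(*args) != oracle(*args)]
        if not failures:
            break
        shown = shown + failures             # reveal counterexamples next round
        program = generate_code(shown)
    return program
```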
arXiv Detail & Related papers (2024-11-11T08:05:37Z)
- Crystal: Illuminating LLM Abilities on Language and Code [58.5467653736537]
We propose a pretraining strategy to enhance the integration of natural language and coding capabilities.
The resulting model, Crystal, demonstrates remarkable capabilities in both domains.
arXiv Detail & Related papers (2024-11-06T10:28:46Z)
- LBC: Language-Based-Classifier for Out-Of-Variable Generalization [14.033963471962823]
Large Language Models (LLMs) have achieved great success in natural language processing tasks such as response generation.
We find that the pre-trained knowledge of LLMs enables them to interpret new variables that appear in a test without additional training.
We propose a Language-Based-Classifier (LBC) to maximize the benefits of LLMs and outperform traditional machine learning models (TMLs) on Out-of-Variable tasks.
arXiv Detail & Related papers (2024-08-20T15:05:02Z)
- LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification [13.319594321038926]
We propose a simple and effective transfer learning strategy, namely LLMEmbed, to address this classical but challenging task.
We perform extensive experiments on publicly available datasets, and the results show that LLMEmbed achieves strong performance while enjoying low training overhead.
arXiv Detail & Related papers (2024-06-06T03:46:59Z)
- RepEval: Effective Text Evaluation with LLM Representation [55.26340302485898]
RepEval is a metric that leverages the projection of Large Language Model (LLM) representations for evaluation.
Our work underscores the richness of information regarding text quality embedded within LLM representations, offering insights for the development of new metrics.
arXiv Detail & Related papers (2024-04-30T13:50:55Z)
- LLMaAA: Making Large Language Models as Active Annotators [32.57011151031332]
We propose LLMaAA, which takes large language models as annotators and puts them into an active learning loop to determine what to annotate efficiently.
We conduct experiments and analysis on two classic NLP tasks, named entity recognition and relation extraction.
With LLMaAA, task-specific models trained from LLM-generated labels can outperform the teacher within only hundreds of annotated examples.
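The loop can be pictured roughly as follows; component names are illustrative, and the paper's acquisition strategy and reweighting details are more involved:

```python
# Rough sketch of an LLM-as-annotator active-learning loop; component names
# are illustrative and the paper's acquisition/reweighting details differ.
import random
from typing import Callable, List

def llm_active_annotation(
    unlabeled: List[str],
    llm_annotate: Callable[[str], str],           # LLM acting as the annotator
    train_model: Callable[[list], object],        # trains the small task model
    uncertainty: Callable[[object, str], float],  # acquisition score
    rounds: int = 10,
    batch_size: int = 100,
) -> object:
    labeled, model = [], None
    for _ in range(rounds):
        if model is None:                         # first round: random batch
            batch = random.sample(unlabeled, min(batch_size, len(unlabeled)))
        else:                                     # later: most-uncertain examples
            ranked = sorted(unlabeled, key=lambda x: uncertainty(model, x),
                            reverse=True)
            batch = ranked[:batch_size]
        unlabeled = [x for x in unlabeled if x not in batch]
        labeled.extend((x, llm_annotate(x)) for x in batch)
        model = train_model(labeled)              # student learns from LLM labels
    return model
```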
arXiv Detail & Related papers (2023-10-30T14:54:15Z)
- Learning to Retrieve In-Context Examples for Large Language Models [69.9707552694766]
Large language models (LLMs) have demonstrated their ability to learn in-context.
The effectiveness of in-context learning is heavily reliant on the quality of the selected examples.
We propose a novel framework to iteratively train dense retrievers that can identify high-quality in-context examples.
arXiv Detail & Related papers (2023-07-14T05:23:08Z)
- PALR: Personalization Aware LLMs for Recommendation [7.407353565043918]
PALR aims to combine user history behaviors (such as clicks, purchases, and ratings) with large language models (LLMs) to generate user preferred items.
Our solution outperforms state-of-the-art models on various sequential recommendation tasks.
arXiv Detail & Related papers (2023-05-12T17:21:33Z)
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder and decoder-based models into a single prefix-LM.
For learning methods, we explore the claim of a "free lunch" hypothesis.
For data distributions, we explore how a mixture distribution and multi-epoch training on programming and natural languages affect model performance.
arXiv Detail & Related papers (2023-05-03T17:55:25Z)
- LEVER: Learning to Verify Language-to-Code Generation with Execution [64.36459105535]
We propose LEVER, a simple approach to improve language-to-code generation by learning to verify the generated programs with their execution results.
Specifically, we train verifiers to determine whether a program sampled from the LLMs is correct or not based on the natural language input, the program itself and its execution results.
LEVER consistently improves over the base code LLMs (4.6% to 10.9% with code-davinci) and achieves new state-of-the-art results on all of them.
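A simplified sketch of this verify-then-rerank idea, with placeholder names and a joint score that folds the verifier's probability into the LLM's log-probability (the paper additionally aggregates probabilities across programs with the same execution result):

```python
# Simplified LEVER-style reranking; names are placeholders and the scoring
# is reduced to a single joint term for illustration.
import math
from typing import Callable, List, Tuple

def rerank_with_verifier(
    nl_input: str,
    samples: List[Tuple[str, float]],                 # (program, LLM log-probability)
    execute: Callable[[str], str],                    # runs a candidate program
    verifier_prob: Callable[[str, str, str], float],  # P(correct | nl, prog, result)
) -> str:
    scored = []
    for program, log_prob in samples:
        result = execute(program)                     # run the candidate
        v = verifier_prob(nl_input, program, result)  # learned verifier
        scored.append((log_prob + math.log(max(v, 1e-9)), program))
    best_score, best_program = max(scored)            # highest joint score wins
    return best_program
```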
arXiv Detail & Related papers (2023-02-16T18:23:22Z)