Selective Shot Learning for Code Explanation
- URL: http://arxiv.org/abs/2412.12852v1
- Date: Tue, 17 Dec 2024 12:26:14 GMT
- Title: Selective Shot Learning for Code Explanation
- Authors: Paheli Bhattacharya, Rishabh Gupta,
- Abstract summary: State-of-the-art approaches for Selective Shot Learning (SSL) include token-based and embedding-based methods.
We present a comparative study and propose a novel SSL method (SSL_ner) that utilizes entity information for few-shot example selection.
We show the effectiveness of SSL_ner over state-of-the-art methods across two datasets.
- Score: 4.773934813915903
- License:
- Abstract: Code explanation plays a crucial role in the software engineering domain, aiding developers in grasping code functionality efficiently. Recent work shows that the performance of LLMs for code explanation improves in a few-shot setting, especially when the few-shot examples are selected intelligently. State-of-the-art approaches for such Selective Shot Learning (SSL) include token-based and embedding-based methods. However, these SSL approaches have been evaluated on proprietary LLMs, without much exploration on open-source Code-LLMs. Additionally, these methods lack consideration for programming language syntax. To bridge these gaps, we present a comparative study and propose a novel SSL method (SSL_ner) that utilizes entity information for few-shot example selection. We present several insights and show the effectiveness of SSL_ner approach over state-of-the-art methods across two datasets. To the best of our knowledge, this is the first systematic benchmarking of open-source Code-LLMs while assessing the performances of the various few-shot examples selection approaches for the code explanation task.
Related papers
- Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding [71.01099784480597]
Large language models (LLMs) excel at a range of tasks through in-context learning (ICL)
We introduce In-Context Contrastive Decoding (ICCD), a novel method that emphasizes input-label mapping.
ICCD emphasizes input-label mapping by contrasting the output distributions between positive and negative in-context examples.
arXiv Detail & Related papers (2025-02-19T14:04:46Z) - Does Few-Shot Learning Help LLM Performance in Code Synthesis? [40.35198206199065]
This work focuses on the few-shot examples present in most code generation prompts.
Our work offers 2 approaches for selecting few-shot examples, a model-free method, CODEEXEMPLAR-FREE, and a model-based method, CODEEXEMPLAR-BASED.
Both methods significantly improve CodeLlama's coding ability across the popular HumanEval+ coding benchmark.
arXiv Detail & Related papers (2024-12-03T23:19:40Z) - Studying and Benchmarking Large Language Models For Log Level Suggestion [49.176736212364496]
Large Language Models (LLMs) have become a focal point of research across various domains.
This paper investigates the impact of characteristics and learning paradigms on the performance of 12 open-source LLMs in log level suggestion.
arXiv Detail & Related papers (2024-10-11T03:52:17Z) - In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting [33.89176174108559]
In-context learning of large language models (LLMs) makes predictions only based on instructions augmented with a few examples.
Existing example selection methods for ICL utilize sparse or dense retrievers and derive effective performance.
We propose our policy-based reinforcement learning framework for example selection (RLS), which consists of a language model (LM) selector and an LLM generator.
arXiv Detail & Related papers (2024-08-23T12:32:12Z) - C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z) - A Thorough Examination of Decoding Methods in the Era of LLMs [72.65956436513241]
Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers.
This paper provides a comprehensive and multifaceted analysis of various decoding methods within the context of large language models.
Our findings reveal that decoding method performance is notably task-dependent and influenced by factors such as alignment, model size, and quantization.
arXiv Detail & Related papers (2024-02-10T11:14:53Z) - Automatic Smart Contract Comment Generation via Large Language Models
and In-Context Learning [11.52122354673779]
In this study, we propose an approach SCCLLM based on large language models (LLMs) and in-context learning.
Specifically, in the demonstration selection phase, SCCLLM retrieves the top-k code snippets from the historical corpus.
In the in-context learning phase, SCCLLM utilizes the retrieved code snippets as demonstrations.
arXiv Detail & Related papers (2023-11-17T08:31:09Z) - Active Learning Principles for In-Context Learning with Large Language
Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z) - Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs)
Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z) - Industry Scale Semi-Supervised Learning for Natural Language
Understanding [14.844450283047234]
This paper presents a production Semi-Supervised Learning (SSL) pipeline based on the student-teacher framework.
We investigate two questions related to the use of unlabeled data in production SSL context.
We compare four widely used SSL techniques, Pseudo-Label (PL), Knowledge Distillation (KD), Virtual Adversarial Training (VAT) and Cross-View Training (CVT)
arXiv Detail & Related papers (2021-03-29T18:24:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.