Related papers: Lightweight Model Editing for LLMs to Correct Deprecated API Recommendations

Lightweight Model Editing for LLMs to Correct Deprecated API Recommendations

URL: http://arxiv.org/abs/2511.21022v1
Date: Wed, 26 Nov 2025 03:36:34 GMT
Title: Lightweight Model Editing for LLMs to Correct Deprecated API Recommendations
Authors: Guancheng Lin, Xiao Yu, Jacky Keung, Xing Hu, Xin Xia, Alex X. Liu,
Abstract summary: Pre-trained Large Language Models (LLMs) have demonstrated strong performance in code completion tasks.<n>LLMs frequently generate deprecated APIs that will no longer be supported in future versions of third-party libraries.<n>We propose AdaLoRA-L, which defines "Common API Layers" (layers with high importance across all APIs, storing general knowledge and excluded from editing) and restricts edits exclusively to "Specific API Layers"<n> Experimental results demonstrate that AdaLoRA-L significantly improves Specificity while maintaining comparable performance across other evaluation metrics.
Score: 15.586818028794942
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Pre-trained or fine-tuned on large code corpora, Large Language Models (LLMs) have demonstrated strong performance in code completion tasks. However, their embedded knowledge is constrained by the timeliness of training data, which often includes code using deprecated APIs. Consequently, LLMs frequently generate deprecated APIs that will no longer be supported in future versions of third-party libraries. While retraining LLMs on updated codebases could refresh their API knowledge, this approach is computationally expensive. Recently, lightweight model editing methods have emerged to efficiently correct specific knowledge in LLMs. However, it remains unclear whether these methods can effectively update deprecated API knowledge and enable edited models to generate up-to-date APIs. To address this gap, we conduct the first systematic study applying 10 state-of-the-art model editing techniques to update deprecated API knowledge in three LLMs: Qwen2.5-Coder, StarCoder2, and DeepSeek-Coder. We introduce EDAPIBench, a dedicated benchmark featuring over 70 deprecated APIs from 8 popular Python libraries, with more than 3,000 editing instances. Our results show that the parameter-efficient fine-tuning method AdaLoRA achieves the best performance in enabling edited models to generate correct, up-to-date APIs, but falls short in Specificity (i.e., the editing influences untargeted knowledge). To resolve this, we propose AdaLoRA-L, which defines "Common API Layers" (layers within the LLMs with high importance across all APIs, storing general knowledge and excluded from editing) and restricts edits exclusively to "Specific API Layers" (layers with high importance only for the target API, storing the API-specific knowledge). Experimental results demonstrate that AdaLoRA-L significantly improves Specificity while maintaining comparable performance across other evaluation metrics.

Related papers

Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS [52.483888557864326]
APIKG4SYN is a framework designed to exploit API knowledge graphs for the construction of API-oriented question-code pairs.<n>We build the first benchmark for HarmonyOS code generation using APIKG4SYN.
arXiv Detail & Related papers (2025-11-29T08:13:54Z)
APIRAT: Integrating Multi-source API Knowledge for Enhanced Code Translation with LLMs [6.522570957351905]
APIRAT is a novel code translation method that integrates multi-source API knowledge.<n> APIRAT employs three API knowledge augmentation techniques, including API sequence retrieval, API sequence back-translation, and API mapping.<n>Experiments indicate that APIRAT significantly surpasses existing LLM-based methods, achieving improvements in computational accuracy ranging from 4% to 15.1%.
arXiv Detail & Related papers (2025-04-21T04:24:49Z)
Identifying and Mitigating API Misuse in Large Language Models [26.4403427473915]
API misuse in code generated by large language models (LLMs) represents a serious emerging challenge in software development.<n>This paper presents the first comprehensive study of API misuse patterns in LLM-generated code, analyzing both method selection and parameter usage across Python and Java.<n>We propose Dr.Fix, a novel LLM-based automatic program repair approach for API misuse based on the aforementioned taxonomy.
arXiv Detail & Related papers (2025-03-28T18:43:12Z)
When LLMs Meet API Documentation: Can Retrieval Augmentation Aid Code Generation Just as It Helps Developers? [10.204379646375182]
Retrieval-augmented generation (RAG) has increasingly shown its power in extending large language models' (LLMs') capability beyond their pre-trained knowledge.<n>We study the factors that affect the effectiveness of using the documentation of less common API libraries as additional knowledge for retrieval and generation.
arXiv Detail & Related papers (2025-03-19T14:08:47Z)
Your Fix Is My Exploit: Enabling Comprehensive DL Library API Fuzzing with Large Language Models [49.214291813478695]
Deep learning (DL) libraries, widely used in AI applications, often contain vulnerabilities like overflows and use buffer-free errors.<n>Traditional fuzzing struggles with the complexity and API diversity of DL libraries.<n>We propose DFUZZ, an LLM-driven fuzzing approach for DL libraries.
arXiv Detail & Related papers (2025-01-08T07:07:22Z)
ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solution.<n> Experimental results demonstrate that ExploraCoder significantly improves performance for models lacking prior API knowledge.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models [14.665460257371164]
Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation. We propose AutoAPIEval, a framework designed to evaluate the capabilities of LLMs in API-oriented code generation.
arXiv Detail & Related papers (2024-09-23T17:22:09Z)
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates [77.81663273436375]
We present CodeUpdateArena, a benchmark for knowledge editing in the code domain.<n>An instance in our benchmark consists of a synthetic API function update paired with a program synthesis example.<n>Our benchmark covers updates of various types to 54 functions from seven diverse Python packages.
arXiv Detail & Related papers (2024-07-08T17:55:04Z)
LLMs Meet Library Evolution: Evaluating Deprecated API Usage in LLM-based Code Completion [13.633501449498402]
Decomposing API usage is a problem in large language models (LLMs)-based code completion.<n>This study involved seven advanced LLMs, 145 API mappings from eight popular Python libraries, and 28,125 completion prompts.<n>We propose two lightweight fixing approaches, REPLACEAPI and INSERTPROMPT, which can serve as baseline approaches for future research.
arXiv Detail & Related papers (2024-06-14T08:44:10Z)
SoAy: A Solution-based LLM API-using Methodology for Academic Information Seeking [59.59923482238048]
SoAy is a solution-based LLM API-using methodology for academic information seeking.<n>It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence.<n>Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries. We propose a novel framework that emulates the process of programmers writing private code. We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.