AKD: Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks
- URL: http://arxiv.org/abs/2505.06267v1
- Date: Mon, 05 May 2025 22:41:19 GMT
- Title: AKD: Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks
- Authors: Ilyas Oulkadda, Julien Perez
- Abstract summary: This paper introduces Adversarial Knowledge Distillation (AKD) to distill the capabilities of larger models into smaller, more efficient ones. AKD provides a framework for enhancing model robustness, reliability, and security while improving parameter-efficiency.
- Score: 4.757470449749877
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The widespread adoption of Large Language Models (LLMs) for code generation, exemplified by GitHub Copilot (a coding extension powered by a Code-LLM to assist in code-completion tasks) surpassing a million users, highlights the transformative potential of these tools in improving developer productivity. However, this rapid growth also underscores critical concerns regarding the quality, safety, and reliability of the code they generate. As Code-LLMs evolve, they face significant challenges, including the diminishing returns of model scaling and the scarcity of new, high-quality training data. To address these issues, this paper introduces Adversarial Knowledge Distillation (AKD), a novel approach that leverages adversarially generated synthetic datasets to distill the capabilities of larger models into smaller, more efficient ones. By systematically stress-testing and refining the reasoning capabilities of Code-LLMs, AKD provides a framework for enhancing model robustness, reliability, and security while improving their parameter-efficiency. We believe this work represents a critical step toward ensuring dependable automated code generation within the constraints of existing data and the cost-efficiency of model execution.
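The abstract describes AKD only at a high level. Below is a minimal sketch of how such an adversarial distillation loop could be wired together; every callable (make_adversarial, teacher_solve, student_solve, passes_tests, finetune) is a hypothetical placeholder, not the authors' implementation.

```python
from typing import Callable, List, Tuple

def adversarial_distillation(
    make_adversarial: Callable[[str], str],    # teacher proposes a harder variant of a task
    teacher_solve: Callable[[str], str],       # teacher writes a reference solution
    student_solve: Callable[[str], str],       # student attempts the task
    passes_tests: Callable[[str, str], bool],  # (task, code) -> do the unit tests pass?
    finetune: Callable[[List[Tuple[str, str]]], None],
    seed_tasks: List[str],
    rounds: int = 3,
) -> None:
    dataset: List[Tuple[str, str]] = []
    for _ in range(rounds):
        for task in seed_tasks:
            hard_task = make_adversarial(task)        # stress-test the student
            attempt = student_solve(hard_task)
            if not passes_tests(hard_task, attempt):  # keep only failure cases
                dataset.append((hard_task, teacher_solve(hard_task)))
        finetune(dataset)                             # distill teacher behavior into the student
```

The key design choice in this reading is that only tasks the student fails are added to the distillation set, so each round concentrates training signal on the student's current weaknesses.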
Related papers
- CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation [24.090719826360342]
We introduce CodeIF, the first benchmark designed to assess the abilities of Large Language Models (LLMs) to adhere to task-oriented instructions within code generation scenarios. We conduct extensive experiments with LLMs, analyzing their strengths and limitations in meeting the demands of these tasks.
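CodeIF's exact scoring protocol is not reproduced here; as a toy illustration only, instruction adherence can be reported as the fraction of atomic instructions a generation satisfies, assuming each instruction comes with a boolean check:

```python
def adherence_rate(checks):
    """checks: list of (instruction, satisfied) pairs for one generation."""
    return sum(ok for _, ok in checks) / len(checks)

sample = [("use snake_case identifiers", True),
          ("raise ValueError on invalid input", False),
          ("include a docstring", True)]
print(f"instruction-following rate: {adherence_rate(sample):.2f}")  # 0.67
```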
arXiv Detail & Related papers (2025-02-26T14:19:49Z)
- Focused-DPO: Enhancing Code Generation Through Focused Preference Optimization on Error-Prone Points [51.40935517552926]
We introduce Focused-DPO, a framework that enhances code generation by directing preference optimization towards critical error-prone areas. By focusing on error-prone points, Focused-DPO advances the accuracy and functionality of model-generated code.
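The summary suggests a DPO-style preference loss concentrated on error-prone regions. One plausible (but assumed) reading is to compute the standard DPO margin over only the token positions marked as error-prone; the paper's exact formulation may differ. A minimal sketch over per-token log-probabilities:

```python
import math

def masked_sum(logps, mask):
    # Sum per-token log-probs only at positions flagged as error-prone.
    return sum(lp for lp, m in zip(logps, mask) if m)

def focused_dpo_loss(pol_w, ref_w, mask_w, pol_l, ref_l, mask_l, beta=0.1):
    """Toy focused-DPO loss; pol_*/ref_* are per-token log-probs of the
    policy and reference model for the chosen (w) and rejected (l) outputs."""
    margin = (masked_sum(pol_w, mask_w) - masked_sum(ref_w, mask_w)) \
           - (masked_sum(pol_l, mask_l) - masked_sum(ref_l, mask_l))
    return math.log1p(math.exp(-beta * margin))  # equals -log(sigmoid(beta * margin))
```

Here math.log1p(math.exp(-x)) is the usual -log(sigmoid(x)) DPO objective, applied to the masked margin instead of the full-sequence one.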
arXiv Detail & Related papers (2025-02-17T06:16:02Z)
- Toward Neurosymbolic Program Comprehension [46.874490406174644]
We advocate for a Neurosymbolic research direction that combines the strengths of existing DL techniques with traditional symbolic methods. We present preliminary results for our envisioned approach, aimed at establishing the first Neurosymbolic Program framework.
arXiv Detail & Related papers (2025-02-03T20:38:58Z)
- On the Adversarial Robustness of Instruction-Tuned Large Language Models for Code [4.286327408435937]
We assess the impact of diverse input challenges on the functionality and correctness of generated code using rigorous metrics and established benchmarks. Open-source models demonstrate an increased susceptibility to input perturbations, resulting in declines in functional correctness ranging from 12% to 34%. In contrast, commercial models demonstrate relatively greater resilience, with performance degradation ranging from 3% to 24%.
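A hedged sketch of this kind of robustness protocol: perturb each prompt, regenerate code, and compare functional correctness before and after. generate and passes_unit_tests are hypothetical placeholders for a Code-LLM and a unit-test harness, and the character-level perturbation is a toy stand-in for the richer transformations real studies use.

```python
import random

def perturb(prompt: str) -> str:
    # Toy noise: duplicate one character (real work uses typos, paraphrases, etc.).
    i = random.randrange(len(prompt))
    return prompt[:i] + prompt[i] + prompt[i:]

def correctness_drop(tasks, generate, passes_unit_tests):
    clean = sum(passes_unit_tests(t, generate(t)) for t in tasks)
    noisy = sum(passes_unit_tests(t, generate(perturb(t))) for t in tasks)
    return (clean - noisy) / max(clean, 1)  # relative degradation in correctness
```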
arXiv Detail & Related papers (2024-11-29T07:00:47Z)
- A Retention-Centric Framework for Continual Learning with Guaranteed Model Developmental Safety [75.8161094916476]
In real-world applications, learning-enabled systems often undergo iterative model development to address challenging or emerging tasks. Acquiring new capabilities or improving existing ones may inadvertently cause the loss of good capabilities of the old model, a phenomenon known as catastrophic forgetting. We propose a retention-centric framework with data-dependent constraints, and study how to continually develop a pretrained CLIP model for acquiring new or improving existing capabilities of image classification.
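The paper formulates retention as a data-dependent constraint; as a loose illustration only, such a constraint can be approximated with a penalty that fires when old-task loss regresses past a slack term (the authors' actual constrained formulation is more involved):

```python
def retention_objective(new_task_loss, old_task_loss_now, old_task_loss_before,
                        penalty=10.0, slack=0.0):
    # Penalize any regression on the old tasks beyond the allowed slack.
    violation = max(0.0, old_task_loss_now - old_task_loss_before - slack)
    return new_task_loss + penalty * violation
```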
arXiv Detail & Related papers (2024-10-04T22:34:58Z)
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
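As a rough sketch of what a code-based critic for question-code data construction might look like (an assumption about the mechanism, not the paper's code): execute each candidate program and keep the pair only if its output matches the reference answer.

```python
import subprocess
import sys

def critic_accepts(code: str, expected: str, timeout: int = 5) -> bool:
    """Run the candidate program and compare its stdout to the reference answer."""
    try:
        result = subprocess.run([sys.executable, "-c", code],
                                capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0 and result.stdout.strip() == expected.strip()

def filter_pairs(pairs):
    # pairs: iterable of (question, code, expected_answer) triples
    return [(q, c) for q, c, a in pairs if critic_accepts(c, a)]
```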
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
- Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, so that they become better aligned with the task of automated software improvement.
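A minimal sketch of such an iterative feedback loop, with propose_patch, run_ci, and finetune as hypothetical placeholders for the agent, the test/lint pipeline, and the training step:

```python
def improvement_loop(propose_patch, run_ci, finetune, issues, epochs=3):
    for _ in range(epochs):
        good_runs = []
        for issue in issues:
            patch, trace = propose_patch(issue)  # agent acts and records its reasoning
            if run_ci(patch):                    # tests and linters provide feedback
                good_runs.append(trace)          # keep only successful trajectories
        finetune(good_runs)                      # align the underlying LLM to the task
```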
arXiv Detail & Related papers (2024-06-24T15:45:22Z)
- Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs [36.409470894115074]
We propose the CodePLAN framework, which aims to transfer LLMs' code generation reasoning capabilities to smaller models.
Our approach improves the smaller model's code generation performance by over 130% on the challenging APPS benchmark.
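CodePLAN's training details are not given here; one hedged sketch of plan-based distillation is to have the teacher emit a natural-language plan plus code for each problem and train the smaller model on both targets. teacher_plan and teacher_code are hypothetical callables:

```python
def build_plan_distillation_set(teacher_plan, teacher_code, problems):
    data = []
    for problem in problems:
        plan = teacher_plan(problem)        # natural-language solution steps
        code = teacher_code(problem, plan)  # implementation guided by the plan
        data.append({"prompt": problem, "plan": plan, "code": code})
    return data  # fine-tune the smaller model to emit the plan, then the code
```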
arXiv Detail & Related papers (2024-03-20T03:09:54Z)
- Enhanced Automated Code Vulnerability Repair using Large Language Models [0.0]
This research addresses the complex challenge of automated repair of code vulnerabilities.
It introduces a novel format for representing code modifications, using advanced Large Language Models (LLMs).
LLMs, fine-tuned on datasets featuring C code vulnerabilities, significantly improve the accuracy and adaptability of automated code repair techniques.
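The paper's modification format is not reproduced here; as an invented illustration of the general idea, a span-level edit representation lets the model emit only the vulnerable span and its replacement rather than the whole file:

```python
MOD_TEMPLATE = "<vuln>{before}</vuln><fix>{after}</fix>"  # invented tags, for illustration

def apply_modification(source: str, before: str, after: str) -> str:
    assert before in source, "vulnerable span not found"
    return source.replace(before, after, 1)

# A training target in the span-level format, and the patch it implies:
target = MOD_TEMPLATE.format(before='os.system("ping " + host)',
                             after='subprocess.run(["ping", host], check=True)')
patched = apply_modification('os.system("ping " + host)',
                             'os.system("ping " + host)',
                             'subprocess.run(["ping", host], check=True)')
```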
arXiv Detail & Related papers (2024-01-08T09:01:29Z)
- Enhancing Large Language Models for Secure Code Generation: A Dataset-driven Study on Vulnerability Mitigation [24.668682498171776]
Large language models (LLMs) have brought significant advancements to code generation, benefiting both novice and experienced developers.
However, their training using unsanitized data from open-source repositories, like GitHub, introduces the risk of inadvertently propagating security vulnerabilities.
This paper presents a comprehensive study focused on evaluating and enhancing code LLMs from a software security perspective.
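As a toy stand-in for the paper's dataset-driven security evaluation (a real study would use a proper analyzer such as CodeQL or Bandit), generated programs can be scanned for insecure patterns and summarized as a vulnerability rate:

```python
# Crude pattern list, for illustration only; real analyzers are far more precise.
INSECURE_PATTERNS = ["eval(", "os.system(", "pickle.loads(", "hashlib.md5("]

def vulnerability_rate(generations):
    flagged = sum(any(p in code for p in INSECURE_PATTERNS)
                  for code in generations)
    return flagged / len(generations)
```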
arXiv Detail & Related papers (2023-10-25T00:32:56Z)
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
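A hedged sketch of black-box probing in this spirit: seed the model with partial code that invites a vulnerable completion, sample completions, and flag the vulnerable ones. complete and is_vulnerable are hypothetical placeholders, and the seed shown is an invented example.

```python
SEEDS = ["import hashlib\npassword_hash = hashlib."]  # nudges toward weak hashing

def probe(complete, is_vulnerable, seeds, samples_per_seed=20):
    completions = [s + complete(s)
                   for s in seeds
                   for _ in range(samples_per_seed)]
    return [c for c in completions if is_vulnerable(c)]  # vulnerable generations found
```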
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
- Data-Driven and SE-assisted AI Model Signal-Awareness Enhancement and Introspection [61.571331422347875]
We propose a data-driven approach to enhance models' signal-awareness.
We combine the SE concept of code complexity with the AI technique of curriculum learning.
We achieve up to 4.8x improvement in model signal awareness.
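The combination the summary describes (code complexity plus curriculum learning) can be sketched as ordering training programs from simple to complex; the branch-count proxy below is an assumption, not the paper's exact complexity measure:

```python
import ast

def branching_complexity(code: str) -> int:
    # Crude proxy for cyclomatic complexity: count branching constructs.
    tree = ast.parse(code)
    return sum(isinstance(node, (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp))
               for node in ast.walk(tree))

def curriculum(samples):
    # Order training programs from simple to complex for curriculum learning.
    return sorted(samples, key=branching_complexity)
```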
arXiv Detail & Related papers (2021-11-10T17:58:18Z)