Fugu-MT Paper Translation (Abstract): PrivCode++: Latent-Conditioned Differentially Private Code Generation for Comprehensive Guarantees

Paper summary: PrivCode++: Latent-Conditioned Differentially Private Code Generation for Comprehensive Guarantees

arxiv url: http://arxiv.org/abs/2606.09145v1
Date: Mon, 08 Jun 2026 07:42:44 GMT
Status: Translation complete
System update date: 2026-06-09 14:42:06.808499
Title: PrivCode++: Latent-Conditioned Differentially Private Code Generation for Comprehensive Guarantees
Authors: Zheng Liu, Chen Gong, Terry Yue Zhuo, Zhou Yang, Kecen Li, Wenlong Meng, Xinwen Hou, Yu Liu, Xiaochen Li
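Abstract summary: Large language models fine-tuned on instruction-code pairs may memorize and subsequently leak sensitive training data. Existing differentially private (DP) code generation methods primarily protect code snippets while assuming prompts are public. PrivCode-Plus introduces a two-stage DP framework with a Privacy-Free Latent Conditioning module.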
Reference score (independently computed attention score): 27.487220334182666
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models fine-tuned on instruction-code pairs may memorize and subsequently leak sensitive training data. Existing differentially private (DP) code generation methods primarily protect code snippets while assuming prompts are public, an assumption that fails in realistic scenarios where prompts may also contain sensitive information. When prompts cannot be explicitly learned or used during generation, code synthesis suffers from severe utility degradation as well as reduced diversity and fidelity. To address these challenges, we propose PrivCode-Plus, the first work to explore DP code generation in which both prompts and code snippets are considered sensitive in LLM fine-tuning. PrivCode-Plus introduces a two-stage DP framework with a Privacy-Free Latent Conditioning module, enabling effective DP fine-tuning and data synthesis without direct access to sensitive prompts or code. Extensive experiments show that PrivCode-Plus achieves substantially higher utility than baselines, remains competitive with the method that relaxes the privacy assumptions, and provides stronger privacy guarantees.

Related papers