Fugu-MT Paper Translation (Abstract): Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO

Paper overview: Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO

arxiv url: http://arxiv.org/abs/2603.03314v1
Date: Mon, 09 Feb 2026 13:24:50 GMT
Status: Translation complete
Last updated in system: 2026-03-09 01:20:08.142019
Title: Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO
Title (reference translation): Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO
Authors: Xin Yang, Letian Li, Abudukelimu Wuerkaixi, Xuxin Cheng, Cao Liu, Ke Zeng, Xunliang Cai, Wenyuan Jiang
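Abstract summary: Large language models (LLMs) have shown remarkable and steadily improving performance across a wide range of tasks. In real-world applications, user prompts given to LLMs are often imperfect, which can degrade the quality of the model's responses. This paper proposes a Contrastive Learning-based Inverse Direct Preference Optimization (CoIPO) method that minimizes the discrepancy between the label-aligned logits the model produces under a clean prompt and under its noisy counterpart.
Reference score (independently computed attention metric): 38.36870540583669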
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have demonstrated remarkable and steadily improving performance across a wide range of tasks. However, LLM performance may be highly sensitive to prompt variations, especially in scenarios with limited openness or strict output formatting requirements, indicating insufficient robustness. In real-world applications, user prompts provided to LLMs often contain imperfections, which may undermine the quality of the model's responses. To address this issue, previous work has primarily focused on preprocessing prompts, employing external tools or even LLMs to refine prompt formulations in advance. However, these approaches overlook the intrinsic robustness of LLMs, and their reliance on external components introduces additional computational overhead and uncertainty. In this work, we propose a Contrastive Learning-based Inverse Direct Preference Optimization (CoIPO) method that minimizes the discrepancy between the label-aligned logits produced by the model under a clean prompt and its noisy counterpart, and conduct a detailed analysis using mutual information theory. We augment the FLAN dataset by constructing paired prompts, each consisting of a clean prompt and its corresponding noisy version for training. Additionally, to evaluate the effectiveness, we develop NoisyPromptBench, a benchmark enhanced and derived from the existing PromptBench. Experimental results on NoisyPromptBench demonstrate that our proposed method achieves a significant improvement in average accuracy over the current state-of-the-art approaches. The source code of CoIPO, the pair-wise FLAN datasets, and NoisyPromptBench have been released at https://github.com/vegetable-yx/CoIPO.
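Abstract (reference translation): Large language models (LLMs) have shown remarkable and steadily improving performance across a wide range of tasks. However, LLM performance can be highly sensitive to variations in the prompt, particularly in scenarios with limited openness or strict output-format requirements, which indicates insufficient robustness. In real-world applications, the user prompts given to LLMs are often imperfect and can degrade the quality of the model's responses. To address this problem, earlier work has focused mainly on preprocessing prompts, using external tools or even other LLMs to refine the prompt wording in advance. These approaches, however, overlook the intrinsic robustness of LLMs, and their dependence on external components adds computational overhead and uncertainty. This work proposes a Contrastive Learning-based Inverse Direct Preference Optimization (CoIPO) method that minimizes the discrepancy between the label-aligned logits the model produces under a clean prompt and under its noisy counterpart, and analyzes the method in detail using mutual information theory. The authors augment the FLAN dataset with paired prompts for training, each pair consisting of a clean prompt and its corresponding noisy version. To evaluate effectiveness, they also develop NoisyPromptBench, a benchmark derived from and extending the existing PromptBench. Experiments on NoisyPromptBench show that the proposed method achieves a significant improvement in average accuracy over current state-of-the-art approaches. The source code of CoIPO, the pair-wise FLAN datasets, and NoisyPromptBench have been released at https://github.com/vegetable-yx/CoIPO.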
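
The paired-prompt construction mentioned in the abstract (a clean prompt plus its noisy counterpart) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which noise types were used, so the typo-style adjacent-letter swap below is an assumption, and the names `add_char_noise` and `make_prompt_pair` are hypothetical helpers introduced only for this illustration.

```python
import random


def add_char_noise(prompt: str, rate: float = 0.1, seed: int = 0) -> str:
    """Return a copy of `prompt` with some adjacent letter pairs swapped.

    Assumption: typo-style noise; the paper's actual noise model may differ.
    """
    rng = random.Random(seed)  # fixed seed keeps each pair reproducible
    chars = list(prompt)
    i = 0
    while i < len(chars) - 1:
        # Swap only two neighbouring letters, so spaces and punctuation stay put.
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


def make_prompt_pair(clean_prompt: str, label: str,
                     rate: float = 0.1, seed: int = 0) -> dict:
    """Bundle a clean prompt, its noisy counterpart, and the shared label."""
    return {
        "clean": clean_prompt,
        "noisy": add_char_noise(clean_prompt, rate=rate, seed=seed),
        "label": label,
    }


if __name__ == "__main__":
    pair = make_prompt_pair(
        "Is the following review positive or negative?",
        label="positive",
        rate=0.3,
        seed=1,
    )
    print(pair["clean"])
    print(pair["noisy"])
```

Both prompts in a pair share the same label, which is the property a clean-versus-noisy training objective would rely on; the swap rate and the seed are illustrative parameters, not values taken from the paper.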

Related papers