Fugu-MT Paper Translation (Abstract): FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences

Paper overview: FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences

arxiv url: http://arxiv.org/abs/2606.03330v1
Date: Tue, 02 Jun 2026 08:39:50 GMT
Status: Translation complete
Last updated in system: 2026-06-03 22:00:04.87595
Title: FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences
Title (reference translation): FLIPS: Instance Fingerprinting for LLMs via Pseudo-random Sequences
Authors: Gurvan Richardeau, Gohar Dashyan, Erwan Le Merrer, Gilles Tredan
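Abstract summary: We introduce instance-level fingerprinting, a regulator-oriented paradigm that distinguishes configurations of the same large language model. The proposed method, FLIPS, exploits biases in generated binary random sequences to reach 96% (closed-set) and 90% (open-set) identification accuracy across 237 model instances. This shows that instance-level fingerprinting is both necessary for regulation and practically feasible.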
Reference score (independently computed attention score): 4.351505522514463
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Literature reveals that a Large Language Model's (LLM) behavior is conditioned not only by its original weights but also by its instance-level parameters, such as instructional prompt, sampling configuration or quantization. A model that generates safe outputs under one configuration may produce toxic content under another. However, current LLM identification techniques (such as fingerprinting) focus on intellectual property protection, and their design favors robustness to changes in these instance-level parameters. This poses a critical challenge for AI regulation in which compliance assessments target actual deployed behaviors, not model provenance. In this paper, we introduce instance-level fingerprinting, a regulator-oriented paradigm that distinguishes configurations of the same LLM. Our method, FLIPS, exploits biases in generated binary random sequences to reach 96% (closed-set) and 90% (open-set, where some targets are unknown) identification accuracy across 237 model instances, versus 35% for the adapted LLMmap baseline. This shows that instance-level fingerprinting is both necessary for regulation and practically feasible. Code available at https://github.com/GurvanR/FLIPS-LLM-Instance-Fingerprinting.
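Abstract (reference translation): According to the literature, the behavior of a Large Language Model (LLM) is conditioned not only by its original weights but also by instance-level parameters such as the instructional prompt, sampling configuration, and quantization. A model that generates safe outputs under one configuration may generate harmful content under another. However, current LLM identification techniques (such as fingerprinting) focus on intellectual property protection, and they are designed to be robust to changes in these instance-level parameters. This is a critical challenge for AI regulation, where compliance assessments target actual deployed behavior rather than model provenance. This paper introduces instance-level fingerprinting, a regulator-oriented paradigm that distinguishes configurations of the same LLM. The proposed method, FLIPS, exploits biases in generated binary random sequences to reach 96% (closed-set) and 90% (open-set, where some targets are unknown) identification accuracy across 237 model instances, compared with 35% for the adapted LLMmap baseline. This shows that instance-level fingerprinting is both necessary for regulation and practically feasible. The code is available at https://github.com/GurvanR/FLIPS-LLM-Instance-Fingerprinting.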


Related papers