Fugu-MT Paper Translation (Abstract): How Well Do LLMs Imitate Human Writing Style?

Paper summary: How Well Do LLMs Imitate Human Writing Style?

arxiv url: http://arxiv.org/abs/2509.24930v1
Date: Mon, 29 Sep 2025 15:34:40 GMT
Status: Translation complete
Last updated in system: 2025-09-30 22:32:20.093744
Title: How Well Do LLMs Imitate Human Writing Style?
Authors: Rebira Jemama, Rajesh Kumar
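Abstract summary: Large language models (LLMs) can generate fluent text, but their ability to replicate the distinctive style of a specific human author remains unclear. The paper presents a fast, training-free framework for authorship verification and style imitation analysis. It achieves 97.5% accuracy on academic essays and 94.5% in cross-domain evaluation.
Reference score (attention score computed by this site): 2.3754840025365183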
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large language models (LLMs) can generate fluent text, but their ability to replicate the distinctive style of a specific human author remains unclear. We present a fast, training-free framework for authorship verification and style imitation analysis. The method integrates TF-IDF character n-grams with transformer embeddings and classifies text pairs through empirical distance distributions, eliminating the need for supervised training or threshold tuning. It achieves 97.5% accuracy on academic essays and 94.5% in cross-domain evaluation, while reducing training time by 91.8% and memory usage by 59% relative to parameter-based baselines. Using this framework, we evaluate five LLMs from three separate families (Llama, Qwen, Mixtral) across four prompting strategies: zero-shot, one-shot, few-shot, and text completion. Results show that the prompting strategy has a more substantial influence on style fidelity than model size: few-shot prompting yields up to 23.5x higher style-matching accuracy than zero-shot, and completion prompting reaches 99.9% agreement with the original author's style. Crucially, high-fidelity imitation does not imply human-like unpredictability: human essays average a perplexity of 29.5, whereas matched LLM outputs average only 15.2. These findings demonstrate that stylistic fidelity and statistical detectability are separable, establishing a reproducible basis for future work in authorship modeling, detection, and identity-conditioned generation.
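The abstract describes classifying text pairs from TF-IDF character n-gram distances and empirical distance distributions. The sketch below is one possible reading of that idea, not the authors' implementation: it uses only the Python standard library, leaves out the transformer-embedding component entirely, and all function names, the trigram size, the smoothed IDF formula, and the decision rule (comparing how typical a distance is under each reference distribution) are assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(texts, n=3):
    """Build sparse TF-IDF vectors (dicts) over character n-grams.

    The smoothed IDF used here is an assumption, not taken from the paper.
    """
    counts = [Counter(char_ngrams(t, n)) for t in texts]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    total = len(texts)
    return [
        {g: tf * (math.log((1 + total) / (1 + doc_freq[g])) + 1.0)
         for g, tf in c.items()}
        for c in counts
    ]


def cosine_distance(a, b):
    """Cosine distance between two sparse vectors; 1.0 if either is empty."""
    dot = sum(w * b.get(g, 0.0) for g, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def classify_pair(distance, same_dists, diff_dists):
    """Label a pair using two empirical reference distributions.

    same_dists: distances observed for known same-author pairs.
    diff_dists: distances observed for known different-author pairs.
    Hypothetical rule: compare the share of same-author distances that are
    at least this large with the share of different-author distances that
    are at most this large. No threshold is tuned by hand.
    """
    p_same = sum(x >= distance for x in same_dists) / len(same_dists)
    p_diff = sum(x <= distance for x in diff_dists) / len(diff_dists)
    return "same" if p_same >= p_diff else "different"
```

In use, one would compute `cosine_distance` for a candidate pair (for example, a human essay and an LLM imitation of it) and pass it to `classify_pair` together with distances collected from labelled reference pairs. A fuller version would presumably combine this distance with one computed from transformer embeddings, as the abstract indicates.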

Related papers