Fugu-MT Paper Translation (Abstract): PromptGuard: An Orchestrated Prompting Framework for Principled Synthetic Text Generation for Vulnerable Populations using LLMs with Enhanced Safety, Fairness, and Controllability

Paper overview: PromptGuard: An Orchestrated Prompting Framework for Principled Synthetic Text Generation for Vulnerable Populations using LLMs with Enhanced Safety, Fairness, and Controllability

arxiv url: http://arxiv.org/abs/2509.08910v1
Date: Wed, 10 Sep 2025 18:14:52 GMT
Status: Translation complete
Last updated in system: 2025-09-12 16:52:24.098558
Title: PromptGuard: An Orchestrated Prompting Framework for Principled Synthetic Text Generation for Vulnerable Populations using LLMs with Enhanced Safety, Fairness, and Controllability
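Title (reference translation): PromptGuard: An Orchestrated Prompting Framework for Principled Text Generation for Vulnerable Populations Using LLMs with Improved Safety, Fairness, and Controllability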
Authors: Tung Vu, Lam Nguyen, Quynh Dao
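Abstract summary: VulnGuard Prompt is a hybrid technique that prevents harmful information generation using real-world data-driven contrastive learning. PromptGuard orchestrates six core modules: Input Classification, VulnGuard Prompting, Ethical Principles Integration, External Tool Interaction, Output Validation, and User-System Interaction. The paper proposes a comprehensive mathematical formalization, including convergence proofs, vulnerability analysis using information theory, and a theoretical validation framework.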
Reference score (independently computed attention score): 0.9131552057693698
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The proliferation of Large Language Models (LLMs) in real-world applications poses unprecedented risks of generating harmful, biased, or misleading information to vulnerable populations including LGBTQ+ individuals, single parents, and marginalized communities. While existing safety approaches rely on post-hoc filtering or generic alignment techniques, they fail to proactively prevent harmful outputs at the generation source. This paper introduces PromptGuard, a novel modular prompting framework with our breakthrough contribution: VulnGuard Prompt, a hybrid technique that prevents harmful information generation using real-world data-driven contrastive learning. VulnGuard integrates few-shot examples from curated GitHub repositories, ethical chain-of-thought reasoning, and adaptive role-prompting to create population-specific protective barriers. Our framework employs theoretical multi-objective optimization with formal proofs demonstrating 25-30% analytical harm reduction through entropy bounds and Pareto optimality. PromptGuard orchestrates six core modules: Input Classification, VulnGuard Prompting, Ethical Principles Integration, External Tool Interaction, Output Validation, and User-System Interaction, creating an intelligent expert system for real-time harm prevention. We provide comprehensive mathematical formalization including convergence proofs, vulnerability analysis using information theory, and theoretical validation framework using GitHub-sourced datasets, establishing mathematical foundations for systematic empirical research.
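Abstract (reference translation): The proliferation of Large Language Models (LLMs) in real-world applications poses unprecedented risks of generating harmful, biased, or misleading information for vulnerable populations, including LGBTQ+ individuals, single parents, and marginalized communities. Existing safety approaches rely on post-hoc filtering or generic alignment techniques, but they cannot proactively prevent harmful outputs at the generation source. VulnGuard Prompt is a hybrid technique that prevents harmful information generation using real-world data-driven contrastive learning. VulnGuard integrates few-shot examples from curated GitHub repositories, ethical chain-of-thought reasoning, and adaptive role-prompting to create population-specific protective barriers. The framework employs theoretical multi-objective optimization, with formal proofs demonstrating a 25-30% analytical harm reduction through entropy bounds and Pareto optimality. PromptGuard orchestrates six core modules (Input Classification, VulnGuard Prompting, Ethical Principles Integration, External Tool Interaction, Output Validation, and User-System Interaction), creating an intelligent expert system for real-time harm prevention. We provide a comprehensive mathematical formalization, including convergence proofs, vulnerability analysis using information theory, and a theoretical validation framework using GitHub-sourced datasets, establishing mathematical foundations for systematic empirical research.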

Related papers