Fugu-MT Paper Translation (Abstract): AudioGuard: Toward Comprehensive Audio Safety Protection Across Diverse Threat Models

Paper overview: AudioGuard: Toward Comprehensive Audio Safety Protection Across Diverse Threat Models

arxiv url: http://arxiv.org/abs/2604.08867v1
Date: Fri, 10 Apr 2026 02:02:12 GMT
Status: Translation complete
Last updated in system: 2026-04-13 17:57:53.634529
Title: AudioGuard: Toward Comprehensive Audio Safety Protection Across Diverse Threat Models
Title (reference translation): AudioGuard: Comprehensive Audio Safety Measures Targeting Diverse Threat Models
Authors: Mintong Kang, Chen Fang, Bo Li
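Abstract summary: Real-world risks can hinge on audio-native harmful sound events, speaker attributes, and impersonation/voice-cloning misuse. AudioGuard is a unified guardrail consisting of 1) SoundGuard for waveform-level audio-native detection and 2) ContentGuard for policy-grounded semantic protection.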
Reference score (independently computed attention score): 17.541986184072773
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Audio has rapidly become a primary interface for foundation models, powering real-time voice assistants. Ensuring safety in audio systems is inherently more complex than just "unsafe text spoken aloud": real-world risks can hinge on audio-native harmful sound events, speaker attributes (e.g., child voice), impersonation/voice-cloning misuse, and voice-content compositional harms, such as child voice plus sexual content. The nature of audio makes it challenging to develop comprehensive benchmarks or guardrails against this unique risk landscape. To close this gap, we conduct large-scale red teaming on audio systems, systematically uncover vulnerabilities in audio, and develop a comprehensive, policy-grounded audio risk taxonomy and AudioSafetyBench, the first policy-based audio safety benchmark across diverse threat models. AudioSafetyBench supports diverse languages, suspicious voices (e.g., celebrity/impersonation and child voice), risky voice-content combinations, and non-speech sound events. To defend against these threats, we propose AudioGuard, a unified guardrail consisting of 1) SoundGuard for waveform-level audio-native detection and 2) ContentGuard for policy-grounded semantic protection. Extensive experiments on AudioSafetyBench and four complementary benchmarks show that AudioGuard consistently improves guardrail accuracy over strong audio-LLM-based baselines with substantially lower latency.
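Abstract (reference translation): Audio has become a primary interface for foundation models, powering real-time voice assistants. Real-world risks can hinge on audio-native harmful sound events, speaker attributes (e.g., child voice), impersonation/voice-cloning misuse, and voice-content compositional harms such as child voice plus sexual content. The nature of audio makes it difficult to develop comprehensive benchmarks or guardrails against this unique risk landscape. To close this gap, we conduct large-scale red teaming on audio systems, systematically uncover vulnerabilities in audio, and develop a comprehensive, policy-grounded audio risk taxonomy along with AudioSafetyBench, the first policy-based audio safety benchmark across diverse threat models. AudioSafetyBench supports diverse languages, suspicious voices (e.g., celebrity/impersonation and child voice), risky voice-content combinations, and non-speech sound events. To defend against these threats, we propose AudioGuard, a unified guardrail consisting of 1) SoundGuard for waveform-level audio-native detection and 2) ContentGuard for policy-grounded semantic protection. Extensive experiments on AudioSafetyBench and four complementary benchmarks show that AudioGuard improves guardrail accuracy over strong audio-LLM-based baselines with substantially lower latency.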


Related papers