Fugu-MT Paper Translation (Abstract): Automatic Contextual Audio Denoising

Paper overview: Automatic Contextual Audio Denoising

arxiv url: http://arxiv.org/abs/2605.22262v1
Date: Thu, 21 May 2026 10:06:34 GMT
Status: Translation complete
Last updated in system: 2026-05-22 20:14:18.549948
Title: Automatic Contextual Audio Denoising
Title (reference translation): Automatic denoising of environmental sounds
Authors: Diep Luong, Konstantinos Drossos, Mikko Heikkinen, Tuomas Virtanen
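Abstract summary: Audio context determines which sound components and sources are relevant and which listeners can regard as irrelevant (noise). Most current audio denoising systems apply fixed target-noise definitions, often removing components that are useful in one context while failing to suppress irrelevant ones. The proposed approach introduces the concept of automatic contextual audio denoising (ACAD), which defines target and noise based on the inferred context.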
Reference score (independently computed attention score): 10.668322881347068
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Audio context determines which sound components and sources are relevant and which can be perceived as irrelevant (noise) by listeners. For example, traffic noise is informative in urban surveillance but is noise for a phone call at the same location. Most current audio denoising systems apply fixed target-noise definitions, often removing useful components in one context while failing to suppress irrelevant components. To address this, we introduce the concept of automatic contextual audio denoising (ACAD), which defines target and noise based on the inferred context. In this work, we restrict context to be associated with an acoustic scene class. We label sound events outside the event distribution of a scene class (noise) as out-of-context (OC) and events typical for that scene as in-context (IC). We implement a deep learning method that automatically infers the context of the audio signal and removes OC components, and benchmark it against variants: without context inference, with oracle context, and with separately provided uninformative context. On paired clean/noisy data across diverse contexts, where OC components in one context may be IC in another, our proposed method outperforms other approaches across standard objective metrics, indicating that the model can infer context and that context-dependent processing can enhance denoising.
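Abstract (reference translation): Audio context determines which sound components and sources are relevant and which listeners can regard as irrelevant (noise). For example, traffic noise is informative in urban surveillance but is noise for a phone call at the same location. Most current audio denoising systems apply fixed target-noise definitions, often removing components that are useful in one context while failing to suppress irrelevant ones. This paper therefore introduces the concept of automatic contextual audio denoising (ACAD), which defines target and noise based on the inferred context. In this work, context is restricted to being associated with an acoustic scene class. Sound events outside the event distribution of a scene class (noise) are labeled out-of-context (OC), and events typical of that scene are labeled in-context (IC). The authors implement a deep learning method that automatically infers the context of the audio signal and removes OC components, and benchmark it against variants without context inference, with oracle context, and with separately provided uninformative context. On paired clean/noisy data across diverse contexts, where OC components in one context may be IC in another, the proposed method outperforms the other approaches across standard objective metrics, indicating that the model can infer context and that context-dependent processing can enhance denoising.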
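
The in-context / out-of-context labeling described in the abstract can be illustrated with a toy sketch. This is only an illustration of the definition, not the authors' method: the scene names, the event names, and the `SCENE_EVENTS` table are all hypothetical, and the paper's system is a deep learning model that infers the scene from audio rather than looking it up.

```python
# Hypothetical mapping from acoustic scene class to the events typical of it.
# These scenes and events are invented for illustration; the abstract does not
# list the actual classes used in the paper.
SCENE_EVENTS = {
    "urban_surveillance": {"traffic", "siren", "footsteps"},
    "phone_call": {"speech"},
}


def label_events(scene, events):
    """Label each event as in-context (IC) or out-of-context (OC) for a scene."""
    typical = SCENE_EVENTS[scene]
    return {event: ("IC" if event in typical else "OC") for event in events}


def keep_in_context(scene, events):
    """Return only the events that are IC for the scene, preserving order."""
    typical = SCENE_EVENTS[scene]
    return [event for event in events if event in typical]


if __name__ == "__main__":
    mixture = ["traffic", "speech"]
    for scene in SCENE_EVENTS:
        print(scene, label_events(scene, mixture), keep_in_context(scene, mixture))
```

The same two-event mixture is labeled differently under each scene, which mirrors the abstract's point that a component treated as noise in one context may be the target in another.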

List of related papers