Fugu-MT Paper Translation (Summary): Privacy-Preserving Speech Representation Learning using Vector Quantization

Paper summary: Privacy-Preserving Speech Representation Learning using Vector Quantization

arxiv url: http://arxiv.org/abs/2203.09518v1
Date: Tue, 15 Mar 2022 14:01:11 GMT
Status: Translation complete
Last updated in system: 2022-03-27 05:13:10.229356
Title: Privacy-Preserving Speech Representation Learning using Vector Quantization
Title (reference translation): Privacy-Preserving Speech Representation Learning Using Vector Quantization
Authors: Pierre Champion (MULTISPEECH), Denis Jouvet (MULTISPEECH), Anthony Larcher (LIUM)
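Abstract summary: Speech signals contain a lot of sensitive information, such as the speaker's identity, which raises privacy concerns. This paper aims to produce an anonymous representation while preserving speech recognition performance.
Reference score (independently computed attention score): 0.0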
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the popularity of virtual assistants (e.g., Siri, Alexa), the use of speech recognition is becoming more and more widespread. However, speech signals contain a lot of sensitive information, such as the speaker's identity, which raises privacy concerns. The presented experiments show that the representations extracted by the deep layers of speech recognition networks contain speaker information. This paper aims to produce an anonymous representation while preserving speech recognition performance. To this end, we propose to use vector quantization to constrain the representation space and induce the network to suppress the speaker identity. The choice of the quantization dictionary size makes it possible to configure the trade-off between utility (speech recognition) and privacy (speaker identity concealment).
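The quantization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: in the paper the quantizer sits inside a speech recognition network, whereas here the frames are random toy vectors and the dictionary is simply the first few frames. The names `quantize`, `make_toy_data`, and `mean_distortion` are hypothetical and chosen for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames, codebook):
    """Map each frame to its nearest codebook entry.

    Returns (indices, quantized), where quantized[i] is codebook[indices[i]].
    """
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]
    return indices, [codebook[k] for k in indices]


def make_toy_data(seed=0, num_frames=200, dim=4):
    """Random vectors standing in for network representations (assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_frames)]


def mean_distortion(frames, quantized):
    """Average squared error introduced by quantization."""
    return sum(squared_distance(f, q) for f, q in zip(frames, quantized)) / len(frames)


frames = make_toy_data()
for size in (2, 16, 128):
    # Assumption: the dictionary is just the first `size` frames, not a learned one.
    codebook = frames[:size]
    indices, quantized = quantize(frames, codebook)
    print(size, len(set(indices)), mean_distortion(frames, quantized))
```

Each frame is reduced to one of `size` indices, so it can carry at most log2(size) bits. A smaller dictionary leaves less room for speaker-specific detail but also distorts the representation more, which is the utility/privacy trade-off the abstract refers to.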

Related papers

Are disentangled representations all you need to build speaker anonymization systems? [0.0]
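Speech signals contain a lot of sensitive information, such as the speaker's identity. Speaker anonymization aims to transform a speech signal to remove the source speaker's identity while leaving the spoken content unchanged.
Paper reference translation (metadata) (2022-08-22T07:51:47Z)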
Differentially Private Speaker Anonymization [44.90119821614047]
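Sharing real-world speech utterances is key to the training and deployment of voice-based services. Speaker anonymization aims to remove speaker information from an utterance while leaving its linguistic and prosodic attributes intact. However, linguistic and prosodic attributes still contain speaker information.
Paper reference translation (metadata) (2022-02-23T23:20:30Z)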
Protecting gender and identity with disentangled speech representations [49.00162808063399]
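We show that protecting gender information in speech is more effective than modeling speaker-identity information alone. We present a novel way to encode gender information and disentangle two sensitive biometric identifiers.
Paper reference translation (metadata) (2021-04-22T13:31:41Z)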
Streaming Multi-talker Speech Recognition with Joint Speaker Identification [77.46617674133556]
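SURIT employs the recurrent neural network transducer (RNN-T) as the backbone for both speech recognition and speaker identification. We validate the idea on LibrispeechMix, a multi-talker dataset derived from Librispeech, and present encouraging results.
Paper reference translation (metadata) (2021-04-05T18:37:33Z)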
Voice Privacy with Smart Digital Assistants in Educational Settings [1.8369974607582578]
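We design and evaluate a practical and efficient framework for voice privacy at the source. The approach combines speaker identification (SID) and speech conversion methods to randomly disguise the user's identity on the device that records the speech. We evaluate the ASR performance of the conversion in terms of word error rate and show the promise of this framework in preserving the content of the input speech.
Paper reference translation (metadata) (2021-03-24T19:58:45Z)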
High Fidelity Speech Regeneration with Application to Speech Enhancement [96.34618212590301]
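We propose a wav-to-wav generative model for speech that can generate 24 kHz speech in real time. Inspired by voice conversion methods, we train it to augment the speech characteristics while preserving the identity of the source.
Paper reference translation (metadata) (2021-01-31T10:54:27Z)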
Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation [17.344080729609026]
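We introduce the concept of attribute-driven privacy preservation in speaker voice representation. It allows a person to hide one or more personal aspects from a malicious interceptor or from the application provider. We propose an adversarial autoencoding method that disentangles a given speaker attribute in the voice representation, thereby allowing its concealment.
Paper reference translation (metadata) (2020-12-08T14:47:23Z)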
Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
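We propose a speaker de-identification system based on adversarial training and autoencoders. Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
Paper reference translation (metadata) (2020-11-09T19:22:05Z)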
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers [38.3469744871394]
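We propose an end-to-end speaker-attributed automatic speech recognition model. It unifies speaker counting, speech recognition, and speaker identification on overlapped speech.
Paper reference translation (metadata) (2020-06-19T02:05:18Z)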
Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention [70.82604384963679]
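This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features. The speaker representation used for adaptation is extracted directly from the test utterance.
Paper reference translation (metadata) (2020-02-14T05:05:36Z)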

The related papers list is generated automatically from the titles and abstracts of the papers on this site.

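This page shows information on the specified paper.
The operator of this site does not guarantee the quality of the site (including all information and translations) and accepts no responsibility for any results arising from its use.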