Neural Network-based Virtual Microphone Estimator
- URL: http://arxiv.org/abs/2101.04315v1
- Date: Tue, 12 Jan 2021 06:30:24 GMT
- Title: Neural Network-based Virtual Microphone Estimator
- Authors: Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita,
Keisuke Kinoshita, Shoko Araki
- Abstract summary: We propose a neural network-based virtual microphone estimator (NN-VME).
The NN-VME estimates virtual microphone signals directly in the time domain, exploiting the precise estimation capability of recent time-domain neural networks.
Experiments on the CHiME-4 corpus show that the proposed NN-VME achieves high virtual microphone estimation performance even for real recordings.
- Score: 111.79608275698274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing microphone array technologies for a small number of microphones is
important due to the constraints of many devices. One direction to address this
situation consists of virtually augmenting the number of microphone signals,
e.g., based on several physical model assumptions. However, such assumptions
are not necessarily met in realistic conditions. In this paper, as an
alternative approach, we propose a neural network-based virtual microphone
estimator (NN-VME). The NN-VME estimates virtual microphone signals directly in
the time domain, by utilizing the precise estimation capability of the recent
time-domain neural networks. We adopt a fully supervised learning framework
that uses actual observations at the locations of the virtual microphones at
training time. Consequently, the NN-VME can be trained using only multi-channel
observations and thus directly on real recordings, avoiding the need for
unrealistic physical model-based assumptions. Experiments on the CHiME-4 corpus
show that the proposed NN-VME achieves high virtual microphone estimation
performance even for real recordings and that a beamformer augmented with the
NN-VME improves both the speech enhancement and recognition performance.
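The supervised setup described in the abstract, in which a channel that was actually recorded at the virtual microphone position serves as the training target, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`split_channels`, `snr_db`, `average_baseline`), the three-channel random signal, and the negative-SNR objective are all assumptions made for the example, and the channel average is only a placeholder for the time-domain network.

```python
import numpy as np


def split_channels(recording, virtual_idx):
    """Split a (channels, samples) recording into inputs and a target.

    The channel at `virtual_idx` plays the role of the virtual microphone:
    it is withheld from the input and used only as the training target.
    """
    target = recording[virtual_idx]
    inputs = np.delete(recording, virtual_idx, axis=0)
    return inputs, target


def snr_db(target, estimate, eps=1e-8):
    """Time-domain signal-to-noise ratio in dB between target and estimate."""
    error = target - estimate
    return 10.0 * np.log10((np.sum(target**2) + eps) / (np.sum(error**2) + eps))


def average_baseline(inputs):
    """Hypothetical stand-in for the network: average the observed channels."""
    return inputs.mean(axis=0)


# Toy multi-channel "recording": 3 microphones, 1 second at 16 kHz (assumed).
rng = np.random.default_rng(0)
recording = rng.standard_normal((3, 16000))

# Treat the middle microphone as the virtual one.
inputs, target = split_channels(recording, virtual_idx=1)
estimate = average_baseline(inputs)

# A training loop would minimise this value with respect to the model weights.
loss = -snr_db(target, estimate)
```

Because the target is just another recorded channel, such a setup needs no clean reference signal or simulated room model, which matches the abstract's point that training is possible directly on real multi-channel recordings.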
Related papers
- Neuromorphic Keyword Spotting with Pulse Density Modulation MEMS Microphones [0.25782420501870285]
The keyword spotting task involves continuously monitoring an audio stream to detect predefined words.
Neuromorphic devices effectively address this energy challenge.
We propose a direct microphone-to-SNN connection.
The system achieved an accuracy of 91.54% on the Google Speech Commands dataset.
arXiv Detail & Related papers (2024-08-09T16:27:51Z)
- A Real-Time Voice Activity Detection Based On Lightweight Neural [4.589472292598182]
Voice activity detection (VAD) is the task of detecting speech in an audio stream.
Recent neural network-based VADs have alleviated the degradation of performance to some extent.
We propose a lightweight, real-time neural network called MagicNet, which uses causal and depthwise separable 1-D convolutions and a GRU.
arXiv Detail & Related papers (2024-05-27T03:31:16Z)
- A Novel Micro-Doppler Coherence Loss for Deep Learning Radar Applications [1.099532646524593]
This paper introduces a micro-Doppler coherence loss, minimized when the normalized power of micro-Doppler oscillatory components between input and output is matched.
Experiments conducted on real data show that the application of the introduced loss results in models more resilient to noise.
arXiv Detail & Related papers (2024-04-12T08:11:07Z)
- sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks [51.516451451719654]
Spiking Neural Networks (SNNs) are known to be biologically plausible and power-efficient.
This paper introduces a novel SNN-based Voice Activity Detection model, referred to as sVAD.
It provides effective auditory feature representation through SincNet and 1D convolution, and improves noise robustness with attention mechanisms.
arXiv Detail & Related papers (2024-03-09T02:55:44Z)
- Signal Detection in MIMO Systems with Hardware Imperfections: Message Passing on Neural Networks [101.59367762974371]
In this paper, we investigate signal detection in multiple-input-multiple-output (MIMO) communication systems with hardware impairments.
It is difficult to train a deep neural network (DNN) with limited pilot signals, hindering its practical applications.
We design an efficient message passing based Bayesian signal detector, leveraging the unitary approximate message passing (UAMP) algorithm.
arXiv Detail & Related papers (2022-10-08T04:32:58Z)
- Scene-Agnostic Multi-Microphone Speech Dereverberation [47.735158037490834]
We present an NN architecture that can cope with microphone arrays whose number and positions are unknown.
Our approach harnesses recent advances in deep learning on set-structured data to design an architecture that enhances the reverberant log-spectrum.
arXiv Detail & Related papers (2020-10-22T17:13:12Z)
- Multi-Tones' Phase Coding (MTPC) of Interaural Time Difference by Spiking Neural Network [68.43026108936029]
We propose a pure spiking neural network (SNN) based computational model for precise sound localization in the noisy real-world environment.
We implement this algorithm in a real-time robotic system with a microphone array.
The experimental results show a mean azimuth error of 13 degrees, which surpasses the accuracy of the other biologically plausible neuromorphic approach for sound source localization.
arXiv Detail & Related papers (2020-07-07T08:22:56Z)
- Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances [53.063441357826484]
Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions.
Speaker verification on short utterances in uncontrolled noisy environment conditions is one of the most challenging and highly demanded tasks.
This paper presents approaches aimed at two goals: a) improving the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and b) reducing the system quality degradation for short utterances.
arXiv Detail & Related papers (2020-02-14T13:34:33Z)