MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware
Beamforming Network for Speech Separation
- URL: http://arxiv.org/abs/2212.03401v1
- Date: Wed, 7 Dec 2022 01:52:40 GMT
- Authors: Yanjie Fu, Haoran Yin, Meng Ge, Longbiao Wang, Gaoyan Zhang, Jianwu
Dang, Chengyun Deng, Fei Wang
- Abstract summary: We propose an end-to-end beamforming network for direction-guided speech separation given only the mixture signal.
Specifically, we design a multi-channel input and multiple outputs architecture to predict direction-of-arrival (DOA) based embeddings and beamforming weights for each source.
- Score: 55.533789120204055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, many deep-learning-based beamformers have been proposed for
multi-channel speech separation. Nevertheless, most of them rely on extra cues
known in advance, such as speaker features, face images, or directional
information. In this paper, we propose MIMO-DBnet, an end-to-end beamforming
network for direction-guided speech separation given only the mixture signal.
Specifically, we design a multi-channel input and multiple outputs
architecture to predict direction-of-arrival (DOA) based embeddings and
beamforming weights for each source. The precisely estimated directional
embedding provides effective spatial discrimination guidance for the
neural beamformer to offset the effect of phase wrapping, thus allowing more
accurate reconstruction of two sources' speech signals. Experiments show that
the proposed MIMO-DBnet not only achieves a decent improvement across the
board over the baseline systems, but also maintains its performance on
high-frequency bands, where phase wrapping occurs.
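The abstract refers to phase wrapping on high-frequency bands without defining it. The sketch below illustrates the general effect only; it is not the paper's method. It assumes a hypothetical two-microphone array with 0.1 m spacing, a far-field plane wave, a speed of sound of 343 m/s, and a DOA measured from the array axis. The function names `ipd` and `aliasing_frequency` are made up for this illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def ipd(freq_hz, doa_deg, mic_spacing_m):
    """Return (true, observed) inter-channel phase difference in radians.

    Assumes a far-field plane wave hitting two microphones, with the DOA
    measured from the array axis (0 degrees = endfire).
    """
    delay = mic_spacing_m * np.cos(np.deg2rad(doa_deg)) / SPEED_OF_SOUND
    true_phase = 2.0 * np.pi * freq_hz * delay
    # A measured phase is only known modulo 2*pi, so it lies in (-pi, pi].
    observed_phase = np.angle(np.exp(1j * true_phase))
    return true_phase, observed_phase


def aliasing_frequency(mic_spacing_m):
    """Frequency above which the phase difference can exceed pi and wrap."""
    return SPEED_OF_SOUND / (2.0 * mic_spacing_m)


if __name__ == "__main__":
    spacing = 0.1  # hypothetical microphone spacing in metres
    print("wrapping can start above", aliasing_frequency(spacing), "Hz")
    for freq in (500.0, 4000.0):
        true_phase, observed_phase = ipd(freq, 0.0, spacing)
        print(freq, "Hz: true", true_phase, "rad, observed", observed_phase, "rad")
```

Below the aliasing frequency the observed phase equals the true phase, so the phase difference maps to a single direction. Above it, the observed phase differs from the true one by a multiple of 2*pi, so several directions become indistinguishable from phase alone. This is the ambiguity that, according to the abstract, the directional embedding helps the beamformer offset.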
Related papers
- A unified multichannel far-field speech recognition system: combining
neural beamforming with attention based end-to-end model [14.795953417531907]
We propose a unified multichannel far-field speech recognition system that combines neural beamforming with a transformer-based Listen, Attend and Spell (LAS) speech recognition system.
The proposed method achieves a 19.26% improvement compared with a strong baseline.
arXiv Detail & Related papers (2024-01-05T07:11:13Z)
- On Neural Architectures for Deep Learning-based Source Separation of
Co-Channel OFDM Signals [104.11663769306566]
We study the single-channel source separation problem involving orthogonal frequency-division multiplexing (OFDM) signals.
We propose critical domain-informed modifications to the network parameterization, based on insights from OFDM structures.
arXiv Detail & Related papers (2023-03-11T16:29:13Z)
- Towards Efficient Subarray Hybrid Beamforming: Attention Network-based
Practical Feedback in FDD Massive MU-MIMO Systems [9.320559153486885]
This paper introduces a jointly optimized network for channel estimation and feedback.
Experiments show that the proposed network is over 10 times lighter at the resource-sensitive user equipment.
arXiv Detail & Related papers (2023-02-05T15:12:07Z)
- Multi-Channel End-to-End Neural Diarization with Distributed Microphones [53.99406868339701]
We replace Transformer encoders in EEND with two types of encoders that process a multi-channel input.
We also propose a model adaptation method using only single-channel recordings.
arXiv Detail & Related papers (2021-10-10T03:24:03Z)
- Neural Calibration for Scalable Beamforming in FDD Massive MIMO with
Implicit Channel Estimation [10.775558382613077]
Channel estimation and beamforming play critical roles in frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems.
We propose a deep learning-based approach that directly optimizes the beamformers at the base station according to the received uplink pilots.
A neural calibration method is proposed to improve the scalability of the end-to-end design.
arXiv Detail & Related papers (2021-08-03T14:26:14Z)
- Model-Driven Deep Learning Based Channel Estimation and Feedback for
Millimeter-Wave Massive Hybrid MIMO Systems [61.78590389147475]
This paper proposes a model-driven deep learning (MDDL)-based channel estimation and feedback scheme for millimeter-wave (mmWave) systems.
To reduce the uplink pilot overhead for estimating the high-dimensional channels from a limited number of radio frequency (RF) chains, we propose to jointly train the phase shift network and the channel estimator as an auto-encoder.
Numerical results show that the proposed MDDL-based channel estimation and feedback scheme outperforms the state-of-the-art approaches.
arXiv Detail & Related papers (2021-04-22T13:34:53Z)
- Deep Learning-based Compressive Beam Alignment in mmWave Vehicular
Systems [75.77033270838926]
Vehicular channels exhibit structure that can be exploited for beam alignment with fewer channel measurements.
We propose a deep learning-based technique to design a structured compressed sensing (CS) matrix.
arXiv Detail & Related papers (2021-02-27T04:38:12Z)
- DBNET: DOA-driven beamforming network for end-to-end farfield sound
source separation [20.200763595732912]
We propose a direction-of-arrival-driven beamforming network (DBnet) for end-to-end source separation.
We also propose end-to-end extensions of DBnet which incorporate post masking networks.
The experimental results show that the proposed extended DBnet using a convolutional-recurrent post masking network outperforms state-of-the-art source separation methods.
arXiv Detail & Related papers (2020-10-22T09:52:05Z)
- Deep Denoising Neural Network Assisted Compressive Channel Estimation
for mmWave Intelligent Reflecting Surfaces [99.34306447202546]
This paper proposes a deep denoising neural network assisted compressive channel estimation for mmWave IRS systems.
We first introduce a hybrid passive/active IRS architecture, where very few receive chains are employed to estimate the uplink user-to-IRS channels.
The complete channel matrix can be reconstructed from the limited measurements based on compressive sensing.
arXiv Detail & Related papers (2020-06-03T12:18:57Z)