FFC-SE: Fast Fourier Convolution for Speech Enhancement
- URL: http://arxiv.org/abs/2204.03042v1
- Date: Wed, 6 Apr 2022 18:52:47 GMT
- Title: FFC-SE: Fast Fourier Convolution for Speech Enhancement
- Authors: Ivan Shchekotov, Pavel Andreev, Oleg Ivanov, Aibek Alanov, Dmitry
Vetrov
- Abstract summary: Fast Fourier convolution (FFC) is a recently proposed neural operator that shows promising performance in several computer vision problems.
In this work, we design neural network architectures which adapt FFC for speech enhancement.
We found that neural networks based on FFC outperform analogous convolutional models and achieve results that are better than or comparable to other speech enhancement baselines.
- Score: 1.0499611180329804
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Fast Fourier convolution (FFC) is a recently proposed neural
operator that shows promising performance in several computer vision
problems. The FFC operator allows large-receptive-field operations to be
employed within the early layers of a neural network. It was shown to be
especially helpful for inpainting of periodic structures, which are common
in audio processing. In this work, we design neural network architectures
that adapt FFC for speech enhancement. We hypothesize that a large receptive
field allows these networks to produce more coherent phases than vanilla
convolutional models, and we validate this hypothesis experimentally. We
found that neural networks based on Fast Fourier convolution outperform
analogous convolutional models and achieve results that are better than or
comparable to those of other speech enhancement baselines.
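As a concrete illustration of the operator described above, the following is a minimal PyTorch sketch of a generic FFC block: channels are split into a local branch (ordinary convolution) and a global branch whose spectral transform applies a pointwise convolution in the 2D Fourier domain, so even an early layer has a spectrogram-wide receptive field. The layer sizes, normalization, and channel split ratio here are illustrative assumptions, not the exact FFC-SE configuration.

```python
# Minimal sketch of a Fast Fourier Convolution (FFC) block, not the exact
# FFC-SE architecture: a local conv branch plus a global "spectral" branch
# that convolves pointwise in the 2D Fourier domain.
import torch
import torch.nn as nn


class SpectralTransform(nn.Module):
    """Global branch: rFFT over (H, W) -> 1x1 conv on stacked Re/Im -> irFFT."""

    def __init__(self, channels: int):
        super().__init__()
        # Real and imaginary parts are concatenated along the channel axis,
        # so the pointwise conv sees 2 * channels inputs and outputs.
        self.conv = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
            nn.BatchNorm2d(2 * channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")          # (B, C, H, W//2+1), complex
        spec = torch.cat([spec.real, spec.imag], dim=1)  # (B, 2C, H, W//2+1)
        spec = self.conv(spec)
        real, imag = torch.chunk(spec, 2, dim=1)
        spec = torch.complex(real, imag)
        return torch.fft.irfft2(spec, s=(h, w), norm="ortho")


class FFC(nn.Module):
    """Split channels into local/global parts with cross connections."""

    def __init__(self, channels: int, global_ratio: float = 0.5):
        super().__init__()
        cg = int(channels * global_ratio)  # global (spectral) channels
        cl = channels - cg                 # local channels
        self.cl, self.cg = cl, cg
        self.local_to_local = nn.Conv2d(cl, cl, 3, padding=1)
        self.local_to_global = nn.Conv2d(cl, cg, 3, padding=1)
        self.global_to_local = nn.Conv2d(cg, cl, 3, padding=1)
        self.global_to_global = SpectralTransform(cg)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        xl, xg = torch.split(x, [self.cl, self.cg], dim=1)
        yl = self.local_to_local(xl) + self.global_to_local(xg)
        yg = self.local_to_global(xl) + self.global_to_global(xg)
        return torch.cat([yl, yg], dim=1)


if __name__ == "__main__":
    # A toy "spectrogram" batch: (batch, channels, freq bins, time frames).
    x = torch.randn(2, 32, 128, 64)
    print(FFC(32)(x).shape)  # torch.Size([2, 32, 128, 64])
```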
Related papers
- Adaptive Frequency Filters As Efficient Global Token Mixers [100.27957692579892]
We show that adaptive frequency filters can serve as efficient global token mixers.
We take AFF token mixers as primary neural operators to build a lightweight neural network, dubbed AFFNet (a generic sketch of this frequency-domain mixing pattern appears at the end of this list).
arXiv Detail & Related papers (2023-07-26T07:42:28Z)
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis [1.4277428617774877]
We present Vocos, a new model that directly generates Fourier spectral coefficients.
It substantially improves computational efficiency, achieving an order of magnitude increase in speed compared to prevailing time-domain neural vocoding approaches.
arXiv Detail & Related papers (2023-06-01T15:40:32Z)
- A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree Spectral Bias of Neural Networks [79.28094304325116]
Despite the capacity of neural nets to learn arbitrary functions, models trained through gradient descent often exhibit a bias towards "simpler" functions.
We show how this spectral bias towards low-degree frequencies can in fact hurt the neural network's generalization on real-world datasets.
We propose a new scalable functional regularization scheme that aids the neural network to learn higher degree frequencies.
arXiv Detail & Related papers (2023-05-16T20:06:01Z)
- Properties and Potential Applications of Random Functional-Linked Types of Neural Networks [81.56822938033119]
Random functional-linked neural networks (RFLNNs) offer an alternative way of learning in deep structure.
This paper gives some insights into the properties of RFLNNs from the viewpoint of the frequency domain.
We propose a method to generate a BLS network with better performance, and design an efficient algorithm for solving Poisson's equation.
arXiv Detail & Related papers (2023-04-03T13:25:22Z)
- Polynomial Neural Fields for Subband Decomposition and Manipulation [78.2401411189246]
We propose a new class of neural fields called polynomial neural fields (PNFs).
The key advantage of a PNF is that it can represent a signal as a composition of manipulable and interpretable components without losing the merits of neural fields.
We empirically demonstrate that Fourier PNFs enable signal manipulation applications such as texture transfer and scale-space interpolation.
arXiv Detail & Related papers (2023-02-09T18:59:04Z)
- QFF: Quantized Fourier Features for Neural Field Representations [28.82293263445964]
We show that using Quantized Fourier Features (QFF) can result in smaller model size, faster training, and better quality outputs for several applications.
QFF are easy to code, fast to compute, and serve as a simple drop-in addition to many neural field representations.
arXiv Detail & Related papers (2022-12-02T00:11:22Z)
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness structure in the frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency-domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
- Neural Fourier Shift for Binaural Speech Rendering [16.957415282256758]
We present a neural network for rendering binaural speech from given monaural audio and the position and orientation of the source.
We propose Neural Fourier Shift (NFS), a novel network architecture that enables speech rendering in the Fourier space.
arXiv Detail & Related papers (2022-11-02T04:55:09Z)
- Functional Regularization for Reinforcement Learning via Learned Fourier Features [98.90474131452588]
We propose a simple architecture for deep reinforcement learning by embedding inputs into a learned Fourier basis.
We show that it improves the sample efficiency of both state-based and image-based RL.
arXiv Detail & Related papers (2021-12-06T18:59:52Z)
- Efficient Trainable Front-Ends for Neural Speech Enhancement [22.313111311130665]
We present an efficient, trainable front-end based on the butterfly mechanism to compute the Fast Fourier Transform.
We show its accuracy and efficiency benefits for low-compute neural speech enhancement models.
arXiv Detail & Related papers (2020-02-20T01:51:15Z)
- Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks [23.88788382262305]
The temporal convolutional recurrent network (TCRN) is an end-to-end model that directly maps a noisy waveform to a clean waveform.
We show that our model improves performance compared with existing convolutional recurrent networks.
arXiv Detail & Related papers (2020-02-02T04:26:50Z)
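Several entries above (the adaptive frequency filters used as global token mixers, and the Transform Once blueprint) share a common pattern: transform to the frequency domain, apply a learned pointwise filter, and transform back, which by the convolution theorem amounts to a global circular convolution. The sketch below is a generic, hedged illustration of that pattern, not the exact AFFNet or T1 architecture; the per-channel complex filter parameterization is an assumption.

```python
# Generic frequency-domain global mixing: rFFT along the token/time axis,
# multiply by a learned complex filter, inverse rFFT. Equivalent to a global
# circular convolution, so every position can influence every other one.
import torch
import torch.nn as nn


class GlobalFrequencyFilter(nn.Module):
    def __init__(self, seq_len: int, channels: int):
        super().__init__()
        # One complex filter coefficient per (channel, rFFT bin),
        # stored as separate real and imaginary parts.
        n_bins = seq_len // 2 + 1
        self.filter_real = nn.Parameter(torch.ones(channels, n_bins))
        self.filter_imag = nn.Parameter(torch.zeros(channels, n_bins))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, seq_len)
        n = x.shape[-1]
        spec = torch.fft.rfft(x, dim=-1, norm="ortho")
        spec = spec * torch.complex(self.filter_real, self.filter_imag)
        return torch.fft.irfft(spec, n=n, dim=-1, norm="ortho")


if __name__ == "__main__":
    tokens = torch.randn(4, 64, 256)           # (batch, channels, sequence)
    mixer = GlobalFrequencyFilter(seq_len=256, channels=64)
    print(mixer(tokens).shape)                 # torch.Size([4, 64, 256])
```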
This list is automatically generated from the titles and abstracts of the papers on this site.