Gamma Boltzmann Machine for Simultaneously Modeling Linear- and
Log-amplitude Spectra
- URL: http://arxiv.org/abs/2006.13590v2
- Date: Thu, 25 Jun 2020 11:35:49 GMT
- Title: Gamma Boltzmann Machine for Simultaneously Modeling Linear- and
Log-amplitude Spectra
- Authors: Toru Nakashika and Kohei Yatabe
- Abstract summary: The gamma-Bernoulli RBM simultaneously handles both linear- and log-amplitude spectrograms.
It can also treat amplitude in the logarithmic scale which is important for audio signals from the perceptual point of view.
- Score: 43.95163625695819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In audio applications, one of the most important representations of audio
signals is the amplitude spectrogram. It is utilized in many
machine-learning-based information processing methods including the ones using
the restricted Boltzmann machines (RBM). However, the ordinary
Gaussian-Bernoulli RBM (the most popular RBM among its variations) cannot
directly handle amplitude spectra because the Gaussian distribution is a
symmetric model allowing negative values which never appear in the amplitude.
In this paper, after proposing a general gamma Boltzmann machine, we propose a
practical model called the gamma-Bernoulli RBM that simultaneously handles both
linear- and log-amplitude spectrograms. Its conditional distribution of the
observable data is given by the gamma distribution, and thus the proposed RBM
can naturally handle the data represented by positive numbers as the amplitude
spectra. It can also treat amplitude in the logarithmic scale which is
important for audio signals from the perceptual point of view. The advantage of
the proposed model compared to the ordinary Gaussian-Bernoulli RBM was
confirmed by PESQ and MSE in the experiment of representing the amplitude
spectrograms of speech signals.
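The abstract's two distributional points can be illustrated with a small sketch: a Gaussian places probability on negative values, while a gamma density is defined only for positive values. The sketch is not the paper's gamma-Bernoulli RBM. It uses only the textbook gamma density, and the function names and parameter values are assumptions chosen for illustration.

```python
import math

def gaussian_neg_mass(mu, sigma):
    """Probability that N(mu, sigma^2) assigns to x < 0 (impossible for amplitudes)."""
    return 0.5 * (1.0 + math.erf((0.0 - mu) / (sigma * math.sqrt(2.0))))

def gamma_log_pdf(x, alpha, beta):
    """Log-density of Gamma(shape=alpha, rate=beta); zero density for x <= 0."""
    if x <= 0:
        return float("-inf")
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1.0) * math.log(x) - beta * x)

def gamma_log_pdf_expfam(x, alpha, beta):
    """Same density for x > 0, written in exponential-family form.

    Natural parameters (alpha - 1, -beta) multiply the statistics (log x, x),
    so the log-density is linear in both the log value and the linear value.
    """
    eta_log, eta_lin = alpha - 1.0, -beta
    log_normalizer = math.lgamma(alpha) - alpha * math.log(beta)
    return eta_log * math.log(x) + eta_lin * x - log_normalizer

# Illustrative (assumed) values: a Gaussian centred at amplitude 1.0 still
# leaks mass below zero, while the gamma density excludes negative inputs.
print(gaussian_neg_mass(1.0, 1.0))
print(gamma_log_pdf(-0.1, 2.0, 3.0))
print(gamma_log_pdf(0.5, 2.0, 3.0), gamma_log_pdf_expfam(0.5, 2.0, 3.0))
```

Written this way, the gamma log-density depends on both x and log x, which is one way to see how a gamma-based conditional distribution can involve linear and log amplitude together.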
Related papers
- Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space [72.52365911990935]
We introduce Bellman Diffusion, a novel DGM framework that maintains linearity in MDPs through gradient and scalar field modeling.
Our results show that Bellman Diffusion achieves accurate field estimations and is a capable image generator, converging 1.5x faster than the traditional histogram-based baseline in distributional RL tasks.
arXiv Detail & Related papers (2024-10-02T17:53:23Z)
- Synthetic Wave-Geometric Impulse Responses for Improved Speech Dereverberation [69.1351513309953]
We show that accurately simulating the low-frequency components of Room Impulse Responses (RIRs) is important to achieving good dereverberation.
We demonstrate that speech dereverberation models trained on hybrid synthetic RIRs outperform models trained on RIRs generated by prior geometric ray tracing methods.
arXiv Detail & Related papers (2022-12-10T20:15:23Z)
- Multimodal Exponentially Modified Gaussian Oscillators [4.233733499457509]
This study presents a three-stage Multimodal Exponentially Modified Gaussian (MEMG) model with an optional oscillating term.
With this, synthetic ultrasound signals suffering from artifacts can be fully recovered.
Real data experimentation is carried out to demonstrate the classification capability of the acquired features.
arXiv Detail & Related papers (2022-09-25T11:48:09Z)
- Decision Forest Based EMG Signal Classification with Low Volume Dataset Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience.
We appeal to a set of more elementary methods such as the use of random bounds on a signal, but desire to show the power these methods can carry in an online setting.
arXiv Detail & Related papers (2022-06-29T23:22:18Z)
- Scale Dependencies and Self-Similar Models with Wavelet Scattering Spectra [1.5866079116942815]
A complex wavelet transform computes signal variations at each scale.
Dependencies across scales are captured by the joint correlation across time and scales of wavelet coefficients.
We show that this vector of moments characterizes a wide range of non-Gaussian properties of multi-scale processes.
arXiv Detail & Related papers (2022-04-19T22:31:13Z)
- Spacing Statistics of Energy Spectra: Random Matrices, Black Hole Thermalization, and Echoes [0.0]
Recent advances in AdS/CFT holography have suggested that the near-horizon dynamics of black holes can be described by random matrix systems.
We study how the energy spectrum of a system affects its early and late time thermalization behaviour using the spectral form factor.
arXiv Detail & Related papers (2021-10-07T05:27:02Z)
- A Random Matrix Perspective on Random Tensors [40.89521598604993]
We study the spectra of random matrices arising from contractions of a given random tensor.
Our technique yields a hitherto unknown characterization of the local maximum of the ML problem.
Our approach is versatile and can be extended to other models, such as asymmetric, non-Gaussian and higher-order ones.
arXiv Detail & Related papers (2021-08-02T10:42:22Z)
- Learning Energy-Based Models by Diffusion Recovery Likelihood [61.069760183331745]
We present a diffusion recovery likelihood method to tractably learn and sample from a sequence of energy-based models.
After training, synthesized images can be generated by the sampling process that initializes from Gaussian white noise distribution.
On unconditional CIFAR-10 our method achieves FID 9.58 and inception score 8.30, superior to the majority of GANs.
arXiv Detail & Related papers (2020-12-15T07:09:02Z)
- Hyperspectral Image Denoising with Partially Orthogonal Matrix Vector Tensor Factorization [42.56231647066719]
Hyperspectral image (HSI) has some advantages over natural image for various applications due to the extra spectral information.
During the acquisition, it is often contaminated by severe noises including Gaussian noise, impulse noise, deadlines, and stripes.
We present a HSI restoration method named smooth and robust low rank tensor recovery.
arXiv Detail & Related papers (2020-06-29T02:10:07Z)
- Modal Regression based Structured Low-rank Matrix Recovery for Multi-view Learning [70.57193072829288]
Low-rank Multi-view Subspace Learning has shown great potential in cross-view classification in recent years.
Existing LMvSL based methods are incapable of well handling view discrepancy and discriminancy simultaneously.
We propose Structured Low-rank Matrix Recovery (SLMR), a unique method of effectively removing view discrepancy and improving discriminancy.
arXiv Detail & Related papers (2020-03-22T03:57:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.