Wavelet-Filtering of Symbolic Music Representations for Folk Tune Segmentation and Classification
- URL: http://arxiv.org/abs/2504.20522v1
- Date: Tue, 29 Apr 2025 08:02:37 GMT
- Title: Wavelet-Filtering of Symbolic Music Representations for Folk Tune Segmentation and Classification
- Authors: Gissel Velarde, Tillman Weyde, David Meredith
- Abstract summary: The aim of this study is to evaluate a machine-learning method in which symbolic representations of folk songs are segmented and classified into tune families with Haar-wavelet filtering. We apply the continuous wavelet transform (CWT) with the Haar wavelet at specific scales, obtaining filtered versions of melodies emphasizing their information at particular time-scales. We found that the wavelet-based segmentation and wavelet-filtering of the pitch signal lead to better classification accuracy in cross-validated evaluation when the time-scale and other parameters are optimized.
- Score: 2.4774640776820105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The aim of this study is to evaluate a machine-learning method in which symbolic representations of folk songs are segmented and classified into tune families with Haar-wavelet filtering. The method is compared with a previously proposed Gestalt-based method. Melodies are represented as discrete symbolic pitch-time signals. We apply the continuous wavelet transform (CWT) with the Haar wavelet at specific scales, obtaining filtered versions of melodies emphasizing their information at particular time-scales. We use the filtered signal for representation and segmentation, taking the local maxima of the wavelet coefficients as local boundaries. Segments are classified with k-nearest neighbours using standard vector metrics (Euclidean, cityblock), and the results are compared with a Gestalt-based segmentation method and with metrics applied directly to the pitch signal. We found that the wavelet-based segmentation and wavelet-filtering of the pitch signal lead to better classification accuracy in cross-validated evaluation when the time-scale and other parameters are optimized.
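The pipeline described in the abstract (Haar-wavelet filtering of a pitch signal, boundaries at local maxima of the coefficients, k-nearest-neighbour classification with Euclidean or cityblock distance) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `haar_filter`, `local_maxima`, and `knn_predict`, the edge padding, the scale of 4 samples, and the toy melody are all assumptions made for illustration.

```python
import numpy as np


def haar_filter(pitch, scale):
    """Filter a sampled pitch signal with a Haar wavelet spanning `scale` samples."""
    if scale < 2 or scale % 2:
        raise ValueError("scale must be an even integer >= 2")
    half = scale // 2
    # Haar wavelet: +1 on the first half, -1 on the second, scaled by 1/sqrt(scale).
    wavelet = np.concatenate([np.ones(half), -np.ones(half)]) / np.sqrt(scale)
    # Assumption: repeat the edge values so the borders do not create spurious peaks.
    padded = np.pad(np.asarray(pitch, dtype=float), (half, half - 1), mode="edge")
    # Convolving with the reversed wavelet is a correlation; output has len(pitch) samples.
    return np.convolve(padded, wavelet[::-1], mode="valid")


def local_maxima(coeffs):
    """Return indices whose coefficient is strictly larger than both neighbours."""
    c = np.asarray(coeffs)
    inner = (c[1:-1] > c[:-2]) & (c[1:-1] > c[2:])
    return np.flatnonzero(inner) + 1


def knn_predict(train_vectors, train_labels, query, k=1, metric="euclidean"):
    """Majority label among the k nearest equal-length training vectors."""
    diff = np.asarray(train_vectors, dtype=float) - np.asarray(query, dtype=float)
    if metric == "euclidean":
        dist = np.sqrt((diff ** 2).sum(axis=1))
    elif metric == "cityblock":
        dist = np.abs(diff).sum(axis=1)
    else:
        raise ValueError("metric must be 'euclidean' or 'cityblock'")
    nearest = np.argsort(dist, kind="stable")[:k]
    labels, counts = np.unique(np.asarray(train_labels)[nearest], return_counts=True)
    return labels[np.argmax(counts)]


# Toy melody (hypothetical): MIDI pitch sampled at a fixed rate, one octave up and back.
pitch = np.array([60] * 8 + [72] * 8 + [60] * 8)
coeffs = haar_filter(pitch, scale=4)
boundaries = local_maxima(coeffs)
segments = np.split(coeffs, boundaries)
```

With this sign convention a falling pitch step produces a local maximum and a rising step a local minimum, so the sketch only places a boundary at the descent. `knn_predict` assumes the segments have already been brought to a common length, which the sketch does not do.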
Related papers
- An approach to melodic segmentation and classification based on filtering with the Haar-wavelet [2.4774640776820105]
We present a novel method of classification and segmentation of melodies in symbolic representation. The method is based on filtering pitch as a signal over time with the Haar-wavelet. When used to classify 360 Dutch folk tunes into 26 tune families, the performance of the method is comparable to the use of pitch signals.
arXiv Detail & Related papers (2025-04-29T14:41:03Z) - Review of wavelet-based unsupervised texture segmentation, advantage of adaptive wavelets [8.144703798082293]
We show that the adaptability of the empirical wavelet makes it possible to reach better results than classic wavelets.
The proposed method is tested on six classic benchmarks, based on several popular texture images.
arXiv Detail & Related papers (2024-10-24T22:48:28Z) - Decision Forest Based EMG Signal Classification with Low Volume Dataset
Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience.
We appeal to a set of more elementary methods such as the use of random bounds on a signal, but desire to show the power these methods can carry in an online setting.
arXiv Detail & Related papers (2022-06-29T23:22:18Z) - Music Enhancement via Image Translation and Vocoding [14.356705444361832]
This paper presents a deep learning approach to enhance low-quality music recordings.
We combine an image-to-image translation model for manipulating audio in its mel-spectrogram representation and a music vocoding model for mapping synthetically generated mel-spectrograms to perceptually realistic waveforms.
We find that this approach to music enhancement outperforms baselines which use classical methods for mel-spectrogram inversion and an end-to-end approach directly mapping noisy waveforms to clean waveforms.
arXiv Detail & Related papers (2022-04-28T05:00:07Z) - Speech segmentation using multilevel hybrid filters [0.0]
A novel approach for speech segmentation is proposed, based on Multilevel Hybrid (mean/min) Filters (MHF).
The proposed method is based on spectral changes, with the goal of segmenting the voice into homogeneous acoustic segments.
This algorithm is being used for a phonetically segmented speech coder, with successful results.
arXiv Detail & Related papers (2022-02-24T00:03:02Z) - WaveGrad: Estimating Gradients for Waveform Generation [55.405580817560754]
WaveGrad is a conditional model for waveform generation which estimates gradients of the data density.
It starts from a Gaussian white noise signal and iteratively refines the signal via a gradient-based sampler conditioned on the mel-spectrogram.
We find that it can generate high fidelity audio samples using as few as six iterations.
arXiv Detail & Related papers (2020-09-02T17:44:10Z) - Change Point Detection in Time Series Data using Autoencoders with a Time-Invariant Representation [69.34035527763916]
Change point detection (CPD) aims to locate abrupt property changes in time series data.
Recent CPD methods demonstrated the potential of using deep learning techniques, but often lack the ability to identify more subtle changes in the autocorrelation statistics of the signal.
We employ an autoencoder-based methodology with a novel loss function, through which the used autoencoders learn a partially time-invariant representation that is tailored for CPD.
arXiv Detail & Related papers (2020-08-21T15:03:21Z) - Learning Noise-Aware Encoder-Decoder from Noisy Labels by Alternating Back-Propagation for Saliency Detection [54.98042023365694]
We propose a noise-aware encoder-decoder framework to disentangle a clean saliency predictor from noisy training examples.
The proposed model consists of two sub-models parameterized by neural networks.
arXiv Detail & Related papers (2020-07-23T18:47:36Z) - Localized Spectral Graph Filter Frames: A Unifying Framework, Survey of Design Considerations, and Numerical Comparison (Extended Cut) [1.52292571922932]
Representing data residing on a graph as a linear combination of building block signals can enable efficient and insightful visual or statistical analysis of the data.
We survey a particular class of dictionaries called localized spectral graph filter frames.
We emphasize computationally efficient methods that ensure the resulting transforms and their inverses can be applied to data residing on large, sparse graphs.
arXiv Detail & Related papers (2020-06-19T16:49:33Z) - Offline detection of change-points in the mean for stationary graph signals [55.98760097296213]
We propose an offline method that relies on the concept of graph signal stationarity.
Our detector comes with a proof of a non-asymptotic oracle inequality.
arXiv Detail & Related papers (2020-06-18T15:51:38Z) - Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data [91.92456020841438]
Many studies require parallel speech data between different emotional patterns, which is not practical in real life.
We propose a CycleGAN network to find an optimal pseudo pair from non-parallel training data.
We also study the use of the continuous wavelet transform (CWT) to decompose F0 into ten temporal scales, which describe speech prosody at different time resolutions.
arXiv Detail & Related papers (2020-02-01T12:36:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.