Face: Fast, Accurate and Context-Aware Audio Annotation and
Classification
- URL: http://arxiv.org/abs/2303.03666v1
- Date: Tue, 7 Mar 2023 06:04:58 GMT
- Title: Face: Fast, Accurate and Context-Aware Audio Annotation and
Classification
- Authors: M. Mehrdad Morsali, Hoda Mohammadzade, Saeed Bagheri Shouraki
- Abstract summary: This paper presents a context-aware framework for feature selection and classification procedures to realize a fast and accurate audio event annotation and classification.
The exploration for feature selection also includes an investigation of audio Tempo representation.
Our proposed algorithm for sound classification obtained an average prediction accuracy of 98.05% on the UrbanSound8K dataset.
- Score: 1.4610038284393165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a context-aware framework for feature selection and
classification procedures to realize a fast and accurate audio event annotation
and classification. The context-aware design starts by exploring feature
extraction techniques to find a combination that yields high classification
accuracy with minimal computational effort. The exploration for feature
selection also includes an investigation of audio Tempo representation, a
useful feature extraction method overlooked by previous work on environmental
audio classification. The proposed annotation method considers outlier, inlier,
and hard-to-predict data samples to realize context-aware Active Learning,
reaching an average accuracy of 90% when only 15% of the data are initially
annotated. Our proposed algorithm for sound classification obtained an average
prediction accuracy of 98.05% on the UrbanSound8K dataset. The notebooks
containing our source code
and implementation results are available at https://github.com/gitmehrdad/FACE.
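
The abstract mentions that the annotation method considers outlier, inlier, and hard-to-predict samples when choosing what to annotate. The following is a minimal sketch of one way such a query step could look, not the authors' implementation. The function names (`margin`, `centroid`, `select_queries`), the margin-based definition of "hard to predict", and the distance-to-centroid definition of outliers and inliers are all assumptions made for illustration.

```python
import math


def margin(probs):
    """Gap between the two largest class probabilities (small gap = hard to predict)."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def select_queries(features, probs, k_hard=1, k_outlier=1, k_inlier=1):
    """Pick unlabeled samples to send for annotation.

    Assumed (hypothetical) selection rule:
      - hard:    smallest probability margin under the current classifier
      - outlier: farthest from the centroid of the unlabeled pool
      - inlier:  closest to the centroid of the unlabeled pool
    Each sample is assigned to at most one group.
    """
    center = centroid(features)
    dist = [math.dist(f, center) for f in features]
    indices = list(range(len(features)))

    hard = sorted(indices, key=lambda i: margin(probs[i]))[:k_hard]
    remaining = [i for i in indices if i not in hard]

    outliers = sorted(remaining, key=lambda i: -dist[i])[:k_outlier]
    remaining = [i for i in remaining if i not in outliers]

    inliers = sorted(remaining, key=lambda i: dist[i])[:k_inlier]
    return {"hard": hard, "outlier": outliers, "inlier": inliers}
```

In an active learning loop, the returned indices would be labeled by an annotator, added to the training set, and the classifier retrained before the next query round.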
Related papers
- Heterogeneous sound classification with the Broad Sound Taxonomy and Dataset [6.91815289914328]
This paper explores methodologies for automatically classifying heterogeneous sounds characterized by high intra-class variability.
We construct a dataset through manual annotation to ensure accuracy, diverse representation within each class and relevance in real-world scenarios.
Experimental results illustrate that audio embeddings encoding acoustic and semantic information achieve higher accuracy in the classification task.
arXiv Detail & Related papers (2024-10-01T18:09:02Z)
- Prioritizing Informative Features and Examples for Deep Learning from Noisy Data [4.741012804505562]
We propose a systemic framework that prioritizes informative features and examples to enhance each stage of the development process.
We first propose an approach to extract only informative features that are inherent to solving a target task by using auxiliary out-of-distribution data.
Next, we introduce an approach that prioritizes informative examples from unlabeled noisy data in order to reduce the labeling cost of active learning.
arXiv Detail & Related papers (2024-02-27T07:15:35Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [77.45468386115306]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- Fast Classification with Sequential Feature Selection in Test Phase [1.1470070927586016]
This paper introduces a novel approach to active feature acquisition for classification.
It is the task of sequentially selecting the most informative subset of features to achieve optimal prediction performance.
The proposed approach involves a new lazy model that is significantly faster and more efficient compared to existing methods.
arXiv Detail & Related papers (2023-06-25T21:31:46Z)
- Continual Learning For On-Device Environmental Sound Classification [63.81276321857279]
We propose a simple and efficient continual learning method for on-device environmental sound classification.
Our method selects the historical data for the training by measuring the per-sample classification uncertainty.
arXiv Detail & Related papers (2022-07-15T12:13:04Z)
- UNICON: Combating Label Noise Through Uniform Selection and Contrastive Learning [89.56465237941013]
We propose UNICON, a simple yet effective sample selection method which is robust to high label noise.
We obtain an 11.4% improvement over the current state-of-the-art on the CIFAR100 dataset with a 90% noise rate.
arXiv Detail & Related papers (2022-03-28T07:36:36Z)
- An Efficient and Accurate Rough Set for Feature Selection, Classification and Knowledge Representation [89.5951484413208]
This paper presents a strong data mining method based on rough set, which can realize feature selection, classification and knowledge representation at the same time.
We first show that rough set is ineffective because of overfitting, especially when processing noisy attributes, and propose a robust measurement for an attribute, called relative importance.
Experimental results on public benchmark data sets show that the proposed framework achieves higher accuracy than seven popular or state-of-the-art feature selection methods.
arXiv Detail & Related papers (2021-12-29T12:45:49Z)
- Optimizing Speech Emotion Recognition using Manta-Ray Based Feature Selection [1.4502611532302039]
We show that concatenating features extracted with different existing feature extraction methods can boost the classification accuracy.
We also perform a novel application of Manta Ray optimization in speech emotion recognition tasks that resulted in a state-of-the-art result.
arXiv Detail & Related papers (2020-09-18T16:09:34Z)
- Unsupervised Domain Adaptation for Acoustic Scene Classification Using Band-Wise Statistics Matching [69.24460241328521]
Machine learning algorithms can be negatively affected by mismatches between training (source) and test (target) data distributions.
We propose an unsupervised domain adaptation method that consists of aligning the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to the ones of the source-domain training dataset.
We show that the proposed method outperforms the state-of-the-art unsupervised methods found in the literature in terms of both source- and target-domain classification accuracy.
arXiv Detail & Related papers (2020-04-30T23:56:05Z)
- Active Learning for Sound Event Detection [18.750572243562576]
This paper proposes an active learning system for sound event detection (SED).
It aims at maximizing the accuracy of a learned SED model with limited annotation effort.
Remarkably, the required annotation effort can be greatly reduced on the dataset where target sound events are rare.
arXiv Detail & Related papers (2020-02-12T14:46:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.