Surgical Mask Detection with Convolutional Neural Networks and Data
Augmentations on Spectrograms
- URL: http://arxiv.org/abs/2008.04590v1
- Date: Tue, 11 Aug 2020 09:02:47 GMT
- Title: Surgical Mask Detection with Convolutional Neural Networks and Data
Augmentations on Spectrograms
- Authors: Steffen Illium, Robert Müller, Andreas Sedlmeier and Claudia
Linnhoff-Popien
- Abstract summary: We show the impact of data augmentation on the binary classification task of surgical mask detection in samples of human voice.
Results show that most of the baselines given by ComParE are outperformed.
- Score: 8.747840760772268
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many fields of research, labeled datasets are hard to acquire. This is
where data augmentation promises to overcome the lack of training data in the
context of neural network engineering and classification tasks. The idea here
is to reduce model over-fitting to the feature distribution of a small,
under-descriptive training dataset. We evaluate such data augmentation
techniques to gain insight into the performance boost they provide for several
convolutional neural networks on mel-spectrogram representations of audio data.
We show the impact of data augmentation on the binary classification task of
surgical mask detection in samples of human voice (ComParE Challenge 2020).
We also consider four different architectures to account for augmentation
robustness. Results show that most of the baselines given by ComParE are
outperformed.
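As a rough illustration of what augmenting a mel-spectrogram can look like, the sketch below applies three generic transformations to a 2-D array of shape (mel bins, frames). The abstract does not name the augmentations the authors used, so the choices here (circular time shift, additive Gaussian noise, band masking), the function names, and all parameter values are assumptions made for illustration, not the paper's method.

```python
import numpy as np


def time_shift(spec, max_shift, rng):
    """Circularly shift the spectrogram along the time axis (axis 1)."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(spec, shift, axis=1)


def add_noise(spec, std, rng):
    """Add zero-mean Gaussian noise to every time-frequency bin."""
    return spec + rng.normal(0.0, std, size=spec.shape)


def mask_band(spec, axis, max_width, rng, fill=0.0):
    """Overwrite one random band of mel bins (axis=0) or frames (axis=1)."""
    out = spec.copy()
    width = int(rng.integers(0, max_width + 1))
    start = int(rng.integers(0, spec.shape[axis] - width + 1))
    index = [slice(None)] * spec.ndim
    index[axis] = slice(start, start + width)
    out[tuple(index)] = fill
    return out


def augment(spec, rng):
    """Chain the augmentations; all parameter values are illustrative."""
    spec = time_shift(spec, max_shift=8, rng=rng)
    spec = add_noise(spec, std=0.05, rng=rng)
    spec = mask_band(spec, axis=0, max_width=8, rng=rng)   # frequency mask
    spec = mask_band(spec, axis=1, max_width=16, rng=rng)  # time mask
    return spec


rng = np.random.default_rng(0)
mel = rng.random((64, 128))  # stand-in for a (mel bins, frames) spectrogram
augmented = augment(mel, rng)
print(augmented.shape)
```

Each function returns a new array and leaves its input unchanged, so a fresh random variant of the same sample can be drawn in every training epoch.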
Related papers
- Data Augmentations in Deep Weight Spaces [89.45272760013928]
We introduce a novel augmentation scheme based on the Mixup method.
We evaluate the performance of these techniques on existing benchmarks as well as new benchmarks we generate.
arXiv Detail & Related papers (2023-11-15T10:43:13Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- MAGDiff: Covariate Data Set Shift Detection via Activation Graphs of Deep Neural Networks [8.887179103071388]
We propose a new family of representations, called MAGDiff, that we extract from any given neural network classifier.
These representations are computed by comparing the activation graphs of the neural network for samples belonging to the training distribution and to the target distribution.
We show that our novel representations induce significant improvements over a state-of-the-art baseline relying on the network output.
arXiv Detail & Related papers (2023-05-22T17:34:47Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with
Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- Image Classification on Small Datasets via Masked Feature Mixing [22.105356244579745]
A proposed architecture called ChimeraMix learns a data augmentation by generating compositions of instances.
The generative model encodes images in pairs, combines the features guided by a mask, and creates new samples.
For evaluation, all methods are trained from scratch without any additional data.
arXiv Detail & Related papers (2022-02-23T16:51:22Z)
- Ensemble Augmentation for Deep Neural Networks Using 1-D Time Series
Vibration Data [0.0]
Time-series data are one of the fundamental types of raw data representation used in data-driven techniques.
Deep Neural Networks (DNNs) require huge labeled training samples to reach their optimum performance.
In this study, a data augmentation technique named ensemble augmentation is proposed to overcome this limitation.
arXiv Detail & Related papers (2021-08-06T20:04:29Z)
- Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
- Reducing Labelled Data Requirement for Pneumonia Segmentation using
Image Augmentations [0.0]
We investigate the effect of image augmentations on reducing the requirement of labelled data in semantic segmentation of chest X-rays for pneumonia detection.
We train fully convolutional network models on subsets of different sizes from the total training data.
We find that rotate and mixup are the best augmentations amongst rotate, mixup, translate, gamma and horizontal flip, wherein they reduce the labelled data requirement by 70%.
arXiv Detail & Related papers (2021-02-25T10:11:30Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Data augmentation using generative networks to identify dementia [20.137419355252362]
We show that generative models can be used as an effective approach for data augmentation.
In this paper, we investigate the application of a similar approach to different types of speech and audio-based features extracted from our automatic dementia detection system.
arXiv Detail & Related papers (2020-04-13T15:05:24Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum based scheme that smoothes the feature embedding of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.