From Sound Representation to Model Robustness
- URL: http://arxiv.org/abs/2007.13703v3
- Date: Mon, 18 Jan 2021 03:24:27 GMT
- Title: From Sound Representation to Model Robustness
- Authors: Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich
- Abstract summary: We investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network.
Averaged over various experiments on three environmental sound datasets, we found that the ResNet-18 model outperforms other deep learning architectures.
- Score: 82.21746840893658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the impact of different standard environmental
sound representations (spectrograms) on the recognition performance and
adversarial attack robustness of a victim residual convolutional neural
network. Averaged over various experiments on three benchmarking environmental
sound datasets, we found the ResNet-18 model outperforms other deep learning
architectures such as GoogLeNet and AlexNet both in terms of classification
accuracy and the number of training parameters. Therefore we set this model as
our front-end classifier for subsequent investigations. Herein, we measure the
impact of different settings required for generating more informative
mel-frequency cepstral coefficient (MFCC), short-time Fourier transform (STFT),
and discrete wavelet transform (DWT) representations on our front-end model.
This measurement involves weighing classification performance against
adversarial robustness. Balancing the average budget allocated to the
adversary against the cost of mounting an attack, we demonstrate an inverse
relationship between recognition accuracy and model robustness against six
attack algorithms. Moreover, our experimental results show that while the ResNet-18
model trained on DWT spectrograms achieves the highest recognition accuracy,
attacking this model is relatively more costly for the adversary compared to
other 2D representations.
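For concreteness, the following is a minimal sketch, not the authors' code, of how the three 2D representations compared in the paper (MFCC, STFT, and DWT spectrograms) might be extracted from a mono audio clip before being resized and fed to a ResNet-18 front end. The library choices (librosa, PyWavelets) and all parameter values are illustrative assumptions; a continuous wavelet scalogram stands in for the DWT spectrogram, which the paper does not tie to a specific implementation.

import numpy as np
import librosa
import pywt

def mfcc_rep(y, sr, n_mfcc=40):
    # Mel-frequency cepstral coefficients, shape (n_mfcc, n_frames).
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

def stft_rep(y, n_fft=1024, hop=512):
    # Log-magnitude short-time Fourier transform spectrogram.
    return librosa.amplitude_to_db(np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop)))

def dwt_rep(y, wavelet="morl", num_scales=128):
    # The paper's "DWT spectrogram" is a 2D wavelet representation; a
    # continuous wavelet scalogram is used here as an illustrative stand-in.
    coeffs, _ = pywt.cwt(y, np.arange(1, num_scales + 1), wavelet)
    return np.abs(coeffs)

y, sr = librosa.load(librosa.ex("trumpet"), sr=22050)  # bundled demo clip
print(mfcc_rep(y, sr).shape, stft_rep(y).shape, dwt_rep(y).shape)

Each function returns a 2D array that can be normalized and resized to a fixed input size (e.g., 224x224) for the convolutional classifier.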
Related papers
- Exploring the Physical World Adversarial Robustness of Vehicle Detection [13.588120545886229]
Adversarial attacks can compromise the robustness of real-world detection models.
We propose an innovative instant-level data generation pipeline using the CARLA simulator.
Our findings highlight diverse model performances under adversarial conditions.
arXiv Detail & Related papers (2023-08-07T11:09:12Z)
- Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching [77.133400999703]
Correlation-based stereo matching has achieved outstanding performance.
Current methods with a fixed model do not work uniformly well across various datasets.
This paper proposes a new perspective to dynamically calculate correlation for robust stereo matching.
arXiv Detail & Related papers (2023-07-26T09:47:37Z)
- GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models [60.48306899271866]
We present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models.
We show high correlation and significantly reduced cost of GREAT Score when compared to the attack-based model ranking on RobustBench.
GREAT Score can be used for remote auditing of privacy-sensitive black-box models.
arXiv Detail & Related papers (2023-04-19T14:58:27Z)
- From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks [82.21746840893658]
This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network.
We show that while the ResNet-18 model trained on DWT spectrograms achieves a high recognition accuracy, attacking this model is relatively more costly for the adversary.
arXiv Detail & Related papers (2022-04-14T15:14:08Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Detecting weapons and aggressive behavior in live video can support the rapid prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z)
- Adversarially Training for Audio Classifiers [9.868221447090853]
We show that the ResNet-56 model trained on the 2D representation of the discrete wavelet transform with the tonnetz chromagram outperforms other models in terms of recognition accuracy.
We run our experiments on two benchmark environmental sound datasets and show that, without any limitations imposed on the adversary's budget allocation, the fooling rate of the adversarially trained models can exceed 90% (see the training sketch below).
arXiv Detail & Related papers (2020-08-26T15:15:32Z)
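As a rough illustration of the adversarial-training scheme discussed in "Adversarially Training for Audio Classifiers" above, the following is a hedged FGSM-based sketch for a spectrogram classifier. The model, optimizer, and epsilon are placeholders; the paper's exact setup (ResNet-56 on DWT with tonnetz chromagram inputs) is not reproduced here.

import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    # Fast Gradient Sign Method: one signed-gradient ascent step on the input.
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach()

def adversarial_train_step(model, optimizer, x, y, eps=0.01):
    # Replace the clean batch with adversarial examples, then take a
    # standard optimization step on them.
    x_adv = fgsm(model, x, y, eps)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()

Running this step over a spectrogram dataset trades some clean accuracy for robustness, which is the accuracy/robustness tension the main paper quantifies across representations.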