A Systematic Approach to Robustness Modelling for Deep Convolutional
Neural Networks
- URL: http://arxiv.org/abs/2401.13751v1
- Date: Wed, 24 Jan 2024 19:12:37 GMT
- Title: A Systematic Approach to Robustness Modelling for Deep Convolutional
Neural Networks
- Authors: Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik
Elmroth
- Abstract summary: Recent work raises questions about the ability for even larger models to generalize to data outside of the controlled train and test sets.
We provide a method that uses induced failures to model the probability of failure as a function of time.
We examine the various trade-offs between cost, robustness, latency, and reliability to find that larger models do not significantly aid in adversarial robustness.
- Score: 0.294944680995069
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Convolutional neural networks have been shown to be widely applicable to a large
number of fields when large amounts of labelled data are available. The recent
trend has been to use models with increasingly larger sets of tunable
parameters to increase model accuracy, reduce model loss, or create more
adversarially robust models -- goals that are often at odds with one another.
In particular, recent theoretical work raises questions about the ability for
even larger models to generalize to data outside of the controlled train and
test sets. As such, we examine the role of the number of hidden layers in the
ResNet model, demonstrated on the MNIST, CIFAR10, and CIFAR100 datasets. We test a
variety of parameters including the size of the model, the floating point
precision, and the noise level of both the training data and the model output.
To encapsulate the model's predictive power and computational cost, we provide
a method that uses induced failures to model the probability of failure as a
function of time and relate that to a novel metric that allows us to quickly
determine whether or not the cost of training a model outweighs the cost of
attacking it. Using this approach, we are able to approximate the expected
failure rate using a small number of specially crafted samples rather than
increasingly larger benchmark datasets. We demonstrate the efficacy of this
technique on both the MNIST and CIFAR10 datasets using 8-, 16-, 32-, and 64-bit
floating-point numbers, various data pre-processing techniques, and several
attacks on five configurations of the ResNet model. Then, using empirical
measurements, we examine the various trade-offs between cost, robustness,
latency, and reliability to find that larger models do not significantly aid in
adversarial robustness despite costing significantly more to train.
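The idea of modelling the probability of failure as a function of time can be illustrated with a minimal sketch. This is not the authors' method: it assumes the simplest possible survival model, an exponential model with a constant hazard rate, and every name and data value below (`fit_exponential_hazard`, `failure_probability`, `attack_seconds`, `misclassified`) is hypothetical. Each sample records how long an attack ran and whether it induced a misclassification; samples that survived are treated as right-censored.

```python
import math


def fit_exponential_hazard(durations, failed):
    """Maximum-likelihood constant hazard rate under right-censoring.

    For an exponential survival model the estimate is
    (number of observed failures) / (total time under attack).
    """
    if len(durations) != len(failed):
        raise ValueError("durations and failed must have the same length")
    exposure = sum(durations)
    if exposure <= 0:
        raise ValueError("total attack time must be positive")
    events = sum(1 for f in failed if f)
    return events / exposure


def failure_probability(rate, t):
    """P(failure by time t) = 1 - S(t), with S(t) = exp(-rate * t)."""
    return 1.0 - math.exp(-rate * t)


# Hypothetical measurements: seconds of attack time per sample, and
# whether the attack succeeded (True) or was stopped without success (False).
attack_seconds = [0.5, 1.0, 1.5, 2.0, 5.0]
misclassified = [True, True, True, True, False]

rate = fit_exponential_hazard(attack_seconds, misclassified)
p_fail_1s = failure_probability(rate, 1.0)
```

Under these assumptions, a handful of attacked samples is enough to estimate a failure rate, which is the kind of saving the abstract describes relative to evaluating on ever larger benchmark datasets. A richer model (for example one with covariates such as model depth or floating-point precision) would replace the single constant rate.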
Related papers
- Identifying and Mitigating Model Failures through Few-shot CLIP-aided
Diffusion Generation [65.268245109828]
We propose an end-to-end framework to generate text descriptions of failure modes associated with spurious correlations.
These descriptions can be used to generate synthetic data using generative models, such as diffusion models.
Our experiments have shown remarkable improvements in accuracy ($\sim 21\%$) on hard sub-populations.
arXiv Detail & Related papers (2023-12-09T04:43:49Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z) - HigeNet: A Highly Efficient Modeling for Long Sequence Time Series
Prediction in AIOps [30.963758935255075]
In this paper, we propose a highly efficient model named HigeNet to predict long-sequence time series.
We show that the model's training time, resource usage, and accuracy are significantly better than those of five state-of-the-art competing models.
arXiv Detail & Related papers (2022-11-13T13:48:43Z) - Neural forecasting at scale [8.245069318446415]
We study the problem of efficiently scaling ensemble-based deep neural networks for time series (TS) forecasting on a large set of time series.
Our model addresses the practical limitations of related models, reducing the training time by half and memory requirement by a factor of 5.
arXiv Detail & Related papers (2021-09-20T17:22:40Z) - Investigating the Relationship Between Dropout Regularization and Model
Complexity in Neural Networks [0.0]
Dropout Regularization serves to reduce variance in Deep Learning models.
We explore the relationship between the dropout rate and model complexity by training 2,000 neural networks.
We build neural networks that predict the optimal dropout rate given the number of hidden units in each dense layer.
arXiv Detail & Related papers (2021-08-14T23:49:33Z) - Model-based micro-data reinforcement learning: what are the crucial
model properties and which model to choose? [0.2836066255205732]
We contribute to micro-data model-based reinforcement learning (MBRL) by rigorously comparing popular generative models.
We find that on an environment that requires multimodal posterior predictives, mixture density nets outperform all other models by a large margin.
We also find that deterministic models are on par; in fact, they consistently (although non-significantly) outperform their probabilistic counterparts.
arXiv Detail & Related papers (2021-07-24T11:38:25Z) - Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z) - Balancing Accuracy and Latency in Multipath Neural Networks [0.09668407688201358]
We use a one-shot neural architecture search model to implicitly evaluate the performance of an intractable number of neural networks.
We show that our method can accurately model the relative performance between models with different latencies and predict the performance of unseen models with good precision across different datasets.
arXiv Detail & Related papers (2021-04-25T00:05:48Z) - ALT-MAS: A Data-Efficient Framework for Active Testing of Machine
Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z) - Firearm Detection via Convolutional Neural Networks: Comparing a
Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way of achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)