Partial Weight Adaptation for Robust DNN Inference
- URL: http://arxiv.org/abs/2003.06131v1
- Date: Fri, 13 Mar 2020 06:25:45 GMT
- Title: Partial Weight Adaptation for Robust DNN Inference
- Authors: Xiufeng Xie, Kyu-Han Kim
- Abstract summary: We present GearNN, an adaptive inference architecture that accommodates heterogeneous inputs.
GearNN improves the accuracy (mIoU) by an average of 18.12% over a DNN trained with the undistorted dataset and 4.84% over stability training from Google.
- Score: 9.301756947410773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mainstream video analytics uses a pre-trained DNN model with an assumption
that inference input and training data follow the same probability
distribution. However, this assumption does not always hold in the wild:
autonomous vehicles may capture video with varying brightness; unstable
wireless bandwidth calls for adaptive bitrate streaming of video; and,
inference servers may serve inputs from heterogeneous IoT devices/cameras. In
such situations, the level of input distortion changes rapidly, thus reshaping
the probability distribution of the input.
We present GearNN, an adaptive inference architecture that accommodates
heterogeneous DNN inputs. GearNN employs an optimization algorithm to identify
a small set of "distortion-sensitive" DNN parameters, given a memory budget.
Based on the distortion level of the input, GearNN then adapts only the
distortion-sensitive parameters, while reusing the remaining constant parameters
across all input qualities. In our evaluation of DNN inference with dynamic
input distortions, GearNN improves the accuracy (mIoU) by an average of 18.12%
over a DNN trained with the undistorted dataset and 4.84% over stability
training from Google, with only 1.8% extra memory overhead.
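A minimal PyTorch-style sketch of the idea follows; it is not the authors' code, and the displacement-based selection heuristic, the class names, and the per-tensor budget accounting are illustrative assumptions:

```python
import torch

def select_sensitive_params(base, finetuned, budget_frac=0.018):
    """Rank parameter tensors by how far fine-tuning on distorted data moved
    them, then keep the top fraction fitting the budget (the 1.8% echoes the
    paper's reported overhead; counting tensors instead of bytes is a
    simplification)."""
    scores = {}
    for (name, p0), (_, p1) in zip(base.named_parameters(),
                                   finetuned.named_parameters()):
        scores[name] = (p1 - p0).abs().mean().item()
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:max(1, int(budget_frac * len(ranked)))])

class GearedModel:
    """One shared backbone plus per-distortion-level copies ("gears") of only
    the distortion-sensitive parameters."""
    def __init__(self, model, sensitive_names):
        self.model, self.sensitive = model, sensitive_names
        self.gears = {}  # distortion level -> {param name: tensor}

    def add_gear(self, level, finetuned):
        self.gears[level] = {n: p.detach().clone()
                             for n, p in finetuned.named_parameters()
                             if n in self.sensitive}

    @torch.no_grad()
    def infer(self, x, level):
        # Swap in the small sensitive subset for this distortion level; the
        # rest of the weights stay constant across all input qualities.
        state = self.model.state_dict()
        state.update(self.gears[level])
        self.model.load_state_dict(state)
        return self.model(x)
```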
Related papers
- OneAdapt: Fast Configuration Adaptation for Video Analytics Applications via Backpropagation [15.112437830075947]
Deep learning inference on streaming media data, such as object detection in video or LiDAR feeds, is now ubiquitous.
These applications typically require significant network bandwidth to gather high-fidelity data and extensive GPU resources to run deep neural networks (DNNs).
This paper presents OneAdapt, which meets three requirements simultaneously: adapt configurations with minimum extra GPU or bandwidth overhead; reach near-optimal decisions based on how the data affects the final DNN's accuracy; and do so for a range of configuration knobs.
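A toy sketch of the gradient-driven knob adaptation this describes (assumptions: a single differentiable "quality" knob standing in for real configuration knobs, and prediction confidence standing in for DNN accuracy; this is not the paper's implementation):

```python
import torch

def adapt_knob(dnn, frame, knob, lam=0.1, lr=0.05):
    """One gradient step on a configuration knob. `knob` is a scalar tensor
    in (0, 1] standing in for a real knob such as resolution or bitrate."""
    knob = knob.clone().requires_grad_(True)
    degraded = frame * knob                # stand-in for encode(frame, knob)
    out = dnn(degraded).softmax(dim=1)
    proxy = out.max(dim=1).values.mean()   # confidence as an accuracy proxy
    cost = lam * knob                      # higher quality costs bandwidth/GPU
    (proxy - cost).backward()
    with torch.no_grad():
        return (knob + lr * knob.grad).clamp(0.1, 1.0).detach()
```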
arXiv Detail & Related papers (2023-10-03T20:36:03Z)
- Knowing When to Stop: Delay-Adaptive Spiking Neural Network Classifiers with Reliability Guarantees [36.14499894307206]
Spiking neural networks (SNNs) process time-series data via internal event-driven neural dynamics.
We introduce a novel delay-adaptive SNN-based inference methodology that provides guaranteed reliability for the decisions produced at input-dependent stopping times.
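A hedged sketch of such input-dependent stopping (the paper's reliability guarantee comes from calibrating the threshold on held-out data; here the fixed threshold, the `snn_step` interface, and batch size 1 are assumptions):

```python
import torch

@torch.no_grad()
def classify_with_early_stop(snn_step, x_spikes, threshold=0.9, T=32):
    """snn_step(spikes_t, state) -> (logits, state); x_spikes: (T, ...)."""
    state, acc_logits = None, 0.0
    for t in range(T):
        logits, state = snn_step(x_spikes[t], state)
        acc_logits = acc_logits + logits               # accumulate evidence
        probs = torch.softmax(acc_logits / (t + 1), dim=-1)
        conf, pred = probs.max(dim=-1)
        if conf.item() >= threshold:                   # stop once reliable
            return pred.item(), t + 1                  # decision, stop time
    return pred.item(), T
```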
arXiv Detail & Related papers (2023-05-18T22:11:04Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
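A minimal sketch of the bottleneck idea, with illustrative channel counts and without the paper's attention module:

```python
import torch.nn as nn

class FeatureCodec(nn.Module):
    """Device runs the CNN head plus `enc`, ships the small tensor; the edge
    runs `dec` and the remaining CNN layers. Channel counts are illustrative."""
    def __init__(self, ch=256, bottleneck=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(ch, bottleneck, 1), nn.ReLU())
        self.dec = nn.Sequential(nn.Conv2d(bottleneck, ch, 1), nn.ReLU())

    def forward(self, feat):
        return self.dec(self.enc(feat))  # train to reconstruct `feat`
```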
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- A Low-Complexity Approach to Rate-Distortion Optimized Variable Bit-Rate Compression for Split DNN Computing [5.3221129103999125]
Split computing has recently emerged as a paradigm for implementing DNN-based AI workloads.
We present an approach that addresses the challenge of optimizing the rate-accuracy-complexity trade-off.
Our approach is remarkably lightweight during both training and inference, highly effective, and achieves excellent rate-distortion performance.
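A generic form of the rate-distortion objective such approaches train with (the L1 rate proxy and the `lam` sweep are standard devices, not the paper's exact loss):

```python
def rd_loss(task_loss, bottleneck, lam=0.01):
    """Balance task distortion against a differentiable rate proxy; sweeping
    `lam` traces out the rate-accuracy trade-off for variable bit-rates."""
    rate_proxy = bottleneck.abs().mean()   # stand-in for coded bits
    return task_loss + lam * rate_proxy
```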
arXiv Detail & Related papers (2022-08-24T15:02:11Z)
- Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on Edge Devices [19.335535517714703]
The prediction accuracy of deep neural networks (DNNs) deployed at the edge can degrade over time due to shifts in the distribution of the incoming data.
Recently introduced prediction-time unsupervised DNN adaptation techniques improve model accuracy on noisy data by re-tuning the batch normalization parameters.
This paper, for the first time, performs a comprehensive measurement study of such techniques to quantify their performance and energy on various edge devices.
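The adaptation being benchmarked typically looks like the following minimal PyTorch routine, which re-estimates batch-norm statistics from unlabeled test batches while freezing all other parameters:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_batchnorm(model, test_loader, momentum=0.1):
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.reset_running_stats()
            m.momentum = momentum
            m.train()          # BN layers update running stats in train mode
    for x in test_loader:
        model(x)               # forward passes refresh running mean/var
    model.eval()
    return model
```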
arXiv Detail & Related papers (2022-03-21T19:10:40Z)
- Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training Data [52.771780951404565]
Shift-Robust GNN (SR-GNN) is designed to account for distributional differences between biased training data and the graph's true inference distribution.
We show that SR-GNN outperforms other GNN baselines in accuracy, eliminating at least 40% of the negative effects introduced by biased training data.
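One ingredient of such shift-robust training can be sketched as a central moment discrepancy (CMD) regularizer between embeddings of the biased training nodes and an unbiased sample; a hedged toy version (SR-GNN pairs this kind of regularizer with instance reweighting):

```python
import torch

def cmd(h_train, h_iid, k=3):
    """Central moment discrepancy between embeddings of biased training nodes
    (h_train) and an unbiased IID sample (h_iid); add it to the task loss to
    pull the two distributions together."""
    loss = (h_train.mean(0) - h_iid.mean(0)).norm()
    for j in range(2, k + 1):
        m_tr = ((h_train - h_train.mean(0)) ** j).mean(0)
        m_iid = ((h_iid - h_iid.mean(0)) ** j).mean(0)
        loss = loss + (m_tr - m_iid).norm()
    return loss
```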
arXiv Detail & Related papers (2021-08-02T18:00:38Z)
- Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural Networks [52.32646357164739]
We propose a sensitivity-informed deep neural network (SIDNN) to solve the AC optimal power flow (AC-OPF) problem.
The proposed SIDNN is compatible with a broad range of OPF schemes.
It can be seamlessly integrated in other learning-to-OPF schemes.
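A hedged sketch of sensitivity-informed training: fit the OPF solutions while also matching an autograd-derived sensitivity to precomputed OPF sensitivities. The shapes, names, and the aggregated (summed-output) Jacobian are simplifying assumptions:

```python
import torch
import torch.nn.functional as F

def sidnn_loss(net, load, v_opt, dv_dload, mu=1.0):
    """`load`: demand inputs; `v_opt`: optimal setpoints; `dv_dload`: their
    precomputed sensitivity to load (same shape as `load`, by assumption)."""
    load = load.clone().requires_grad_(True)
    v_hat = net(load)
    fit = F.mse_loss(v_hat, v_opt)
    # Aggregate Jacobian of predictions w.r.t. load via autograd; the sum
    # trick supervises one directional sensitivity cheaply.
    jac = torch.autograd.grad(v_hat.sum(), load, create_graph=True)[0]
    return fit + mu * F.mse_loss(jac, dv_dload)
```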
arXiv Detail & Related papers (2021-03-27T00:45:23Z)
- TaxoNN: A Light-Weight Accelerator for Deep Neural Network Training [2.5025363034899732]
We present a novel approach to add the training ability to a baseline DNN accelerator (inference only) by splitting the SGD algorithm into simple computational elements.
Based on this approach we propose TaxoNN, a light-weight accelerator for DNN training.
Our experimental results show that TaxoNN incurs, on average, only a 0.97% higher misclassification rate than a full-precision implementation.
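The decomposition idea can be conveyed in a few lines, reusing one multiply-accumulate primitive for the forward pass, error backpropagation, and the weight update; this illustrates the concept and is not TaxoNN's hardware design:

```python
import numpy as np

def mac(a, b):               # the one compute primitive assumed in hardware
    return a @ b

def sgd_step(W, x, grad_out, lr=0.01):
    """One dense layer's training step built from MAC-style elements:
    W: (m, n), x: (n,), grad_out: (m,)."""
    y = mac(W, x)                        # forward pass (inference datapath)
    grad_W = np.outer(grad_out, x)       # weight gradient: rank-1 MAC pattern
    grad_x = mac(W.T, grad_out)          # backpropagated error, same MAC unit
    W -= lr * grad_W                     # elementwise update
    return y, grad_x, W
```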
arXiv Detail & Related papers (2020-10-11T09:04:19Z)
- DIET-SNN: Direct Input Encoding With Leakage and Threshold Optimization in Deep Spiking Neural Networks [8.746046482977434]
DIET-SNN is a low-latency deep spiking network that is trained with gradient descent to optimize the membrane leak and the firing threshold.
We evaluate DIET-SNN on image classification tasks from CIFAR and ImageNet datasets on VGG and ResNet architectures.
We achieve top-1 accuracy of 69% with 5 timesteps (inference latency) on the ImageNet dataset with 12x less compute energy than an equivalent standard ANN.
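The mechanism DIET-SNN optimizes, sketched as a leaky integrate-and-fire layer with trainable leak and threshold (the straight-through spike gradient below is one simple surrogate, assumed here rather than taken from the paper):

```python
import torch
import torch.nn as nn

class LIF(nn.Module):
    def __init__(self):
        super().__init__()
        self.leak = nn.Parameter(torch.tensor(0.9))       # learnable leak
        self.threshold = nn.Parameter(torch.tensor(1.0))  # learnable threshold

    def forward(self, current, mem):
        mem = self.leak * mem + current                   # leaky integration
        spike = (mem >= self.threshold).float()
        spike = spike + (mem - mem.detach())              # straight-through grad
        mem = mem - spike.detach() * self.threshold       # soft reset
        return spike, mem
```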
arXiv Detail & Related papers (2020-08-09T05:07:17Z)
- Bayesian Graph Neural Networks with Adaptive Connection Sampling [62.51689735630133]
We propose a unified framework for adaptive connection sampling in graph neural networks (GNNs).
The proposed framework not only alleviates over-smoothing and over-fitting tendencies of deep GNNs, but also enables learning with uncertainty in graph analytic tasks with GNNs.
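A minimal, hedged sketch of connection sampling on an adjacency matrix, made differentiable with a relaxed Bernoulli so the keep probability can itself be learned (one standard relaxation, assumed here):

```python
import torch

def sample_adj(adj, keep_logit, temp=0.5):
    """Keep each message-passing edge with a learnable probability; the
    relaxed Bernoulli keeps sampling differentiable so `keep_logit` trains
    jointly with the GNN."""
    probs = torch.sigmoid(keep_logit).expand_as(adj)
    dist = torch.distributions.RelaxedBernoulli(torch.tensor(temp), probs=probs)
    return adj * dist.rsample()   # masked adjacency feeds the next layer
```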
arXiv Detail & Related papers (2020-06-07T07:06:35Z)
- GraN: An Efficient Gradient-Norm Based Detector for Adversarial and Misclassified Examples [77.99182201815763]
Deep neural networks (DNNs) are vulnerable to adversarial examples and other data perturbations.
GraN is a time- and parameter-efficient method that is easily adaptable to any DNN.
GraN achieves state-of-the-art performance on numerous problem set-ups.
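A minimal score in GraN's spirit: the norm of the loss gradient taken at the model's own prediction, to be compared against a threshold calibrated on clean data (the calibration step is assumed, not shown):

```python
import torch
import torch.nn.functional as F

def gran_score(model, x):
    """Large gradient norms flag likely adversarial/misclassified inputs."""
    model.zero_grad()
    logits = model(x)
    pred = logits.argmax(dim=1)
    loss = F.cross_entropy(logits, pred)   # self-labeled loss
    loss.backward()
    sq = sum(p.grad.pow(2).sum() for p in model.parameters()
             if p.grad is not None)
    return sq.sqrt().item()   # compare against a calibrated threshold
```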
arXiv Detail & Related papers (2020-04-20T10:09:27Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax-optimal error rates in important function classes.
We derive approximation and estimation error rates for the aforementioned type of CNNs for the Barron and Hölder classes.
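For context, "minimax optimal" here refers to the classical benchmark rate for Hölder-smooth regression, a textbook fact rather than a claim quoted from the paper:

```latex
% Minimax rate for estimating f in the Hölder class H^beta([0,1]^d) from n
% noisy samples (squared L2 risk); attaining it up to log factors is what
% "minimax optimal" means for the ResNet-type CNN.
\inf_{\hat{f}_n} \sup_{f \in \mathcal{H}^{\beta}([0,1]^d)}
  \mathbb{E}\,\lVert \hat{f}_n - f \rVert_{L^2}^2
  \asymp n^{-\frac{2\beta}{2\beta + d}}
```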
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.