Batching for Green AI -- An Exploratory Study on Inference
- URL: http://arxiv.org/abs/2307.11434v1
- Date: Fri, 21 Jul 2023 08:55:23 GMT
- Title: Batching for Green AI -- An Exploratory Study on Inference
- Authors: Tim Yarally, Luís Cruz, Daniel Feitosa, June Sallou, Arie van Deursen
- Abstract summary: We examine the effect of input batching on the energy consumption and response times of five fully-trained neural networks.
We find that in general energy consumption rises at a much steeper pace than accuracy and question the necessity of this evolution.
- Score: 8.025202812165412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The batch size is an essential parameter to tune during the development of
new neural networks. Amongst other quality indicators, it has a large degree of
influence on the model's accuracy, generalisability, training times and
parallelisability. This fact is generally known and commonly studied. However,
during the application phase of a deep learning model, when the model is
utilised by an end-user for inference, we find that there is a disregard for
the potential benefits of introducing a batch size. In this study, we examine
the effect of input batching on the energy consumption and response times of
five fully-trained neural networks for computer vision that were considered
state-of-the-art at the time of their publication. The results suggest that
batching has a significant effect on both of these metrics. Furthermore, we
present a timeline of the energy efficiency and accuracy of neural networks
over the past decade. We find that in general, energy consumption rises at a
much steeper pace than accuracy and question the necessity of this evolution.
Additionally, we highlight one particular network, ShuffleNetV2(2018), that
achieved a competitive performance for its time while maintaining a much lower
energy consumption. Nevertheless, we highlight that the results are model
dependent.
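The core idea studied here, grouping several inference requests into a single forward pass, can be illustrated with a short sketch. This is not the paper's experimental setup: the model (a randomly initialised ShuffleNetV2 from torchvision), the synthetic inputs, and the chosen batch sizes are illustrative assumptions, and only latency is measured here, not energy.

```python
# Illustrative sketch (assumptions, not the paper's setup): compare per-image
# latency of unbatched vs. batched inference with a torchvision ShuffleNetV2.
import time
import torch
import torchvision.models as models

# weights=None: random weights are enough for timing; no download required.
model = models.shufflenet_v2_x2_0(weights=None).eval()
images = torch.randn(64, 3, 224, 224)  # 64 synthetic inference "requests"

def per_image_latency(batch_size: int) -> float:
    """Run all requests in chunks of batch_size and return seconds per image."""
    start = time.perf_counter()
    with torch.no_grad():
        for i in range(0, images.size(0), batch_size):
            model(images[i:i + batch_size])  # one forward pass per batch
    return (time.perf_counter() - start) / images.size(0)

for bs in (1, 8, 32, 64):
    print(f"batch size {bs:2d}: {per_image_latency(bs) * 1000:.1f} ms per image")
```

In practice, energy would be measured alongside latency, for example by sampling a hardware power meter or software energy counters around the same loop; the trade-off is that larger batches typically lower per-image cost but delay individual responses while a batch fills up.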
Related papers
- The Impact of Uniform Inputs on Activation Sparsity and Energy-Latency Attacks in Computer Vision [4.45482419850721]
Researchers have recently demonstrated that attackers can compute and submit so-called sponge examples at inference time to increase the energy consumption and decision latency of neural networks.
In computer vision, the proposed strategy crafts inputs with less activation sparsity which could otherwise be used to accelerate the computation.
A uniform image, that is, an image with mostly flat, uniformly colored surfaces, triggers more activations due to a specific interplay of convolution, batch normalization, and ReLU activation.
arXiv Detail & Related papers (2024-03-27T14:11:23Z)
- A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
- Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures [93.17009514112702]
Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence for this phenomenon, the relationship between neural network pruning and induced bias is not well-understood.
arXiv Detail & Related papers (2023-04-25T07:42:06Z)
- Uncovering Energy-Efficient Practices in Deep Learning Training: Preliminary Steps Towards Green AI [8.025202812165412]
We consider energy consumption a metric of equal importance to accuracy and aim to reduce irrelevant tasks and unnecessary energy usage.
We examine the training stage of the deep learning pipeline from a sustainability perspective.
We highlight innovative and promising energy-efficient practices for training deep learning models.
arXiv Detail & Related papers (2023-03-24T12:48:21Z)
- Tree-Based Learning in RNNs for Power Consumption Forecasting [0.4822598110892847]
A Recurrent Neural Network that operates on several time lags, called an RNN(p), is the natural generalization of an Autoregressive ARX(p) model.
We prove that, when training RNN(p) models, other learning algorithms turn out to be much more efficient in terms of both time and space complexity.
We present an application of RNN(p) models for power consumption forecasting on the hourly scale.
arXiv Detail & Related papers (2022-09-03T09:21:39Z)
- SATBench: Benchmarking the speed-accuracy tradeoff in object recognition by humans and dynamic neural networks [0.45438205344305216]
People show a flexible tradeoff between speed and accuracy.
We present the first large-scale dataset of the speed-accuracy tradeoff (SAT) in recognizing ImageNet images.
We compare networks with humans on curve-fit error, category-wise correlation, and curve steepness.
arXiv Detail & Related papers (2022-06-16T20:03:31Z)
- How Tempering Fixes Data Augmentation in Bayesian Neural Networks [22.188535244056016]
We show that tempering implicitly reduces the misspecification arising from modeling augmentations as i.i.d. data.
The temperature mimics the role of the effective sample size, reflecting the gain in information provided by the augmentations.
arXiv Detail & Related papers (2022-05-27T11:06:56Z)
- Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight-parameterisation for neural networks that leads to inherently sparse models.
Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
- Compute and Energy Consumption Trends in Deep Learning Inference [67.32875669386488]
We study relevant models in the areas of computer vision and natural language processing.
For a sustained increase in performance we see a much softer growth in energy consumption than previously anticipated.
arXiv Detail & Related papers (2021-09-12T09:40:18Z)
- STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model can achieve comparable performance while utilizing much less trainable parameters and achieve high speed in training and inference.
arXiv Detail & Related papers (2021-07-15T02:53:11Z)
- The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics.
We find good agreement between our model's predictions and training dynamics in realistic deep learning settings.
We believe our results shed light on characteristics of models trained at different learning rates.
arXiv Detail & Related papers (2020-03-04T17:52:48Z)