Hardware-aware mobile building block evaluation for computer vision
- URL: http://arxiv.org/abs/2208.12694v1
- Date: Fri, 26 Aug 2022 14:44:17 GMT
- Title: Hardware-aware mobile building block evaluation for computer vision
- Authors: Maxim Bonnaerens, Matthias Freiberger, Marian Verhelst, Joni Dambre
- Abstract summary: We show that our approach allows to match the information obtained by previous comparison paradigms.
We show that choosing the right building block can speed up inference by up to a factor of 2x on specific hardware ML accelerators.
- Score: 9.494760855699596
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work we propose a methodology to accurately evaluate and compare the
performance of efficient neural network building blocks for computer vision in
a hardware-aware manner. Our comparison uses Pareto fronts based on randomly
sampled networks from a design space to capture the underlying
accuracy/complexity trade-offs. We show that our approach matches the
information obtained by previous comparison paradigms while providing more
insight into the relationship between hardware cost and accuracy. We use our
methodology to analyze different building blocks and evaluate their performance
on a range of embedded hardware platforms. This highlights the importance of
benchmarking building blocks as a preselection step in the design process of a
neural network. We show that choosing the right building block can speed up
inference by up to a factor of 2x on specific hardware ML accelerators.
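The methodology above reduces to three steps: sample architectures from a design space, measure accuracy and hardware cost for each, and keep the Pareto-optimal set. Below is a minimal sketch of that loop, assuming each sampled network has already been trained and profiled; `accuracy` and `hardware_cost` are hypothetical stand-ins for training and on-device benchmarking.

```python
import random

def pareto_front(points):
    """Keep the (cost, accuracy) points not dominated by any other point,
    i.e. no other point has cost <= and accuracy >= with one strictly better."""
    front = []
    for c, a in points:
        dominated = any((c2 <= c and a2 >= a) and (c2 < c or a2 > a)
                        for c2, a2 in points)
        if not dominated:
            front.append((c, a))
    return sorted(front)

# Stand-ins for training and profiling a sampled network; in practice these
# would train the network and benchmark it on the target embedded platform.
def accuracy(net):
    return random.random()

def hardware_cost(net):
    return random.random()

design_space = range(1000)                       # hypothetical design space
samples = [random.choice(design_space) for _ in range(100)]
points = [(hardware_cost(n), accuracy(n)) for n in samples]
print(pareto_front(points))
```

Comparing the Pareto fronts produced by different building blocks, rather than single point estimates, is what exposes the full accuracy/complexity trade-off per hardware platform.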
Related papers
- AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation [48.82264764771652]
We introduce AsCAN, a hybrid architecture combining convolutional and transformer blocks.
AsCAN supports a variety of tasks: recognition, segmentation, and class-conditional image generation.
We then scale the same architecture to solve a large-scale text-to-image task and show state-of-the-art performance.
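A hedged sketch of a block combining convolution and attention in PyTorch; the layout below is illustrative only and does not reproduce AsCAN's actual asymmetric block design.

```python
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    """Illustrative convolution-then-attention block (not AsCAN's layout)."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.GELU(),
        )
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                        # x: (B, C, H, W)
        x = x + self.conv(x)                     # local features via convolution
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)         # (B, H*W, C) token sequence
        q = self.norm(t)
        t = t + self.attn(q, q, q)[0]            # global mixing via attention
        return t.transpose(1, 2).reshape(b, c, h, w)

out = HybridBlock(64)(torch.randn(1, 64, 8, 8))  # -> torch.Size([1, 64, 8, 8])
```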
arXiv Detail & Related papers (2024-11-07T18:43:17Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the demands of real-time visual inference by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to jointly optimize neural network architecture and edge deployment.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Multi-objective Differentiable Neural Architecture Search [58.67218773054753]
We propose a novel NAS algorithm that encodes user preferences for the trade-off between performance and hardware metrics.
Our method outperforms existing MOO NAS methods across a broad range of qualitatively different search spaces and datasets.
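One common way to encode such preferences is a linear scalarization of the task loss and differentiable hardware-cost estimates; the weighting below is an assumption for illustration, not the paper's exact mechanism.

```python
import torch

def scalarized_loss(task_loss, latency, energy, prefs):
    """Linear scalarization: fold differentiable hardware-cost estimates into
    the training objective using user preference weights."""
    return task_loss + prefs["latency"] * latency + prefs["energy"] * energy

# Hypothetical differentiable-NAS step: architecture parameters `alpha`
# influence both the task loss and the predicted hardware costs.
alpha = torch.randn(8, requires_grad=True)
task_loss = (alpha ** 2).mean()                  # stand-in for training loss
latency = torch.sigmoid(alpha).sum()             # stand-in cost predictor
energy = torch.relu(alpha).sum()                 # stand-in cost predictor
loss = scalarized_loss(task_loss, latency, energy,
                       {"latency": 0.1, "energy": 0.05})
loss.backward()                                  # gradients flow back to alpha
```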
arXiv Detail & Related papers (2024-02-28T10:09:04Z)
- Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements.
The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams.
In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
- Hardware Aware Evolutionary Neural Architecture Search using Representation Similarity Metric [12.52012450501367]
Hardware-aware Neural Architecture Search (HW-NAS) is a technique used to automatically design the architecture of a neural network for a specific task and target hardware.
Evaluating the performance of candidate architectures is a key challenge in HW-NAS, as it requires significant computational resources.
We propose an efficient hardware-aware evolution-based NAS approach called HW-EvRSNAS.
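A representation similarity metric lets HW-NAS score a candidate without training it to convergence; linear CKA below is one common such metric, used here as an assumed stand-in for the paper's specific choice.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices of
    shape (n_samples, features); a common representation-similarity metric."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

# Hypothetical proxy evaluation: score a candidate by how similar its
# activations are to a well-trained reference, instead of training it fully.
rng = np.random.default_rng(0)
ref_acts = rng.normal(size=(256, 128))           # reference network activations
cand_acts = ref_acts @ rng.normal(size=(128, 128)) * 0.1 + rng.normal(size=(256, 128))
print(linear_cka(ref_acts, cand_acts))           # in [0, 1]; higher = more similar
```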
arXiv Detail & Related papers (2023-11-07T11:58:40Z)
- Quantization of Deep Neural Networks to facilitate self-correction of weights on Phase Change Memory-based analog hardware [0.0]
We develop an algorithm to approximate a set of multiplicative weights.
These weights aim to represent the original network's weights with minimal loss in performance.
Our results demonstrate that, when paired with an on-chip pulse generator, our self-correcting neural network performs comparably to those trained with analog-aware algorithms.
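A minimal sketch of snapping weights to a multiplicative set of levels; the geometric level construction below is an assumption for illustration, not the paper's algorithm.

```python
import numpy as np

def multiplicative_levels(w_max, ratio=0.5, n=8):
    """Build levels where each level is a fixed multiple of the next
    (hypothetical construction), mirrored for negative weights, plus zero."""
    pos = w_max * ratio ** np.arange(n)
    return np.sort(np.concatenate([-pos, [0.0], pos]))

def quantize(weights, levels):
    """Snap each weight to its nearest available level."""
    idx = np.abs(weights[..., None] - levels).argmin(axis=-1)
    return levels[idx]

w = np.random.default_rng(1).normal(scale=0.2, size=(4, 4))
levels = multiplicative_levels(np.abs(w).max())
print(quantize(w, levels))                       # weights restricted to levels
```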
arXiv Detail & Related papers (2023-09-30T10:47:25Z)
- GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation [3.739243122393041]
We introduce a new machine learning model that estimates throughput of basic blocks across different microarchitectures.
We propose the use of multi-task learning with independent multi-layer feed-forward decoder networks.
Results establish a new state-of-the-art for basic block performance estimation with an average test error of 6.9%.
arXiv Detail & Related papers (2022-10-08T03:03:49Z)
- MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process.
We propose MAPLE-X which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
- DANCE: Differentiable Accelerator/Network Co-Exploration [8.540518473228078]
This work presents a differentiable approach towards the co-exploration of the hardware accelerator and network architecture design.
By modeling the hardware evaluation software with a neural network, the relation between the accelerator architecture and the hardware metrics becomes differentiable.
Compared to the naive existing approaches, our method performs co-exploration in a significantly shorter time, while achieving superior accuracy and hardware cost metrics.
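The key idea is replacing the non-differentiable hardware evaluation software with a learned surrogate so gradients reach the design parameters; the toy surrogate below is an assumption, not DANCE's actual model.

```python
import torch
import torch.nn as nn

# Hypothetical surrogate: a small MLP trained to mimic the (non-differentiable)
# hardware evaluation software, mapping accelerator/network knobs to cost.
cost_model = nn.Sequential(nn.Linear(6, 32), nn.ReLU(), nn.Linear(32, 1))

# ... assume cost_model has been fit on (design, measured-cost) pairs ...

design = torch.randn(6, requires_grad=True)      # joint accelerator+network knobs
task_loss = (design[:3] ** 2).sum()              # stand-in for the network loss
hw_cost = cost_model(design)                     # now differentiable w.r.t. design
(task_loss + 0.1 * hw_cost.squeeze()).backward()
print(design.grad)                               # gradients drive co-exploration
```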
arXiv Detail & Related papers (2020-09-14T07:43:27Z)
- Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.