Fast GraspNeXt: A Fast Self-Attention Neural Network Architecture for
Multi-task Learning in Computer Vision Tasks for Robotic Grasping on the Edge
- URL: http://arxiv.org/abs/2304.11196v1
- Date: Fri, 21 Apr 2023 18:07:14 GMT
- Title: Fast GraspNeXt: A Fast Self-Attention Neural Network Architecture for
Multi-task Learning in Computer Vision Tasks for Robotic Grasping on the Edge
- Authors: Alexander Wong, Yifan Wu, Saad Abbasi, Saeejith Nair, Yuhao Chen,
Mohammad Javad Shafiee
- Abstract summary: High architectural and computational complexity can result in poor suitability for deployment on embedded devices.
Fast GraspNeXt is a fast self-attention neural network architecture tailored for embedded multi-task learning in computer vision tasks for robotic grasping.
- Score: 80.88063189896718
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-task learning has shown considerable promise for improving the
performance of deep learning-driven vision systems for the purpose of robotic
grasping. However, high architectural and computational complexity can result
in poor suitability for deployment on embedded devices that are typically
leveraged in robotic arms for real-world manufacturing and warehouse
environments. As such, the design of highly efficient multi-task deep neural
network architectures tailored for computer vision tasks for robotic grasping
on the edge is highly desired for widespread adoption in manufacturing
environments. Motivated by this, we propose Fast GraspNeXt, a fast
self-attention neural network architecture tailored for embedded multi-task
learning in computer vision tasks for robotic grasping. To build Fast
GraspNeXt, we leverage a generative network architecture search strategy with a
set of architectural constraints customized to achieve a strong balance between
multi-task learning performance and embedded inference efficiency. Experimental
results on the MetaGraspNet benchmark dataset show that the Fast GraspNeXt
network design achieves the best performance across multiple computer vision
tasks (highest average precision (AP) and accuracy, lowest mean squared error
(MSE)) when compared to other efficient multi-task network architecture
designs, while having only 17.8M parameters (about 5x smaller), 259 GFLOPs (as
much as 5x lower), and running as much as 3.15x faster on an NVIDIA Jetson TX2
embedded processor.
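To make the multi-task setup in the abstract concrete, below is a minimal sketch of a shared-backbone network with separate heads for classification (evaluated by accuracy), box regression (contributing to AP), and a dense heatmap output (evaluated by MSE), trained with a weighted sum of per-task losses. All layer sizes, heads, and loss weights are illustrative assumptions, not the actual Fast GraspNeXt design, which is produced by the generative architecture search described above.

```python
# Minimal sketch of a shared-backbone multi-task network for robotic grasping,
# with one head per task (classification -> accuracy, box regression -> AP,
# dense heatmap -> MSE). All layer sizes, heads, and loss weights are
# illustrative assumptions, not the searched Fast GraspNeXt architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskGraspNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Shared backbone (stand-in for the machine-designed backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.cls_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(64, num_classes))
        self.box_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(64, 4))
        self.heatmap_head = nn.Conv2d(64, 1, kernel_size=1)

    def forward(self, x):
        feats = self.backbone(x)
        return {"cls": self.cls_head(feats),
                "box": self.box_head(feats),
                "heatmap": self.heatmap_head(feats)}

def multi_task_loss(out, tgt, w=(1.0, 1.0, 1.0)):
    """Weighted sum of per-task losses (weights are hypothetical)."""
    return (w[0] * F.cross_entropy(out["cls"], tgt["cls"])
            + w[1] * F.smooth_l1_loss(out["box"], tgt["box"])
            + w[2] * F.mse_loss(out["heatmap"], tgt["heatmap"]))

# Toy forward pass: a batch of four 224x224 RGB images yields one output per task.
outputs = MultiTaskGraspNet()(torch.randn(4, 3, 224, 224))
print({k: tuple(v.shape) for k, v in outputs.items()})
# {'cls': (4, 10), 'box': (4, 4), 'heatmap': (4, 1, 56, 56)}
```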
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the resource constraints of IoVT systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- DONNAv2 -- Lightweight Neural Architecture Search for Vision tasks [6.628409795264665]
We present the next-generation neural architecture design for computationally efficient neural architecture distillation - DONNAv2.
DONNAv2 reduces the computational cost of DONNA by 10x for the larger datasets.
To improve the quality of NAS search space, DONNAv2 leverages a block knowledge distillation filter to remove blocks with high inference costs.
arXiv Detail & Related papers (2023-09-26T04:48:50Z)
- Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts [60.1586169973792]
M$^3$ViT is the latest multi-task ViT model that introduces mixture-of-experts (MoE).
MoE achieves better accuracy with over 80% reduction in computation, but leaves challenges for efficient deployment on FPGA.
Our work, dubbed Edge-MoE, solves these challenges and introduces the first end-to-end FPGA accelerator for multi-task ViT with a collection of architectural innovations.
arXiv Detail & Related papers (2023-05-30T02:24:03Z)
- Design of Convolutional Extreme Learning Machines for Vision-Based Navigation Around Small Bodies [0.0]
Deep learning architectures such as convolutional neural networks are the standard in computer vision for image processing tasks.
Their accuracy, however, often comes at the cost of long and computationally expensive training.
A different method, known as the convolutional extreme learning machine, has shown the potential to perform equally well with a dramatic decrease in training time (a minimal sketch of this training scheme appears after this list).
arXiv Detail & Related papers (2022-10-28T16:24:21Z)
- Faster Attention Is What You Need: A Fast Self-Attention Neural Network Backbone Architecture for the Edge via Double-Condensing Attention Condensers [71.40595908386477]
We introduce a new faster attention condenser design called double-condensing attention condensers.
The resulting backbone (which we name AttendNeXt) achieves significantly higher inference throughput on an embedded ARM processor.
These promising results demonstrate that exploring different efficient architecture designs and self-attention mechanisms can lead to interesting new building blocks for TinyML applications.
arXiv Detail & Related papers (2022-08-15T02:47:33Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- ISyNet: Convolutional Neural Networks design for AI accelerator [0.0]
Current state-of-the-art architectures are found with neural architecture search (NAS) taking model complexity into account.
We propose a measure of the hardware efficiency of a neural architecture search space, the matrix efficiency measure (MEM); a search space comprising hardware-efficient operations; and a latency-aware scaling method.
We show the advantage of the designed architectures for the NPU devices on ImageNet and the generalization ability for the downstream classification and detection tasks.
arXiv Detail & Related papers (2021-09-04T20:57:05Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach to reduce the search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
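Fast GraspNeXt and several of the related papers above (DONNAv2, ISyNet, MS-RANAS) share a common pattern: candidate architectures are evaluated for task performance and kept only if they satisfy hardware budgets such as parameter count, FLOPs, or measured latency. The sketch below illustrates that selection step; the budget values and the candidate pool are made up for illustration, and this is not the generative search algorithm used by Fast GraspNeXt.

```python
# Minimal sketch of selecting candidate architectures under hardware
# constraints. The parameter, FLOP, and latency budgets are assumed values,
# and the Candidate records stand in for whatever NAS strategy produces and
# scores candidates; this is not the Fast GraspNeXt search itself.
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    params_m: float      # parameters, in millions
    gflops: float        # forward-pass compute
    latency_ms: float    # measured on the target embedded processor
    task_score: float    # e.g. an aggregate of per-task AP / accuracy metrics

def within_budget(c: Candidate,
                  max_params_m: float = 20.0,
                  max_gflops: float = 300.0,
                  max_latency_ms: float = 100.0) -> bool:
    """Hard architectural constraints (budget values are hypothetical)."""
    return (c.params_m <= max_params_m
            and c.gflops <= max_gflops
            and c.latency_ms <= max_latency_ms)

def select_best(candidates: List[Candidate]) -> Candidate:
    """Keep only candidates that satisfy every budget, then pick the one
    with the best multi-task score."""
    feasible = [c for c in candidates if within_budget(c)]
    if not feasible:
        raise ValueError("no candidate satisfies the hardware budgets")
    return max(feasible, key=lambda c: c.task_score)

if __name__ == "__main__":
    pool = [
        Candidate("cand-a", params_m=17.8, gflops=259.0, latency_ms=80.0, task_score=0.91),
        Candidate("cand-b", params_m=92.0, gflops=1300.0, latency_ms=250.0, task_score=0.93),
    ]
    print(select_best(pool).name)  # cand-b is filtered out by the budgets
```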
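The convolutional extreme learning machine mentioned in the "Design of Convolutional Extreme Learning Machines" entry trains quickly because the feature-extraction layers are left fixed (e.g. randomly initialized) and only a linear output layer is fitted, which reduces training to a single regularized least-squares solve instead of iterative backpropagation. The sketch below uses a random projection in place of fixed convolutional features, with made-up dimensions and regularization, purely to illustrate the idea.

```python
# Minimal sketch of extreme-learning-machine training: a fixed random hidden
# layer followed by a closed-form ridge-regression solve for the output
# weights. Dimensions and the regularizer are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

class ELM:
    """Fixed random hidden layer + closed-form trained linear output layer."""
    def __init__(self, in_dim: int, n_hidden: int = 256, reg: float = 1e-2):
        # Hidden weights stay fixed; they stand in for untrained conv features.
        self.w = rng.normal(size=(in_dim, n_hidden))
        self.reg = reg
        self.beta = None

    def _features(self, x: np.ndarray) -> np.ndarray:
        return np.tanh(x @ self.w)

    def fit(self, x: np.ndarray, y: np.ndarray) -> None:
        h = self._features(x)
        # Ridge regression: beta = (H^T H + reg*I)^-1 H^T Y  (one linear solve).
        self.beta = np.linalg.solve(h.T @ h + self.reg * np.eye(h.shape[1]),
                                    h.T @ y)

    def predict(self, x: np.ndarray) -> np.ndarray:
        return self._features(x) @ self.beta

# Toy usage with made-up dimensions: training is a single linear solve rather
# than an iterative gradient-descent loop, which is why it is so fast.
x_train = rng.normal(size=(100, 64))
y_train = rng.normal(size=(100, 5))
model = ELM(in_dim=64)
model.fit(x_train, y_train)
print(model.predict(x_train).shape)  # (100, 5)
```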