Related papers: Accelerating Deep Learning Applications in Space

Accelerating Deep Learning Applications in Space

URL: http://arxiv.org/abs/2007.11089v1
Date: Tue, 21 Jul 2020 21:06:30 GMT
Title: Accelerating Deep Learning Applications in Space
Authors: Martina Lofqvist, José Cano
Abstract summary: We investigate the performance of CNN-based object detectors on constrained devices. We take a closer look at the Single Shot MultiBox Detector (SSD) and Region-based Fully Convolutional Network (R-FCN). The performance is measured in terms of inference time, memory consumption, and accuracy.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Computing at the edge offers intriguing possibilities for the development of autonomy and artificial intelligence. The advancements in autonomous technologies and the resurgence of computer vision have led to a rise in demand for fast and reliable deep learning applications. In recent years, the industry has introduced devices with impressive processing power to perform various object detection tasks. However, with real-time detection, devices are constrained in memory, computational capacity, and power, which may compromise the overall performance. This could be solved either by optimizing the object detector or modifying the images. In this paper, we investigate the performance of CNN-based object detectors on constrained devices when applying different image compression techniques. We examine the capabilities of an NVIDIA Jetson Nano, a low-power, high-performance computer with an integrated GPU that is small enough to fit on board a CubeSat. We take a closer look at the Single Shot MultiBox Detector (SSD) and Region-based Fully Convolutional Network (R-FCN) that are pre-trained on DOTA, a Large Scale Dataset for Object Detection in Aerial Images. The performance is measured in terms of inference time, memory consumption, and accuracy. By applying image compression techniques, we are able to optimize performance. The two techniques applied, lossless compression and image scaling, improve speed and memory consumption with little or no change in accuracy. The image scaling technique achieves a 100% runnable dataset, and we suggest combining both techniques in order to optimize the speed/memory/accuracy trade-off.

Related papers

Zero-Shot Detection of AI-Generated Images [54.01282123570917]
We propose a zero-shot entropy-based detector (ZED) to detect AI-generated images. Inspired by recent works on machine-generated text detection, our idea is to measure how surprising the image under analysis is compared to a model of real images. ZED achieves an average improvement of more than 3% over the SoTA in terms of accuracy.
arXiv Detail & Related papers (2024-09-24T08:46:13Z)
Taming Lookup Tables for Efficient Image Retouching [30.48643578900116]
We propose ICELUT, which adopts LUTs for extremely efficient edge inference, without any convolutional neural network (CNN). ICELUT achieves near-state-of-the-art performance and remarkably low power consumption. These enable ICELUT, the first-ever purely LUT-based image enhancer, to reach an unprecedented speed of 0.4ms on GPU and 7ms on CPU, at least one order of magnitude faster than any CNN solution.
arXiv Detail & Related papers (2024-03-28T08:49:35Z)
Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM). Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
Feature Compression for Rate Constrained Object Detection on the Edge [20.18227104333772]
An emerging approach to solve this problem is to offload the computation of neural networks to computing resources at an edge server. In this work, we consider a "split computation" system to offload a part of the computation of the YOLO object detection model. We train the feature compression and decompression module together with the YOLO model to optimize the object detection accuracy under a rate constraint.
arXiv Detail & Related papers (2022-04-15T03:39:30Z)
Deep Learning for Real Time Satellite Pose Estimation on Low Power Edge TPU [58.720142291102135]
In this paper we propose a pose estimation software exploiting neural network architectures. We show how low power machine learning accelerators could enable Artificial Intelligence exploitation in space.
arXiv Detail & Related papers (2022-04-07T08:53:18Z)
FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task. The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total. It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture. It reduces computations by separating objects from background using a very lightweight first stage. The resulting image proposals are then processed in the second stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
Optimizing Data Processing in Space for Object Detection in Satellite Imagery [0.0]
We investigate the performance of CNN-based object detectors on constrained devices by applying different image compression techniques to satellite data. We take a closer look at object detection networks, including the Single Shot MultiBox Detector (SSD) and Region-based Fully Convolutional Network (R-FCN) models. The results show that by applying image compression techniques, we are able to improve the execution time and memory consumption, achieving a fully runnable dataset.
arXiv Detail & Related papers (2021-07-08T11:37:24Z)
Analysis of voxel-based 3D object detection methods efficiency for real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper. Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances. Our findings suggest that a considerable part of the computations of existing methods is focused on locations of the scene that do not contribute to successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z)
A Framework for Fast Scalable BNN Inference using Googlenet and Transfer Learning [0.0]
This thesis aims to achieve high accuracy in object detection with good real-time performance. The binarized neural network has shown high performance in various vision tasks such as image classification, object detection, and semantic segmentation. Results show that the accuracy of objects detected by the transfer learning method is higher than that of existing methods.
arXiv Detail & Related papers (2021-01-04T06:16:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.