Accelerating Deep Learning Applications in Space
- URL: http://arxiv.org/abs/2007.11089v1
- Date: Tue, 21 Jul 2020 21:06:30 GMT
- Title: Accelerating Deep Learning Applications in Space
- Authors: Martina Lofqvist, José Cano
- Abstract summary: We investigate the performance of CNN-based object detectors on constrained devices.
We take a closer look at the Single Shot MultiBox Detector (SSD) and Region-based Fully Convolutional Network (R-FCN).
The performance is measured in terms of inference time, memory consumption, and accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Computing at the edge offers intriguing possibilities for the development of
autonomy and artificial intelligence. The advancements in autonomous
technologies and the resurgence of computer vision have led to a rise in demand
for fast and reliable deep learning applications. In recent years, the industry
has introduced devices with impressive processing power to perform various
object detection tasks. However, with real-time detection, devices are
constrained in memory, computational capacity, and power, which may compromise
the overall performance. This could be solved either by optimizing the object
detector or modifying the images. In this paper, we investigate the performance
of CNN-based object detectors on constrained devices when applying different
image compression techniques. We examine the capabilities of an NVIDIA Jetson
Nano, a low-power, high-performance computer with an integrated GPU, small
enough to fit on-board a CubeSat. We take a closer look at the Single Shot
MultiBox Detector (SSD) and Region-based Fully Convolutional Network (R-FCN)
that are pre-trained on DOTA - a Large Scale Dataset for Object Detection in
Aerial Images. The performance is measured in terms of inference time, memory
consumption, and accuracy. By applying image compression techniques, we are
able to optimize performance. The two techniques applied, lossless compression
and image scaling, improve speed and memory consumption with little or no
change in accuracy. The image scaling technique achieves a 100% runnable
dataset and we suggest combining both techniques in order to optimize the
speed/memory/accuracy trade-off.
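The two preprocessing techniques named in the abstract, lossless compression and image scaling, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' pipeline: the synthetic gradient image, the nearest-neighbour `downscale` helper, the use of `zlib` as the lossless codec, and the `measure` helper, which records wall-clock time and Python-heap peak memory as stand-ins for the inference time and memory consumption measured in the paper.

```python
import time
import tracemalloc
import zlib


def downscale(pixels, factor):
    """Nearest-neighbour image scaling: keep every `factor`-th row and column."""
    return [row[::factor] for row in pixels[::factor]]


def lossless_compress(pixels):
    """Losslessly pack a grayscale image; returns (compressed_bytes, raw_size)."""
    raw = bytes(value for row in pixels for value in row)
    packed = zlib.compress(raw, 9)
    # Lossless means the round trip reproduces the input exactly.
    assert zlib.decompress(packed) == raw
    return packed, len(raw)


def measure(fn, *args):
    """Return (result, elapsed_seconds, peak_bytes) for one call of fn.

    Hypothetical helper: it tracks Python heap allocations only, not GPU memory.
    """
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


# Synthetic 64x64 grayscale gradient standing in for an aerial image tile.
image = [[(x + y) % 256 for x in range(64)] for y in range(64)]

small, scale_time, scale_peak = measure(downscale, image, 2)
(packed, raw_size), pack_time, pack_peak = measure(lossless_compress, image)

print(f"scaled: {len(image)}x{len(image[0])} -> {len(small)}x{len(small[0])}")
print(f"lossless: {raw_size} -> {len(packed)} bytes")
```

Scaling discards pixels and so can affect detection accuracy, while the lossless path preserves every pixel and only reduces storage size; the sketch makes no claim about how either choice affects a real detector.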
Related papers
- Zero-Shot Detection of AI-Generated Images [54.01282123570917]
We propose a zero-shot entropy-based detector (ZED) to detect AI-generated images.
Inspired by recent works on machine-generated text detection, our idea is to measure how surprising the image under analysis is compared to a model of real images.
ZED achieves an average improvement of more than 3% over the SoTA in terms of accuracy.
arXiv Detail & Related papers (2024-09-24T08:46:13Z)
- Taming Lookup Tables for Efficient Image Retouching [30.48643578900116]
We propose ICELUT, which adopts LUTs for extremely efficient edge inference, without any convolutional neural network (CNN).
ICELUT achieves near-state-of-the-art performance and remarkably low power consumption.
These enable ICELUT, the first-ever purely LUT-based image enhancer, to reach an unprecedented speed of 0.4ms on GPU and 7ms on CPU, at least one order faster than any CNN solution.
arXiv Detail & Related papers (2024-03-28T08:49:35Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Feature Compression for Rate Constrained Object Detection on the Edge [20.18227104333772]
An emerging approach to solve this problem is to offload the computation of neural networks to computing resources at an edge server.
In this work, we consider a "split computation" system to offload a part of the computation of the YOLO object detection model.
We train the feature compression and decompression module together with the YOLO model to optimize the object detection accuracy under a rate constraint.
arXiv Detail & Related papers (2022-04-15T03:39:30Z)
- Deep Learning for Real Time Satellite Pose Estimation on Low Power Edge TPU [58.720142291102135]
In this paper we propose a pose estimation software exploiting neural network architectures.
We show how low power machine learning accelerators could enable Artificial Intelligence exploitation in space.
arXiv Detail & Related papers (2022-04-07T08:53:18Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computations by separating objects from background using a very light first stage.
The resulting image proposals are then processed in the second stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
- Optimizing Data Processing in Space for Object Detection in Satellite Imagery [0.0]
We investigate the performance of CNN-based object detectors on constrained devices by applying different image compression techniques to satellite data.
We take a closer look at object detection networks, including the Single Shot MultiBox Detector (SSD) and Region-based Fully Convolutional Network (R-FCN) models.
The results show that by applying image compression techniques, we are able to improve the execution time and memory consumption, achieving a fully runnable dataset.
arXiv Detail & Related papers (2021-07-08T11:37:24Z)
- Analysis of voxel-based 3D object detection methods efficiency for real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper.
Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances.
Our findings suggest that a considerable part of the computations of existing methods is focused on locations of the scene that do not contribute to successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z)
- A Framework for Fast Scalable BNN Inference using Googlenet and Transfer Learning [0.0]
This thesis aims to achieve high accuracy in object detection with good real-time performance.
The binarized neural network has shown high performance in various vision tasks such as image classification, object detection, and semantic segmentation.
Results show that the accuracy of objects detected by the transfer learning method is higher than that of existing methods.
arXiv Detail & Related papers (2021-01-04T06:16:52Z)