An FPGA smart camera implementation of segmentation models for drone wildfire imagery
- URL: http://arxiv.org/abs/2309.01318v1
- Date: Mon, 4 Sep 2023 02:30:14 GMT
- Title: An FPGA smart camera implementation of segmentation models for drone wildfire imagery
- Authors: Eduardo Guarduño-Martinez and Jorge Ciprian-Sanchez and Gerardo Valente and Vazquez-Garcia and Gerardo Rodriguez-Hernandez and Adriana Palacios-Rosas and Lucile Rossi-Tisson and Gilberto Ochoa-Ruiz
- Abstract summary: Wildfires represent one of the most relevant natural disasters worldwide, due to their impact on various societal and environmental levels.
One of the most promising approaches for wildfire fighting is the use of drones equipped with visible and infrared cameras for detection, monitoring, and fire-spread assessment carried out remotely but in close proximity to the affected areas.
In this work, we posit that smart cameras based on low-power consumption field-programmable gate arrays (FPGAs) and binarized neural networks (BNNs) represent a cost-effective alternative for implementing onboard computing on the edge.
- Score: 0.9837190842240352
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Wildfires represent one of the most relevant natural disasters worldwide, due
to their impact on various societal and environmental levels. Thus, a
significant amount of research has been carried out to investigate and apply
computer vision techniques to address this problem. One of the most promising
approaches for wildfire fighting is the use of drones equipped with visible and
infrared cameras for detection, monitoring, and fire-spread assessment carried
out remotely but in close proximity to the affected areas. However,
implementing effective computer vision algorithms on board is often prohibitive
since deploying full-precision deep learning models running on GPU is not a
viable option, due to their high power consumption and the limited payload a
drone can handle. Thus, in this work, we posit that smart cameras, based on
low-power consumption field-programmable gate arrays (FPGAs), in tandem with
binarized neural networks (BNNs), represent a cost-effective alternative for
implementing onboard computing on the edge. Herein we present the
implementation of a segmentation model applied to the Corsican Fire Database.
We optimized an existing U-Net model for such a task and ported the model to an
edge device (a Xilinx Ultra96-v2 FPGA). By pruning and quantizing the original
model, we reduce the number of parameters by 90%. Furthermore, additional
optimizations enabled us to increase the throughput of the original model from
8 frames per second (FPS) to 33.63 FPS without loss in the segmentation
performance: our model obtained 0.912 in Matthews correlation coefficient
(MCC), 0.915 in F1 score, and 0.870 in Hafiane quality index (HAF), and
comparable qualitative segmentation results when contrasted to the original
full-precision model. The final model was integrated into a low-cost FPGA,
which was used to implement a neural network accelerator.
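For context, the segmentation metrics reported above (MCC and F1) are both derived from the per-pixel confusion counts of a binary fire mask against ground truth. A minimal sketch of how they are computed (not the authors' code; the Hafiane index is omitted, and masks are assumed to be flat 0/1 sequences):

```python
import math

def confusion_counts(pred, truth):
    """Return (TP, TN, FP, FN) for two equal-length 0/1 mask sequences."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f1(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
```

Unlike F1, MCC also rewards correctly predicted background pixels (TN), which makes it a stricter summary for masks where fire pixels are rare.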
Related papers
- Designing a Classifier for Active Fire Detection from Multispectral Satellite Imagery Using Neural Architecture Search [0.0]
This paper showcases the use of a reinforcement learning-based Neural Architecture Search (NAS) agent to design a small neural network to perform active fire detection on multispectral satellite imagery.
Specifically, we aim to design a neural network that can determine if a single multispectral pixel is a part of a fire, and do so within the constraints of a Low Earth Orbit (LEO) nanosatellite with a limited power budget.
arXiv Detail & Related papers (2024-10-07T18:43:43Z)
- Accurate, Low-latency, Efficient SAR Automatic Target Recognition on FPGA [3.251765107970636]
Synthetic aperture radar (SAR) automatic target recognition (ATR) is the key technique for remote-sensing image recognition.
The state-of-the-art convolutional neural networks (CNNs) for SAR ATR suffer from high computation cost and large memory footprint.
We propose a comprehensive GNN-based model-architecture co-design on FPGA to address the above issues.
arXiv Detail & Related papers (2023-01-04T05:35:30Z)
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
- Rethinking Deconvolution for 2D Human Pose Estimation: Light yet Accurate Model for Real-time Edge Computing [0.0]
This system was found to be very accurate, achieving 94.5% of the accuracy of the SOTA HRNet at 256x192 input resolution.
Our model adopts an encoder-decoder architecture and is carefully downsized to improve its efficiency.
arXiv Detail & Related papers (2021-11-08T01:44:46Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
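The roofline model mentioned above is a standard first-order estimate: a layer's execution time is bounded by whichever of its compute phase or its memory-traffic phase dominates. A minimal sketch (illustrative hardware numbers, not values from the paper):

```python
def roofline_time(flops, bytes_moved, peak_flops, bandwidth):
    """First-order roofline estimate of a layer's execution time (seconds).

    flops       -- floating-point operations the layer performs
    bytes_moved -- bytes read/written to off-chip memory
    peak_flops  -- hardware peak compute throughput (FLOP/s)
    bandwidth   -- hardware memory bandwidth (bytes/s)

    The layer is compute-bound if flops/peak_flops dominates,
    memory-bound if bytes_moved/bandwidth dominates.
    """
    return max(flops / peak_flops, bytes_moved / bandwidth)
```

Refined variants (as compared in the paper) tighten these bounds with measured rather than peak hardware rates.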
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- EfficientPose: Efficient Human Pose Estimation with Neural Architecture Search [47.30243595690131]
We propose an efficient framework targeted at human pose estimation including two parts, the efficient backbone and the efficient head.
Our smallest model has only 0.65 GFLOPs with 88.1% PCKh@0.5 on MPII and our large model has only 2 GFLOPs while its accuracy is competitive with the state-of-the-art large model.
arXiv Detail & Related papers (2020-12-13T15:38:38Z)
- KutralNet: A Portable Deep Learning Model for Fire Recognition [4.886882441164088]
We propose a new deep learning architecture that requires fewer floating-point operations (flops) for fire recognition.
We also propose a portable approach for fire recognition and the use of modern techniques to reduce the model's computational cost.
One of our models presents 71% fewer parameters than FireNet, while still presenting competitive accuracy and AUROC performance.
arXiv Detail & Related papers (2020-08-16T09:35:25Z)
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD)
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
- Highly Efficient Salient Object Detection with 100K Parameters [137.74898755102387]
We propose a flexible convolutional module, namely generalized OctConv (gOctConv), to efficiently utilize both in-stage and cross-stages multi-scale features.
We build an extremely light-weighted model, namely CSNet, which achieves comparable performance with about 0.2% of the parameters (100k) of large models on popular object detection benchmarks.
arXiv Detail & Related papers (2020-03-12T07:00:46Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.