Q-Segment: Segmenting Images In-Sensor for Vessel-Based Medical
Diagnosis
- URL: http://arxiv.org/abs/2312.09854v3
- Date: Mon, 4 Mar 2024 15:21:18 GMT
- Title: Q-Segment: Segmenting Images In-Sensor for Vessel-Based Medical
Diagnosis
- Authors: Pietro Bonazzi, Yawei Li, Sizhen Bian, Michele Magno
- Abstract summary: We present "Q-Segment", a quantized real-time segmentation algorithm, and conduct a comprehensive evaluation on a low-power edge vision platform with the Sony IMX500.
Q-Segment achieves an ultra-low in-sensor inference time of only 0.23 ms and a power consumption of only 72 mW.
This research contributes valuable insights into edge-based image segmentation, laying the foundation for efficient algorithms tailored to low-power environments.
- Score: 13.018482089796159
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the growing interest in deploying deep learning models
directly in-sensor. We present "Q-Segment", a quantized real-time segmentation
algorithm, and conduct a comprehensive evaluation on a low-power edge vision
platform with an in-sensor processor, the Sony IMX500. One of the main goals
of the model is to achieve end-to-end image segmentation for vessel-based
medical diagnosis. Deployed on the IMX500 platform, Q-Segment achieves
an ultra-low in-sensor inference time of only 0.23 ms and a power consumption of only
72 mW. We compare the proposed network with state-of-the-art models, both float
and quantized, demonstrating that the proposed solution outperforms existing
networks on various platforms in computing efficiency, e.g., by a factor of 75x
compared to ERFNet. The network employs an encoder-decoder structure with skip
connections, and results in a binary accuracy of 97.25% and an Area Under the
Receiver Operating Characteristic Curve (AUC) of 96.97% on the CHASE dataset.
We also present a comparison of the IMX500 processing core with the Sony
Spresense, a low-power multi-core ARM Cortex-M microcontroller, and a
single-core ARM Cortex-M4 showing that it can achieve in-sensor processing with
end-to-end low latency (17 ms) and power consumption (254 mW). This research
contributes valuable insights into edge-based image segmentation, laying the
foundation for efficient algorithms tailored to low-power environments.
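
The efficiency of Q-Segment rests on quantization: representing weights and activations as low-bit integers instead of 32-bit floats so the in-sensor processor can run integer arithmetic. As an illustrative sketch only (symmetric uniform 8-bit quantization, not necessarily the authors' exact scheme):

```python
def quantize_int8(values):
    """Symmetric uniform quantization: map floats to signed 8-bit integers.

    Returns (quantized values, scale) such that value ~= q * scale.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [x * scale for x in q]

# Example: quantize a small set of weights, then reconstruct them.
weights = [0.5, -1.27, 0.003, 1.27]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
```

Values near zero (here 0.003) collapse to the same integer code, which is the accuracy/efficiency trade-off the paper's evaluation quantifies against float baselines.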
Related papers
- LHU-Net: A Light Hybrid U-Net for Cost-Efficient, High-Performance Volumetric Medical Image Segmentation [4.168081528698768]
We introduce LHU-Net, a streamlined Hybrid U-Net for medical image segmentation.
Tested on five benchmark datasets, LHU-Net demonstrated superior efficiency and accuracy.
arXiv Detail & Related papers (2024-04-07T22:58:18Z) - Gesture Recognition for FMCW Radar on the Edge [0.0]
We show that gestures can be characterized efficiently by a set of five features.
A recurrent neural network (RNN) based architecture exploits these features to jointly detect and classify five different gestures.
The proposed system recognizes gestures with an F1 score of 98.4% on our hold-out test dataset.
arXiv Detail & Related papers (2023-10-13T06:03:07Z) - Implementation of a perception system for autonomous vehicles using a
detection-segmentation network in SoC FPGA [0.0]
We have used the MultiTaskV3 detection-segmentation network as the basis for a perception system that can perform both functionalities within a single architecture.
The whole system consumes relatively little power compared to a CPU-based implementation.
It also achieves more than 97% mAP for object detection and above 90% mIoU for image segmentation.
arXiv Detail & Related papers (2023-07-17T17:44:18Z) - Ultra-low Power Deep Learning-based Monocular Relative Localization
Onboard Nano-quadrotors [64.68349896377629]
This work presents a novel autonomous end-to-end system that addresses the monocular relative localization, through deep neural networks (DNNs), of two peer nano-drones.
To cope with the ultra-constrained nano-drone platform, we propose a vertically-integrated framework, including dataset augmentation, quantization, and system optimizations.
Experimental results show that our DNN can precisely localize a 10cm-size target nano-drone by employing only low-resolution monochrome images, up to 2m distance.
arXiv Detail & Related papers (2023-03-03T14:14:08Z) - Attention-based Feature Compression for CNN Inference Offloading in Edge
Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z) - EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for
Mobile Vision Applications [68.35683849098105]
We introduce split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
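
The SDTA encoder's first step is splitting the input tensor's channels into groups that can be processed separately. A minimal sketch of that channel-group split (the group processing itself is omitted; this is an illustration, not EdgeNeXt's implementation):

```python
def split_channel_groups(tensor, num_groups):
    """Split a channels-first tensor (a list of per-channel feature maps)
    into num_groups contiguous channel groups."""
    channels = len(tensor)
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    size = channels // num_groups
    return [tensor[i * size:(i + 1) * size] for i in range(num_groups)]

# Example: 8 channels split into 4 groups of 2 channels each.
groups = split_channel_groups(list(range(8)), 4)
```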
arXiv Detail & Related papers (2022-06-21T17:59:56Z) - Global Context Vision Transformers [78.5346173956383]
We propose global context vision transformer (GC ViT), a novel architecture that enhances parameter and compute utilization for computer vision.
We address the lack of inductive bias in ViTs and propose to leverage modified fused inverted residual blocks in our architecture.
Our proposed GC ViT achieves state-of-the-art results across image classification, object detection and semantic segmentation tasks.
arXiv Detail & Related papers (2022-06-20T18:42:44Z) - Rethinking BiSeNet For Real-time Semantic Segmentation [6.622485130017622]
BiSeNet has proven to be a popular two-stream network for real-time segmentation.
We propose a novel structure named Short-Term Dense Concatenate network (STDC) by removing structure redundancy.
arXiv Detail & Related papers (2021-04-27T13:49:47Z) - Real-time Semantic Segmentation with Fast Attention [94.88466483540692]
We propose a novel architecture for semantic segmentation of high-resolution images and videos in real-time.
The proposed architecture relies on our fast spatial attention, which is a simple yet efficient modification of the popular self-attention mechanism.
Results on multiple datasets demonstrate superior performance, with better accuracy and speed than existing approaches.
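
The abstract does not spell out the fast spatial attention mechanism; one well-known way such attention speedups work, shown here purely as an illustrative assumption and not as the paper's exact method, is reassociating the attention matrix products. For sequence length n and feature dimension d, (QKᵀ)V costs O(n²d), while the mathematically equivalent Q(KᵀV) costs O(nd²), which is linear in n:

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    m, p = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(m)) for j in range(p)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Quadratic form: (Q K^T) V builds an n x n matrix first -- O(n^2 d).
def attention_quadratic(Q, K, V):
    return matmul(matmul(Q, transpose(K)), V)

# Reassociated form: Q (K^T V) builds a d x d matrix first -- O(n d^2).
def attention_linear(Q, K, V):
    return matmul(Q, matmul(transpose(K), V))
```

The softmax normalization of real self-attention is omitted here; handling it is exactly what distinguishes concrete fast-attention designs from this bare associativity trick.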
arXiv Detail & Related papers (2020-07-07T22:37:16Z) - AnalogNet: Convolutional Neural Network Inference on Analog Focal Plane
Sensor Processors [0.0]
We present a high-speed, energy-efficient Convolutional Neural Network (CNN) architecture utilising the capabilities of a unique class of devices known as analog Focal Plane Sensor Processors (FPSPs).
Unlike traditional vision systems, where the sensor array sends collected data to a separate processor for processing, FPSPs allow data to be processed on the imaging device itself.
Our proposed architecture, coined AnalogNet, reaches a testing accuracy of 96.9% on the MNIST handwritten digits recognition task, at a speed of 2260 FPS, for a cost of 0.7 mJ per frame.
arXiv Detail & Related papers (2020-06-02T16:44:43Z) - Near-chip Dynamic Vision Filtering for Low-Bandwidth Pedestrian
Detection [99.94079901071163]
This paper presents a novel end-to-end system for pedestrian detection using Dynamic Vision Sensors (DVSs).
We target applications where multiple sensors transmit data to a local processing unit, which executes a detection algorithm.
Our detector is able to perform a detection every 450 ms, with an overall testing F1 score of 83%.
arXiv Detail & Related papers (2020-04-03T17:36:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.