HiRISE: High-Resolution Image Scaling for Edge ML via In-Sensor Compression and Selective ROI
- URL: http://arxiv.org/abs/2408.03956v1
- Date: Tue, 23 Jul 2024 16:26:05 GMT
- Title: HiRISE: High-Resolution Image Scaling for Edge ML via In-Sensor Compression and Selective ROI
- Authors: Brendan Reidy, Sepehr Tabrizchi, Mohamadreza Mohammadi, Shaahin Angizi, Arman Roohi, Ramtin Zand
- Abstract summary: We propose a high-resolution image scaling system for edge machine learning (ML) called HiRISE.
Our methodology achieves up to 17.7x reduction in data transfer and energy consumption.
- Score: 1.3757956340051605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of tiny IoT devices powered by machine learning (ML), many researchers have directed their focus toward compressing models to fit on tiny edge devices. Recent works have achieved remarkable success in compressing ML models for object detection and image classification on microcontrollers with small memory, e.g., 512kB SRAM. However, there remain many challenges prohibiting the deployment of ML systems that require high-resolution images. Due to fundamental limits in memory capacity for tiny IoT devices, it may be physically impossible to store large images without external hardware. To this end, we propose a high-resolution image scaling system for edge ML, called HiRISE, which is equipped with selective region-of-interest (ROI) capability leveraging analog in-sensor image scaling. Our methodology not only significantly reduces the peak memory requirements, but also achieves up to 17.7x reduction in data transfer and energy consumption.
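The memory argument in the abstract can be made concrete with a small back-of-the-envelope sketch. The sensor resolution, downscale factor, and ROI sizes below are hypothetical values chosen for illustration and are not taken from the paper; only the 512 kB SRAM budget comes from the abstract. The sketch compares the bytes needed for one uncompressed full-resolution frame with the bytes needed for a downscaled frame plus a few full-resolution ROI crops.

```python
# Illustrative arithmetic only. All resolutions, the scale factor, and the
# ROI sizes are hypothetical assumptions, not figures from the HiRISE paper.

SRAM_BYTES = 512 * 1024  # 512 kB SRAM budget mentioned in the abstract


def frame_bytes(width: int, height: int, channels: int = 3) -> int:
    """Size in bytes of an uncompressed frame with 8 bits per channel."""
    return width * height * channels


def roi_pipeline_bytes(full_res, scale, rois, channels=3):
    """Bytes for one downscaled frame plus full-resolution ROI crops."""
    width, height = full_res
    low_res = frame_bytes(width // scale, height // scale, channels)
    crops = sum(frame_bytes(w, h, channels) for w, h in rois)
    return low_res + crops


full = (2560, 1920)                      # hypothetical sensor resolution
baseline = frame_bytes(*full)            # whole frame sent off-sensor
proposed = roi_pipeline_bytes(full, scale=8, rois=[(112, 112), (112, 112)])

print(f"full frame:        {baseline} bytes, fits in SRAM: {baseline <= SRAM_BYTES}")
print(f"low-res + 2 ROIs:  {proposed} bytes, fits in SRAM: {proposed <= SRAM_BYTES}")
```

With these assumed numbers the full frame exceeds the SRAM budget while the downscaled frame plus crops fits. The resulting ratio depends entirely on the chosen scale factor and ROI sizes, so it should not be read as a reproduction of the 17.7x figure reported in the abstract.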
Related papers
- LiVOS: Light Video Object Segmentation with Gated Linear Matching [116.58237547253935]
LiVOS is a lightweight memory network that employs linear matching via linear attention.
For longer and higher-resolution videos, it matched STM-based methods with 53% less GPU memory and supports 4096p inference on a 32G consumer-grade GPU.
arXiv Detail & Related papers (2024-11-05T05:36:17Z)
- Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference [7.624476059109304]
Mondrian is an edge system that enables high-performance object detection on high-resolution video streams.
We devise a novel Compressive Packed Inference to minimize per-pixel processing costs.
arXiv Detail & Related papers (2024-03-12T12:35:12Z)
- MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory [76.02294791513552]
We propose a hardware-algorithm co-optimization method called MCUFormer to deploy vision transformers on microcontrollers with extremely limited memory.
Experimental results demonstrate that our MCUFormer achieves 73.62% top-1 accuracy on ImageNet for image classification with 320KB memory.
arXiv Detail & Related papers (2023-10-25T18:00:26Z)
- Memory-Constrained Semantic Segmentation for Ultra-High Resolution UAV Imagery [35.96063342025938]
This paper explores the intricate problem of achieving efficient and effective segmentation of ultra-high resolution UAV imagery.
We propose a GPU memory-efficient and effective framework for local inference without accessing the context beyond local patches.
We present an efficient memory-based interaction scheme to correct potential semantic bias of the underlying high-resolution information.
arXiv Detail & Related papers (2023-10-07T07:44:59Z)
- MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning [114.66037224769005]
We present a novel MicroISP model designed specifically for edge devices.
The proposed solution is capable of processing up to 32MP photos on recent smartphones using the standard mobile ML libraries.
The architecture of the model is flexible, allowing its complexity to be adjusted to devices of different computational power.
arXiv Detail & Related papers (2022-11-08T17:40:50Z)
- Iterative Patch Selection for High-Resolution Image Recognition [10.847032625429717]
We propose a simple method, Iterative Patch Selection (IPS), which decouples the memory usage from the input size.
IPS achieves this by selecting only the most salient patches, which are then aggregated into a global representation for image recognition.
Our method demonstrates strong performance and has wide applicability across different domains, training regimes and image sizes while using minimal accelerator memory.
arXiv Detail & Related papers (2022-10-24T07:55:57Z)
- Memory-Oriented Design-Space Exploration of Edge-AI Hardware for XR Applications [5.529817156718514]
Low-Power Edge-AI capabilities are essential for on-device extended reality (XR) applications to support the vision of Metaverse.
In this work, we investigate two representative XR workloads: (i) Hand detection and (ii) Eye segmentation, for hardware design space exploration.
For both applications, we train deep neural networks and analyze the impact of quantization and hardware specific bottlenecks.
The impact of integrating state-of-the-art emerging non-volatile memory technology (STT/SOT/VGSOT MRAM) into the XR-AI inference pipeline is evaluated.
arXiv Detail & Related papers (2022-06-08T11:18:02Z)
- A TinyML Platform for On-Device Continual Learning with Quantized Latent Replays [66.62377866022221]
Latent Replay-based Continual Learning (CL) techniques enable online, serverless adaptation in principle.
We introduce a HW/SW platform for end-to-end CL based on a 10-core FP32-enabled parallel ultra-low-power processor.
Our results show that by combining these techniques, continual learning can be achieved in practice using less than 64MB of memory.
arXiv Detail & Related papers (2021-10-20T11:01:23Z)
- FOVEA: Foveated Image Magnification for Autonomous Navigation [53.69803081925454]
We propose an attentional approach that elastically magnifies certain regions while maintaining a small input canvas.
On the autonomous driving datasets Argoverse-HD and BDD100K, we show our proposed method boosts the detection AP over standard Faster R-CNN, with and without finetuning.
arXiv Detail & Related papers (2021-08-27T03:07:55Z)
- Asymmetric CNN for image super-resolution [102.96131810686231]
Deep convolutional neural networks (CNNs) have been widely applied for low-level vision over the past five years.
We propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB) and a high-frequency feature enhancement block (HFFEB) for image super-resolution.
Our ACNet can effectively address single image super-resolution (SISR), blind SISR and blind SISR of blind noise problems.
arXiv Detail & Related papers (2021-03-25T07:10:46Z)
- A Machine Learning Imaging Core using Separable FIR-IIR Filters [2.099922236065961]
We use a fully trainable, fixed-topology neural network to build a model that can perform a wide variety of image processing tasks.
Our proposed Machine Learning Imaging Core, dubbed MagIC, uses a silicon area of 3 mm².
Each MagIC core consumes 56 mW (215 mW max power) at 500 MHz and achieves an energy-efficient throughput of 23 TOPS/W/mm².
arXiv Detail & Related papers (2020-01-02T21:24:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.