A Machine Learning Imaging Core using Separable FIR-IIR Filters
- URL: http://arxiv.org/abs/2001.00630v1
- Date: Thu, 2 Jan 2020 21:24:26 GMT
- Title: A Machine Learning Imaging Core using Separable FIR-IIR Filters
- Authors: Masayoshi Asama, Leo F. Isikdogan, Sushma Rao, Bhavin V. Nayak, Gilad
Michael
- Abstract summary: We use a fully trainable, fixed-topology neural network to build a model that can perform a wide variety of image processing tasks.
Our proposed Machine Learning Imaging Core, dubbed MagIC, uses a silicon area of ~3 mm^2.
Each MagIC core consumes 56 mW (215 mW max power) at 500 MHz and achieves an energy-efficient throughput of 23 TOPS/W/mm^2.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose fixed-function neural network hardware that is designed to perform
pixel-to-pixel image transformations in a highly efficient way. We use a fully
trainable, fixed-topology neural network to build a model that can perform a
wide variety of image processing tasks. Our model uses compressed skip lines
and hybrid FIR-IIR blocks to reduce the latency and hardware footprint. Our
proposed Machine Learning Imaging Core, dubbed MagIC, uses a silicon area of
~3mm^2 (in TSMC 16nm), which is orders of magnitude smaller than a comparable
pixel-wise dense prediction model. MagIC requires no DDR bandwidth, no SRAM,
and practically no external memory. Each MagIC core consumes 56mW (215 mW max
power) at 500MHz and achieves an energy-efficient throughput of 23TOPS/W/mm^2.
MagIC can be used as a multi-purpose image processing block in an imaging
pipeline, approximating compute-heavy image processing applications, such as
image deblurring, denoising, and colorization, within the power and silicon
area limits of mobile devices.
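To illustrate the core idea named in the title, here is a minimal sketch of a separable hybrid FIR-IIR image filter: a short FIR pass along rows followed by a first-order recursive (IIR) pass down columns. The filter taps and feedback coefficient below are arbitrary placeholder values for illustration, not the paper's trained parameters, and this NumPy version makes no claim about the hardware implementation.

```python
# Illustrative sketch only: a separable FIR-IIR filter in the spirit of the
# paper's hybrid blocks. Taps and the IIR coefficient are assumed values.
import numpy as np

def separable_fir_iir(img, fir_taps=(0.25, 0.5, 0.25), iir_a=0.5):
    """Apply a 1-D FIR filter along rows, then a first-order IIR down columns."""
    img = np.asarray(img, dtype=np.float64)
    # FIR pass (horizontal): same-length convolution applied to each row.
    fir = np.apply_along_axis(
        lambda row: np.convolve(row, fir_taps, mode="same"), 1, img)
    # IIR pass (vertical): y[n] = (1 - a) * x[n] + a * y[n - 1], per column.
    out = np.empty_like(fir)
    out[0] = fir[0]
    for n in range(1, fir.shape[0]):
        out[n] = (1.0 - iir_a) * fir[n] + iir_a * out[n - 1]
    return out

img = np.zeros((4, 5))
img[2, 2] = 1.0  # a single bright pixel
smoothed = separable_fir_iir(img)
```

The recursive vertical pass is what lets an IIR stage spread influence over many rows with a single multiply-accumulate per pixel, which is why FIR-IIR hybrids can cut taps (and silicon) relative to a purely FIR design of equivalent reach.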
Related papers
- HiRISE: High-Resolution Image Scaling for Edge ML via In-Sensor Compression and Selective ROI [1.3757956340051605]
We propose a high-resolution image scaling system for edge machine learning (ML) called HiRISE.
Our methodology achieves up to 17.7x reduction in data transfer and energy consumption.
arXiv Detail & Related papers (2024-07-23T16:26:05Z) - Parameter-Inverted Image Pyramid Networks [49.35689698870247]
We propose a novel network architecture known as the Parameter-Inverted Image Pyramid Networks (PIIP).
Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid.
PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification.
arXiv Detail & Related papers (2024-06-06T17:59:10Z) - DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis [56.849285913695184]
Diffusion Mamba (DiM) is a sequence model for efficient high-resolution image synthesis.
DiM architecture achieves inference-time efficiency for high-resolution images.
Experiments demonstrate the effectiveness and efficiency of our DiM.
arXiv Detail & Related papers (2024-05-23T06:53:18Z) - LKFormer: Large Kernel Transformer for Infrared Image Super-Resolution [5.478440050117844]
We propose a potent Transformer model, termed Large Kernel Transformer (LKFormer), for infrared image super-resolution.
It mainly employs depth-wise convolution with large kernels to perform non-local feature modeling.
We have devised a novel feed-forward network structure called Gated-Pixel Feed-Forward Network (GPFN) to augment the LKFormer's capacity to manage the information flow within the network.
arXiv Detail & Related papers (2024-01-22T11:28:24Z) - Spatially-Adaptive Feature Modulation for Efficient Image
Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is $3\times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z) - MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning [114.66037224769005]
We present a novel MicroISP model designed specifically for edge devices.
The proposed solution is capable of processing up to 32MP photos on recent smartphones using the standard mobile ML libraries.
The architecture of the model is flexible, allowing its complexity to be adjusted to devices of different computational power.
arXiv Detail & Related papers (2022-11-08T17:40:50Z) - Toward Efficient Hyperspectral Image Processing inside Camera Pixels [1.6449390849183356]
Hyperspectral cameras generate a large amount of data due to the presence of hundreds of spectral bands.
To mitigate this problem, we propose a form of processing-in-pixel (PIP) approach.
Our PIP-optimized custom CNN layers effectively compress the input data, significantly reducing the bandwidth required to transmit the data downstream to the HSI processing unit.
arXiv Detail & Related papers (2022-03-11T01:06:02Z) - P2M: A Processing-in-Pixel-in-Memory Paradigm for Resource-Constrained
TinyML Applications [4.102356304183255]
High-resolution input images still need to be streamed between the camera and the AI processing unit, frame by frame, causing energy, bandwidth, and security bottlenecks.
We propose a novel Processing-in-Pixel-in-memory (P2M) paradigm, that customizes the pixel array by adding support for analog multi-channel, multi-bit convolution and ReLU.
Our results indicate that P2M reduces data transfer bandwidth from sensors and analog to digital conversions by 21x, and the energy-delay product (EDP) incurred in processing a MobileNetV2 model on a TinyML
arXiv Detail & Related papers (2022-03-07T04:15:29Z) - Asymmetric CNN for image super-resolution [102.96131810686231]
Deep convolutional neural networks (CNNs) have been widely applied for low-level vision over the past five years.
We propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB), and a high-frequency feature enhancement block (HFFEB) for image super-resolution.
Our ACNet can effectively address single image super-resolution (SISR), blind SISR, and blind SISR with unknown noise.
arXiv Detail & Related papers (2021-03-25T07:10:46Z) - Interleaving: Modular architectures for fault-tolerant photonic quantum
computing [50.591267188664666]
Photonic fusion-based quantum computing (FBQC) uses low-loss photonic delays.
We present a modular architecture for FBQC in which these components are combined to form "interleaving modules".
Exploiting the multiplicative power of delays, each module can add thousands of physical qubits to the computational Hilbert space.
arXiv Detail & Related papers (2021-03-15T18:00:06Z) - An Ultra Fast Low Power Convolutional Neural Network Image Sensor with
Pixel-level Computing [3.41234610095684]
This paper proposes a Processing-In-Pixel (PIP) CMOS sensor architecture, which allows convolution operations before the column readout circuit to significantly improve the image reading speed.
The computational efficiency is 4.75 TOPS/W, which is about 3.6 times higher than the state-of-the-art.
arXiv Detail & Related papers (2021-01-09T07:10:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.