Related papers: Quantized neural network for complex hologram generation

URL: http://arxiv.org/abs/2409.06711v2
Date: Thu, 31 Oct 2024 17:48:06 GMT
Title: Quantized neural network for complex hologram generation
Authors: Yutaka Endo, Minoru Oikawa, Timothy D. Wilkinson, Tomoyoshi Shimobaba, Tomoyoshi Ito
Abstract summary: Computer-generated holography (CGH) is a promising technology for augmented reality displays, such as head-mounted or head-up displays. Recent efforts to integrate neural networks into CGH have successfully accelerated computing speed. We developed a lightweight model for complex hologram generation by introducing neural network quantization.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Computer-generated holography (CGH) is a promising technology for augmented reality displays, such as head-mounted or head-up displays. However, its high computational demand makes it impractical for implementation. Recent efforts to integrate neural networks into CGH have successfully accelerated computing speed, demonstrating the potential to overcome the trade-off between computational cost and image quality. Nevertheless, deploying neural network-based CGH algorithms on computationally limited embedded systems requires more efficient models with lower computational cost, memory footprint, and power consumption. In this study, we developed a lightweight model for complex hologram generation by introducing neural network quantization. Specifically, we built a model based on tensor holography and quantized it from 32-bit floating-point precision (FP32) to 8-bit integer precision (INT8). Our performance evaluation shows that the proposed INT8 model achieves hologram quality comparable to that of the FP32 model while reducing the model size by approximately 70% and increasing the speed fourfold. Additionally, we implemented the INT8 model on a system-on-module to demonstrate its deployability on embedded platforms and high power efficiency.
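
Illustrative sketch: the abstract describes quantizing a model from FP32 to INT8. The snippet below shows one common scheme, symmetric per-tensor quantization with a single shared scale, applied to a toy weight list. The scheme and the names `quantize_int8` and `dequantize_int8` are assumptions chosen for illustration; this listing does not state which quantization method the authors used.

```python
# Symmetric per-tensor INT8 quantization (an assumed, generic scheme;
# not necessarily the method used in the paper).

def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from the integer codes."""
    return [x * scale for x in q]


weights = [0.5, -1.0, 0.25, 0.0]  # toy FP32 weights (hypothetical data)
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Storage for the weights alone: 4 bytes per FP32 value, 1 byte per INT8 value.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)
ideal_reduction = 1 - int8_bytes / fp32_bytes  # upper bound for weight storage
```

For weights alone, moving from 4 bytes to 1 byte per value caps the size reduction at 75%. The approximately 70% reported in the abstract sits below that bound; the listing does not explain the difference.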

Related papers

MobileHolo: A Lightweight Complex-Valued Deformable CNN for High-Quality Computer-Generated Hologram [0.0]
Deep learning-based methods play an important role in computer-generated holograms (CGH). Here, we design complex-valued deformable convolution for integration into the network. The method has a peak signal-to-noise ratio that is 2.04 dB, 5.31 dB, and 9.71 dB higher than that of CCNN-CGH, HoloNet, and Holo-encoder, respectively.
arXiv Detail & Related papers (2025-06-17T14:02:41Z)
Bruno: Backpropagation Running Undersampled for Novel device Optimization [37.69303106863453]
We present a bottom-up approach to train neural networks for hardware based on spiking neurons and synapses built on ferroelectric non-volatile devices (RRAM). The training algorithm is then tested on a dataset with a network composed of quantized synapses based on RRAM and ferroelectric integrate-and-fire neurons.
arXiv Detail & Related papers (2025-05-23T12:06:43Z)
Dual Precision Quantization for Efficient and Accurate Deep Neural Networks Inference [3.7687375904925484]
We propose a novel hardware-efficient quantization and inference scheme that exploits hardware advantages with minimal accuracy degradation. We develop a novel quantization algorithm, dubbed Dual Precision Quantization (DPQ), that leverages the unique structure of our scheme without introducing additional inference overhead.
arXiv Detail & Related papers (2025-05-20T17:26:12Z)
Post-Training Quantization for 3D Medical Image Segmentation: A Practical Study on Real Inference Engines [13.398758600007188]
"Fake quantization", which simulates lower-precision operations during inference, does not actually reduce model size or improve real-world speed. A post-training quantization (PTQ) framework successfully implements true 8-bit quantization on state-of-the-art (SOTA) 3D medical segmentation models.
arXiv Detail & Related papers (2025-01-28T23:29:40Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging [3.502427552446068]
Deep learning models enable real-time inference, but can be computationally demanding due to complex architectures and large matrix operations. This makes DL models ill-suited for direct implementation on field-programmable gate array (FPGA)-based camera hardware. In this work, we focus on compressing recurrent neural networks (RNNs), which are well-suited for FLI time-series data processing, to enable deployment on resource-constrained FPGA boards.
arXiv Detail & Related papers (2024-10-01T17:23:26Z)
Pruning random resistive memory for optimizing analogue AI [54.21621702814583]
AI models present unprecedented challenges to energy consumption and environmental sustainability. One promising solution is to revisit analogue computing, a technique that predates digital computing. Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning.
arXiv Detail & Related papers (2023-11-13T08:59:01Z)
An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable. Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks. We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
Low-bit Shift Network for End-to-End Spoken Language Understanding [7.851607739211987]
We propose the use of power-of-two quantization, which quantizes continuous parameters into low-bit power-of-two values. This reduces computational complexity by removing expensive multiplication operations and with the use of low-bit weights.
arXiv Detail & Related papers (2022-07-15T14:34:22Z)
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation [2.4149105714758545]
We propose a novel framework referred to as the Fixed-Point Quantizer of deep neural Networks (FxP-QNet). FxP-QNet adapts the quantization level for each data-structure of each layer based on the trade-off between the network accuracy and the low-precision requirements. Results show that FxP-QNet-quantized AlexNet, VGG-16, and ResNet-18 reduce the overall memory requirements of their full-precision counterparts by 7.16x, 10.36x, and 6.44x with less than 0.95%, 0.95%, and 1.99%
arXiv Detail & Related papers (2022-03-22T23:01:43Z)
Real-time Neural-MPC: Deep Learning Model Predictive Control for Quadrotors and Agile Robotic Platforms [59.03426963238452]
We present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline. We show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.
arXiv Detail & Related papers (2022-03-15T09:38:15Z)
FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task. The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total. It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware. The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
On the quantization of recurrent neural networks [9.549757800469196]
Quantization of neural networks can be defined as the approximation of the high-precision computation of the canonical neural network formulation. We present an integer-only quantization strategy for Long Short-Term Memory (LSTM) neural network topologies.
arXiv Detail & Related papers (2021-01-14T04:25:08Z)
Degree-Quant: Quantization-Aware Training for Graph Neural Networks [10.330195866109312]
Graph neural networks (GNNs) have demonstrated strong performance on a wide variety of tasks. Despite their promise, there exists little research exploring methods to make them more efficient at inference time. We propose an architecturally-agnostic method, Degree-Quant, to improve performance over existing quantization-aware training baselines.
arXiv Detail & Related papers (2020-08-11T20:53:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.