A Mixed Quantization Network for Computationally Efficient Mobile
Inverse Tone Mapping
- URL: http://arxiv.org/abs/2203.06504v1
- Date: Sat, 12 Mar 2022 19:40:01 GMT
- Title: A Mixed Quantization Network for Computationally Efficient Mobile
Inverse Tone Mapping
- Authors: Juan Borrego-Carazo, Mete Ozay, Frederik Laboyrie, Paul Wisbey
- Abstract summary: We propose combining efficient operations of deep neural networks with a novel mixed quantization scheme to construct a well-performing but computationally efficient mixed quantization network (MQN).
MQN provides up to a 10-fold improvement in latency and a 25-fold improvement in memory consumption.
- Score: 8.277567852741242
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recovering a high dynamic range (HDR) image from a single low dynamic range
(LDR) image, namely inverse tone mapping (ITM), is challenging due to the lack
of information in over- and under-exposed regions. Current methods focus
exclusively on training high-performing but computationally inefficient ITM
models, which in turn hinders their deployment in resource-constrained
environments with limited computing power, such as edge and mobile device
applications.
To this end, we propose combining efficient operations of deep neural
networks with a novel mixed quantization scheme to construct a well-performing
but computationally efficient mixed quantization network (MQN) which can
perform single image ITM on mobile platforms. In the ablation studies, we
explore the effect of using different attention mechanisms, quantization
schemes, and loss functions on the performance of MQN in ITM tasks. In the
comparative analyses, ITM models trained using MQN perform on par with the
state-of-the-art methods on benchmark datasets. MQN models provide up to a
10-fold improvement in latency and a 25-fold improvement in memory consumption.
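The trade-off behind MQN is that different parts of a network tolerate different precisions, so a mixed scheme quantizes each part only as aggressively as it can stand. As a rough, self-contained illustration of mixed-precision quantization (the layer names and per-layer bit widths below are assumptions for the sketch, not the authors' actual assignment):

```python
import numpy as np

def quantize(x, num_bits):
    """Uniform affine quantization of a float tensor to num_bits integers."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    scale = scale if scale > 0 else 1.0
    zero_point = np.round(qmin - x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return scale * (q.astype(np.float32) - zero_point)

# Hypothetical mixed assignment: keep sensitive layers at 8-bit,
# push tolerant layers down to 4-bit to save memory and latency.
rng = np.random.default_rng(0)
layers = {
    "first_conv": (rng.normal(size=(16, 3, 3, 3)), 8),
    "mid_conv":   (rng.normal(size=(32, 16, 3, 3)), 4),
    "last_conv":  (rng.normal(size=(3, 16, 3, 3)), 8),
}
for name, (w, bits) in layers.items():
    q, s, z = quantize(w, bits)
    err = np.abs(w - dequantize(q, s, z)).mean()
    print(f"{name}: {bits}-bit, mean abs quantization error {err:.4f}")
```

Running the sketch shows the trade-off directly: the 4-bit layer incurs roughly an order of magnitude more reconstruction error than the 8-bit layers, which is why the choice of which layers to push to low precision matters.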
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the real-time visual inference demands of IoVT systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to jointly optimize neural network architecture and edge deployment.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Up-sampling-only and Adaptive Mesh-based GNN for Simulating Physical Systems [7.384641647468888]
We develop a novel hierarchical Mesh Graph Network, namely UA-MGN, for efficient and effective mechanical simulation.
Evaluation on two synthetic datasets and one real dataset demonstrates the superiority of UA-MGN.
arXiv Detail & Related papers (2024-09-07T07:09:58Z)
- Full-Stack Optimization for CAM-Only DNN Inference [2.0837295518447934]
This paper explores the combination of algorithmic optimizations for ternary weight neural networks and associative processors.
We propose a novel compilation flow to optimize convolutions on APs by reducing their arithmetic intensity.
Our solution improves the energy efficiency of ResNet-18 inference on ImageNet by 7.5x compared to crossbar in-memory accelerators.
arXiv Detail & Related papers (2024-01-23T10:27:38Z)
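Ternary weight networks, which the paper above builds on, constrain every weight to {-a, 0, +a}, so multiply-accumulates reduce to additions and sign flips that map well onto associative processors. A minimal sketch of threshold-based ternarization (the 0.7 * mean|W| threshold is a common heuristic from the ternary-weight-network literature, an assumption here rather than this paper's method):

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Quantize weights to {-alpha, 0, +alpha} via a magnitude threshold.

    delta_factor * mean(|w|) is a common heuristic threshold from the
    ternary-weight-network literature; it is an assumption in this sketch.
    """
    delta = delta_factor * np.abs(w).mean()
    mask = np.abs(w) > delta                     # weights that stay nonzero
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0  # shared scale
    return alpha * np.sign(w) * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
wt = ternarize(w)
print("unique levels:", np.unique(wt))           # three levels: -a, 0, +a
print("sparsity:", (wt == 0).mean())             # fraction of zeroed weights
```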
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
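The MEMTL entry above describes a single shared backbone feeding several lightweight prediction heads whose outputs are ensembled. A minimal PyTorch sketch under that reading (the layer sizes, head count, and averaging as the ensemble rule are all assumptions):

```python
import torch
import torch.nn as nn

class MEMTLSketch(nn.Module):
    """Shared backbone with multiple prediction heads, ensembled by averaging.

    A guess at the structure from the abstract summary, not the authors'
    exact architecture.
    """
    def __init__(self, in_dim=32, hidden=64, out_dim=4, num_heads=3):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(
            nn.Linear(hidden, out_dim) for _ in range(num_heads)
        )

    def forward(self, x):
        z = self.backbone(x)                               # shared features
        preds = torch.stack([head(z) for head in self.heads])
        return preds.mean(dim=0)                           # head ensemble

model = MEMTLSketch()
print(model(torch.randn(2, 32)).shape)                     # torch.Size([2, 4])
```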
- MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering [66.05768870785548]
Finetuning pretrained Vision-Language Models (VLMs) has been a prevailing paradigm for achieving state-of-the-art performance in Visual Question Answering (VQA).
Current parameter-efficient tuning methods dramatically reduce the number of tunable parameters, but a significant performance gap remains compared to full finetuning.
We propose MixPHM, a redundancy-aware parameter-efficient tuning method that outperforms full finetuning in low-resource VQA.
arXiv Detail & Related papers (2023-03-02T13:28:50Z)
- BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation [116.26521375592759]
Quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation.
Extreme quantization (1-bit weight/1-bit activations) of compactly-designed backbone architectures results in severe performance degeneration.
This paper proposes a novel Quantization-Aware Training (QAT) method that effectively alleviates this performance degeneration.
arXiv Detail & Related papers (2022-07-04T13:25:49Z)
- Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed with expensive operations, while the lower-frequency part is assigned cheap operations to relieve the computational burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
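The frequency-aware network above routes image regions by their DCT content: smooth regions go to cheap operations, detailed regions to expensive ones. A toy sketch of that routing decision using SciPy's DCT (the 8x8 patch size, the 2x2 low-frequency corner, and the 0.5 threshold are illustrative assumptions):

```python
import numpy as np
from scipy.fft import dctn

def high_freq_energy(patch):
    """Fraction of DCT energy outside the low-frequency (top-left) corner."""
    c = dctn(patch, norm="ortho")
    total = np.sum(c ** 2) + 1e-12           # avoid division by zero
    low = np.sum(c[:2, :2] ** 2)             # low-frequency 2x2 corner
    return 1.0 - low / total

rng = np.random.default_rng(0)
smooth = np.full((8, 8), 0.5) + 0.01 * rng.normal(size=(8, 8))
detailed = rng.normal(size=(8, 8))

for name, patch in [("smooth", smooth), ("detailed", detailed)]:
    e = high_freq_energy(patch)
    branch = "expensive branch" if e > 0.5 else "cheap branch"
    print(f"{name}: high-frequency energy {e:.2f} -> {branch}")
```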
- Compressing LSTM Networks by Matrix Product Operators [7.395226141345625]
Long Short-Term Memory (LSTM) models are the building blocks of many state-of-the-art natural language processing (NLP) and speech enhancement (SE) algorithms.
Here we introduce the MPO decomposition, which describes the local correlation of quantum states in quantum many-body physics.
We propose a matrix product operator (MPO)-based neural network architecture to replace the LSTM model.
arXiv Detail & Related papers (2020-12-22T11:50:06Z)
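The MPO idea above replaces a large dense weight matrix (for example, an LSTM gate matrix) with a chain of small interconnected cores, cutting the parameter count multiplicatively. A minimal two-core sketch via truncated SVD (the reshape scheme and bond dimension are illustrative assumptions; a random matrix compresses poorly, whereas trained weight matrices are typically far more compressible):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256))             # dense layer: 65,536 parameters

# Reshape W[(i1 i2), (j1 j2)] -> T[(i1 j1), (i2 j2)], then split by SVD.
i1 = i2 = j1 = j2 = 16
T = W.reshape(i1, i2, j1, j2).transpose(0, 2, 1, 3).reshape(i1 * j1, i2 * j2)
U, S, Vt = np.linalg.svd(T, full_matrices=False)
rank = 8                                    # bond dimension; an assumption
core1 = (U[:, :rank] * S[:rank]).reshape(i1, j1, rank)   # shape (16, 16, 8)
core2 = Vt[:rank].reshape(rank, i2, j2)                  # shape (8, 16, 16)

print(f"dense: {W.size} params, MPO cores: {core1.size + core2.size} params "
      f"({W.size / (core1.size + core2.size):.0f}x fewer)")

# Reconstruct to measure the error introduced by the truncation.
W_hat = (np.einsum("ajr,rbk->ajbk", core1, core2)
         .transpose(0, 2, 1, 3).reshape(256, 256))
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.3f}")
```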
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.