Efficient Multitask Dense Predictor via Binarization
- URL: http://arxiv.org/abs/2405.14136v1
- Date: Thu, 23 May 2024 03:19:23 GMT
- Title: Efficient Multitask Dense Predictor via Binarization
- Authors: Yuzhang Shang, Dan Xu, Gaowen Liu, Ramana Rao Kompella, Yan Yan
- Abstract summary: We introduce network binarization to compress resource-intensive multi-task dense predictors.
We propose a Binary Multi-task Dense Predictor, Bi-MTDP, and several variants of Bi-MTDP.
One variant of Bi-MTDP outperforms full-precision (FP) multi-task dense prediction SoTAs, ARTC (CNN-based) and InvPT (ViT-based).
- Score: 19.5100813204537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task learning for dense prediction has emerged as a pivotal area in computer vision, enabling simultaneous processing of diverse yet interrelated pixel-wise prediction tasks. However, the substantial computational demands of state-of-the-art (SoTA) models often limit their widespread deployment. This paper addresses this challenge by introducing network binarization to compress resource-intensive multi-task dense predictors. Specifically, our goal is to significantly accelerate multi-task dense prediction models via Binary Neural Networks (BNNs) while maintaining, and even improving, model performance. To reach this goal, we propose a Binary Multi-task Dense Predictor, Bi-MTDP, and several variants of Bi-MTDP, in which a multi-task dense predictor is constructed from specified binarized modules. Our systematic analysis of this predictor reveals that the performance drop from binarization is primarily caused by severe information degradation. To address this issue, we introduce a deep information bottleneck layer that enforces representations for downstream tasks to satisfy a Gaussian distribution in forward propagation. Moreover, we introduce a knowledge distillation mechanism to correct the direction of information flow in backward propagation. Intriguingly, one variant of Bi-MTDP outperforms full-precision (FP) multi-task dense prediction SoTAs, ARTC (CNN-based) and InvPT (ViT-based). This result indicates that Bi-MTDP is not merely a naive trade-off between performance and efficiency, but rather benefits from the redundant information flow afforded by the multi-task architecture. Code is available at https://github.com/42Shawn/BiMTDP.
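Two ingredients of the approach are standard enough to sketch: binarized layers trained with a straight-through estimator (STE), and a variational information-bottleneck head that pushes downstream representations toward a Gaussian. The PyTorch snippet below is a minimal illustration of both ideas in their generic forms; it is not the authors' Bi-MTDP code, and the module names (`BinarizeSTE`, `BinaryConv2d`, `IBHead`) and all hyperparameters are our own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """sign() in the forward pass; straight-through estimator in the
    backward pass (gradients flow where |x| <= 1, as in standard BNNs)."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()


class BinaryConv2d(nn.Conv2d):
    """Conv layer with binarized weights and activations (illustrative)."""

    def forward(self, x):
        # Per-output-channel scaling factor, in the spirit of XNOR-Net.
        alpha = self.weight.abs().mean(dim=(1, 2, 3), keepdim=True)
        bw = BinarizeSTE.apply(self.weight) * alpha
        bx = BinarizeSTE.apply(x)
        return F.conv2d(bx, bw, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


class IBHead(nn.Module):
    """Information-bottleneck head: parameterize q(z|x) as a diagonal
    Gaussian, sample z with the reparameterization trick, and return the
    KL(q(z|x) || N(0, I)) term that pushes z toward a Gaussian."""

    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.mu = nn.Conv2d(in_dim, z_dim, kernel_size=1)
        self.logvar = nn.Conv2d(in_dim, z_dim, kernel_size=1)

    def forward(self, x):
        mu, logvar = self.mu(x), self.logvar(x)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).mean()
        return z, kl
```

A full training objective would add the per-task losses to a weighted KL term and, per the abstract, a distillation penalty against a full-precision teacher (e.g., an MSE or KL loss between student and teacher features) to correct the information flow in backward propagation.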
Related papers
- BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks.
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB); a rough sketch of the adaptive-binarizer idea follows this entry.
arXiv Detail & Related papers (2024-11-15T16:46:04Z)
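The summary names DAB but not its formulation, so the following is a purely hypothetical sketch of what a distribution-adaptive binarizer could look like: the threshold and scale track per-channel input statistics rather than being fixed.

```python
import torch


def distribution_adaptive_binarize(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical distribution-adaptive binarizer (NOT BiDense's actual
    DAB): center each channel at its mean, binarize by sign, and rescale by
    the mean absolute deviation so the output tracks the input's spread.
    x: feature map of shape (N, C, H, W)."""
    mean = x.mean(dim=(0, 2, 3), keepdim=True)                 # per-channel shift
    centered = x - mean
    scale = centered.abs().mean(dim=(0, 2, 3), keepdim=True)   # per-channel scale
    return torch.sign(centered) * scale
```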
- The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention
We propose an effective upsampling method and an efficient attention computation strategy to transfer the success of binary neural networks (BNNs) from single-prediction tasks to dense prediction tasks.
Our attention method reduces computational complexity by a factor of one hundred while retaining the original effect; a generic sketch of binarized attention follows this entry.
arXiv Detail & Related papers (2024-05-28T03:12:33Z)
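The paper's exact attention strategy is not described above; a common source of such savings in BNNs is computing attention scores on sign-binarized queries and keys, which a deployed kernel can evaluate with XNOR/popcount instead of floating-point multiplies. A hedged sketch of that generic idea:

```python
import torch


def binary_attention(q, k, v):
    """Illustrative binarized attention (a generic BNN trick, not
    necessarily this paper's method): scores use sign(q) @ sign(k)^T,
    which a deployed kernel could evaluate with XNOR + popcount rather
    than float multiplies. q, k, v: (batch, seq, dim)."""
    scores = torch.sign(q) @ torch.sign(k).transpose(-2, -1)
    attn = torch.softmax(scores / q.shape[-1] ** 0.5, dim=-1)
    return attn @ v
```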
- BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials
This work proposes the first Binary Pretrained Foundation Transformer (BiPFT) for natural language understanding (NLU) tasks.
BiPFT substantially enhances the learning capabilities of binary neural networks (BNNs).
Extensive experiments validate the effectiveness of BiPFT, which surpasses task-specific baselines by 15.4% average performance on the GLUE benchmark; a generic sketch of low-rank residual estimation follows this entry.
arXiv Detail & Related papers (2023-12-14T13:42:57Z)
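The title suggests approximating the residual between full-precision and binarized weights with a low-rank term. The snippet below shows that generic construction via a truncated SVD; the rank, the global scale, and the function name are our choices, not BiPFT's parameterization.

```python
import torch


def binarize_with_lowrank_residual(w: torch.Tensor, rank: int = 4):
    """Approximate a weight matrix as scale * sign(w) + U @ V, where the
    low-rank term absorbs the binarization residual. Generic sketch, not
    BiPFT's exact parameterization. w: (out_features, in_features)."""
    scale = w.abs().mean()
    b = scale * torch.sign(w)              # binary part with a global scale
    residual = w - b                       # what binarization throws away
    u, s, vh = torch.linalg.svd(residual, full_matrices=False)
    u_r = u[:, :rank] * s[:rank]           # fold singular values into U
    v_r = vh[:rank, :]
    return b, u_r, v_r                     # w is approximately b + u_r @ v_r
```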
- ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt [67.8934749027315]
We propose a unified framework for graph hybrid pre-training which injects the task identification and position identification into GNNs.
We also propose a novel pre-training paradigm based on a group of $k$-nearest neighbors.
arXiv Detail & Related papers (2023-10-23T12:11:13Z)
- ComPtr: Towards Diverse Bi-source Dense Prediction Tasks via A Simple yet General Complementary Transformer [91.43066633305662]
We propose a novel ComPlementary transformer, ComPtr, for diverse bi-source dense prediction tasks.
ComPtr treats different inputs equally and builds an efficient dense interaction model in the form of sequence-to-sequence on top of the transformer.
arXiv Detail & Related papers (2023-07-23T15:17:45Z)
- A Compacted Structure for Cross-domain Learning on Monocular Depth and Flow Estimation [31.671655267992683]
This paper presents a multi-task scheme that achieves mutual assistance by means of Flow to Depth (F2D), Depth to Flow (D2F), and an Exponential Moving Average (EMA); a minimal EMA sketch follows this entry.
A dual-head mechanism predicts optical flow for rigid and non-rigid motion in a divide-and-conquer manner.
Experiments on the KITTI dataset show that our multi-task scheme outperforms other multi-task schemes.
arXiv Detail & Related papers (2022-08-25T10:46:29Z)
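Of the three components above, the Exponential Moving Average is a standard technique. A minimal sketch of weight-space EMA as commonly used for a mean-teacher model follows; the decay value is a typical default, not taken from the paper.

```python
import torch


@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               decay: float = 0.999) -> None:
    """Exponential moving average of model weights:
    teacher <- decay * teacher + (1 - decay) * student."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)
```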
- Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth [83.94528876742096]
We tackle the MTL problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called the Cross-Channel Attention Module (CCAM).
In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task, AffineMix, which uses predicted depth, and a simple depth augmentation, ColorAug, which uses predicted semantics.
Finally, we validate the performance gain of the proposed method on the Cityscapes dataset, achieving state-of-the-art results for a semi-supervised joint model based on depth and semantics.
arXiv Detail & Related papers (2022-06-21T17:40:55Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale stochastic AUC maximization with deep neural networks.
Our method requires far fewer communication rounds while retaining theoretical guarantees.
Experiments on several datasets demonstrate the effectiveness of our method and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity-inducing adversarial loss for learning latent variables and thereby obtain the diversity in output predictions necessary for modeling multi-modal data; a generic sketch of a diversity objective follows this entry.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
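The summary names an adversarial diversity loss without giving its form; a common way to encourage output diversity in an ensemble is to penalize pairwise agreement between member predictions. The sketch below is that generic idea, not the paper's adversarial formulation.

```python
import torch
import torch.nn.functional as F


def diversity_loss(logits_list):
    """Negative mean pairwise KL divergence between ensemble members'
    predictive distributions; minimizing it pushes predictions apart.
    logits_list: list of (batch, num_classes) tensors, one per member."""
    total, pairs = 0.0, 0
    for i in range(len(logits_list)):
        for j in range(i + 1, len(logits_list)):
            log_p = F.log_softmax(logits_list[i], dim=-1)
            q = F.softmax(logits_list[j], dim=-1)
            total = total - F.kl_div(log_p, q, reduction="batchmean")
            pairs += 1
    return total / max(pairs, 1)
```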