Distributed Semantic Segmentation with Efficient Joint Source and Task Decoding
- URL: http://arxiv.org/abs/2407.11224v1
- Date: Mon, 15 Jul 2024 20:20:04 GMT
- Title: Distributed Semantic Segmentation with Efficient Joint Source and Task Decoding
- Authors: Danish Nazir, Timo Bartels, Jan Piewek, Thorsten Bagdonat, Tim Fingscheidt
- Abstract summary: Distributed computing with deep neural networks (DNNs) executes one part of the network on edge devices and the other part on a large-scale cloud platform.
In this work, we propose joint source and task decoding, as it allows for a smaller network size in the cloud.
We demonstrate the effectiveness of our method by achieving a distributed semantic segmentation SOTA over a wide range of bitrates on the mean intersection over union metric.
- Score: 18.8803233019656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Distributed computing in the context of deep neural networks (DNNs) implies the execution of one part of the network on edge devices and the other part typically on a large-scale cloud platform. Conventional methods propose to employ a serial concatenation of a learned image and source encoder, the latter projecting the image encoder output (bottleneck features) into a quantized representation for bitrate-efficient transmission. In the cloud, a respective source decoder reprojects the quantized representation to the original feature representation, serving as an input for the downstream task decoder performing, e.g., semantic segmentation. In this work, we propose joint source and task decoding, as it allows for a smaller network size in the cloud. This further enables the scalability of such services in large numbers without requiring extensive computational load on the cloud per channel. We demonstrate the effectiveness of our method by achieving a distributed semantic segmentation SOTA over a wide range of bitrates on the mean intersection over union metric, while using only $9.8 \%$ ... $11.59 \%$ of cloud DNN parameters used in the previous SOTA on the COCO and Cityscapes datasets.
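To make the split described in the abstract concrete, here is a minimal PyTorch-style sketch of the pipeline: an edge-side image encoder plus source encoder that quantizes bottleneck features for bitrate-efficient transmission, and a single cloud-side network that maps the received code directly to segmentation logits (joint source and task decoding) instead of first reconstructing the bottleneck features. All layer sizes, the uniform quantizer, and the module names are illustrative assumptions, not the paper's actual architecture or training setup.

```python
import torch
import torch.nn as nn


class EdgeEncoder(nn.Module):
    """Image encoder + source encoder running on the edge device."""

    def __init__(self, code_channels=16, levels=8):
        super().__init__()
        self.image_encoder = nn.Sequential(            # produces bottleneck features
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.source_encoder = nn.Conv2d(128, code_channels, 1)  # project to a compact code
        self.levels = levels

    def forward(self, x):
        z = torch.sigmoid(self.source_encoder(self.image_encoder(x)))  # bound to [0, 1]
        # uniform quantization for bitrate-efficient transmission
        # (training would need e.g. a straight-through estimator, omitted here)
        return torch.round(z * (self.levels - 1))


class JointSourceTaskDecoder(nn.Module):
    """Single cloud-side network mapping the received code directly to
    segmentation logits, instead of first reconstructing bottleneck features
    with a separate source decoder."""

    def __init__(self, code_channels=16, num_classes=19, levels=8):
        super().__init__()
        self.levels = levels
        self.decoder = nn.Sequential(
            nn.Conv2d(code_channels, 64, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, q):
        z_hat = q / (self.levels - 1)   # dequantize on the cloud
        return self.decoder(z_hat)      # per-pixel class logits


if __name__ == "__main__":
    edge, cloud = EdgeEncoder(), JointSourceTaskDecoder()
    code = edge(torch.rand(1, 3, 128, 256))  # runs on the edge device
    logits = cloud(code)                     # runs in the cloud
    print(code.shape, logits.shape)          # (1, 16, 32, 64) and (1, 19, 128, 256)
```

Because the source decoder and task decoder are merged into one network, the cloud only has to host the joint decoder per channel, which is where the parameter savings quoted in the abstract come from.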
Related papers
- DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut [62.63481844384229]
Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks.
In this paper, we use a diffusion UNet encoder as a foundation vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method.
Our work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks.
arXiv Detail & Related papers (2024-06-05T01:32:31Z)
- Task-aware Distributed Source Coding under Dynamic Bandwidth [24.498190179263837]
We propose a distributed compression framework composed of independent encoders and a joint decoder, which we call neural distributed principal component analysis (NDPCA)
NDPCA flexibly compresses data from multiple sources to any available bandwidth with a single model, reducing computing and storage overhead (a hedged sketch of this independent-encoder/joint-decoder setup appears after this list).
Experiments show that NDPCA improves the success rate of multi-view robotic arm manipulation by 9% and the accuracy of object detection tasks on satellite imagery by 14%.
arXiv Detail & Related papers (2023-05-24T19:20:59Z)
- Learning Neural Volumetric Field for Point Cloud Geometry Compression [13.691147541041804]
We propose to code the geometry of a given point cloud by learning a neural field.
We divide the entire space into small cubes and represent each non-empty cube by a neural network and an input latent code.
The network is shared among all the cubes in a single frame or across multiple frames to exploit spatial and temporal redundancy (a minimal sketch of this shared-network design appears after this list).
arXiv Detail & Related papers (2022-12-11T19:55:24Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction on the end device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- SoftPool++: An Encoder-Decoder Network for Point Cloud Completion [93.54286830844134]
We propose a novel convolutional operator for the task of point cloud completion.
The proposed operator does not require any max-pooling or voxelization operation.
We show that our approach achieves state-of-the-art performance in shape completion at low and high resolutions.
arXiv Detail & Related papers (2022-05-08T15:31:36Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation [98.05643473345474]
We propose a novel decoder, termed the dynamic neural representational decoder (NRD).
As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space and thus makes our decoder more efficient (a hedged per-location decoder sketch appears after this list).
arXiv Detail & Related papers (2021-07-30T04:50:56Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks (a small numeric sketch of this decomposition appears after this list).
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation [30.736361776703568]
Scene understanding based on LiDAR point cloud is an essential task for autonomous cars to drive safely.
Most existing methods simply stack different point attributes/modalities as image channels to increase information capacity.
We design FPS-Net, a convolutional fusion network that exploits the uniqueness and discrepancy among the projected image channels for optimal point cloud segmentation.
arXiv Detail & Related papers (2021-03-01T04:08:28Z)
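For the "Task-aware Distributed Source Coding under Dynamic Bandwidth" entry above, a hedged sketch of the independent-encoders-plus-joint-decoder idea: each source is encoded without communicating with the others, the stacked latents are projected onto a learned low-rank basis, and only as many coefficients as the current bandwidth allows reach a single joint decoder. Dimensions, modules, and the truncation rule are illustrative assumptions, not the paper's exact NDPCA design.

```python
import torch
import torch.nn as nn


class DistributedCodec(nn.Module):
    def __init__(self, in_dims=(64, 64), latent_dim=32, out_dim=10):
        super().__init__()
        # one independent encoder per source (no communication between them)
        self.encoders = nn.ModuleList([
            nn.Sequential(nn.Linear(d, latent_dim), nn.ReLU(),
                          nn.Linear(latent_dim, latent_dim))
            for d in in_dims
        ])
        total = latent_dim * len(in_dims)
        # learned linear basis standing in for principal components
        # (orthogonality is not enforced in this sketch)
        self.basis = nn.Parameter(torch.randn(total, total) / total ** 0.5)
        self.decoder = nn.Sequential(nn.Linear(total, 64), nn.ReLU(),
                                     nn.Linear(64, out_dim))

    def forward(self, sources, bandwidth):
        # each source is encoded independently, then projected onto the basis
        z = torch.cat([enc(x) for enc, x in zip(self.encoders, sources)], dim=-1)
        coeffs = z @ self.basis
        # dynamic bandwidth: keep only the first `bandwidth` coefficients
        mask = torch.zeros_like(coeffs)
        mask[..., :bandwidth] = 1.0
        z_hat = (coeffs * mask) @ self.basis.T   # joint decoder side
        return self.decoder(z_hat)


model = DistributedCodec()
x1, x2 = torch.rand(4, 64), torch.rand(4, 64)
print(model((x1, x2), bandwidth=8).shape)        # same model, any bandwidth
```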
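For the "Learning Neural Volumetric Field for Point Cloud Geometry Compression" entry, a minimal sketch of the shared-network-plus-latent-code representation it describes: space is divided into cubes, each non-empty cube stores a small latent code, and one MLP shared by all cubes maps (local coordinate, latent code) to an occupancy logit. Cube count, latent size, and the MLP layout are assumptions.

```python
import torch
import torch.nn as nn


class CubeField(nn.Module):
    def __init__(self, num_cubes, latent_dim=32, hidden=64):
        super().__init__()
        self.codes = nn.Embedding(num_cubes, latent_dim)   # one code per non-empty cube
        self.mlp = nn.Sequential(                           # shared across all cubes
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),                            # occupancy logit
        )

    def forward(self, cube_ids, local_xyz):
        # local_xyz: coordinates normalized to [0, 1]^3 inside each cube
        z = self.codes(cube_ids)
        return self.mlp(torch.cat([local_xyz, z], dim=-1))


field = CubeField(num_cubes=1000)
ids = torch.randint(0, 1000, (4096,))
xyz = torch.rand(4096, 3)
print(field(ids, xyz).shape)   # occupancy logits for 4096 query points
```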
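For the "Dynamic Neural Representational Decoders" entry, a hedged sketch of the per-location decoding idea it summarizes: each spatial position of the encoder output predicts the parameters of a tiny network, and that tiny network renders the corresponding patch of semantic-label logits from sub-pixel coordinates. Patch size, hidden width, and the tiny network's layout are assumptions, not the paper's exact NRD.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicPatchDecoder(nn.Module):
    def __init__(self, enc_channels=256, num_classes=19, patch=8, hidden=16):
        super().__init__()
        self.patch, self.hidden, self.num_classes = patch, hidden, num_classes
        # parameters of the tiny per-location network: 2 -> hidden -> classes
        n_params = (2 * hidden + hidden) + (hidden * num_classes + num_classes)
        self.to_params = nn.Conv2d(enc_channels, n_params, 1)

    def forward(self, feats):
        b, _, h, w = feats.shape
        p, hid, c = self.patch, self.hidden, self.num_classes
        theta = self.to_params(feats).permute(0, 2, 3, 1).reshape(b * h * w, -1)
        w1, b1, w2, b2 = torch.split(theta, [2 * hid, hid, hid * c, c], dim=-1)
        # sub-pixel coordinates of the p x p label patch, shared by all locations
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, p),
                                torch.linspace(-1, 1, p), indexing="ij")
        coords = torch.stack([xs, ys], dim=-1).reshape(1, p * p, 2)
        hidden = F.relu(torch.einsum("npi,nio->npo",
                                     coords.expand(b * h * w, -1, -1),
                                     w1.reshape(-1, 2, hid)) + b1[:, None])
        logits = torch.einsum("npi,nio->npo", hidden,
                              w2.reshape(-1, hid, c)) + b2[:, None]
        # reassemble per-location patches into a full-resolution logit map
        logits = logits.reshape(b, h, w, p, p, c).permute(0, 5, 1, 3, 2, 4)
        return logits.reshape(b, c, h * p, w * p)


dec = DynamicPatchDecoder()
print(dec(torch.rand(2, 256, 16, 32)).shape)   # (2, 19, 128, 256)
```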
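For the "Quantized Neural Networks via {-1, +1} Encoding Decomposition" entry, a small numeric check of the underlying identity: an M-bit unsigned quantized weight matrix can be rewritten as a weighted sum of M matrices with entries in {-1, +1} plus a constant offset, so one quantized layer splits into M cheap binary branches. This is a simplified bit-plane illustration of the general idea, not the paper's exact scheme.

```python
import torch

M = 4                                            # bits per weight (assumption)
torch.manual_seed(0)
W_q = torch.randint(0, 2 ** M, (8, 5)).float()   # uniformly quantized weights
x = torch.rand(3, 8)

# bit-plane b_i in {0, 1}, mapped to c_i = 2*b_i - 1 in {-1, +1}
branches = [(torch.floor(W_q / 2 ** i) % 2) * 2 - 1 for i in range(M)]

# W_q = sum_i 2^(i-1) * c_i + (2^M - 1)/2, so the quantized layer splits into
# M binary branches plus one constant branch
y_branches = sum(2.0 ** (i - 1) * (x @ c) for i, c in enumerate(branches))
y_branches += (2 ** M - 1) / 2 * x.sum(dim=1, keepdim=True)

print(torch.allclose(x @ W_q, y_branches))       # True: same output, binary matrices
```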
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.