Resource Constrained Semantic Segmentation for Waste Sorting
- URL: http://arxiv.org/abs/2310.19407v1
- Date: Mon, 30 Oct 2023 10:19:40 GMT
- Title: Resource Constrained Semantic Segmentation for Waste Sorting
- Authors: Elisa Cascina, Andrea Pellegrino, Lorenzo Tozzi
- Abstract summary: We propose resource-constrained semantic segmentation models for segmenting recyclable waste in industrial settings.
We perform experiments on three networks: ICNet, BiSeNet (Xception39 backbone), and ENet.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work addresses the need for efficient waste sorting strategies in Materials Recovery Facilities to minimize the environmental impact of rising waste volumes. We propose resource-constrained semantic segmentation models for segmenting recyclable waste in industrial settings. Our goal is to develop models that fit within a 10 MB memory budget, making them suitable for edge applications with limited processing capacity. We perform experiments on three networks: ICNet, BiSeNet (Xception39 backbone), and ENet. To meet the memory constraint, we apply quantization and pruning techniques to the larger networks, achieving positive results while only marginally impacting the Mean IoU metric. Furthermore, we propose a combination of Focal and Lovász loss that addresses the implicit class imbalance, yielding better performance than the Cross-entropy loss function.
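The compression step can be sketched with PyTorch's built-in utilities. Below is a minimal, hypothetical example combining unstructured L1 magnitude pruning with FX-mode post-training int8 quantization; `TinySegNet`, the 30% sparsity level, and the one-batch calibration loop are illustrative assumptions, not the authors' actual ICNet/BiSeNet/ENet pipeline.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class TinySegNet(nn.Module):
    """Toy stand-in; the paper's ICNet/BiSeNet/ENet code is not given here."""
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Conv2d(32, num_classes, 1)
        self.upsample = nn.Upsample(scale_factor=2, mode="bilinear",
                                    align_corners=False)

    def forward(self, x):
        return self.upsample(self.classifier(self.features(x)))

model = TinySegNet().eval()

# 1) Unstructured L1 magnitude pruning: zero the 30% smallest weights in
#    each conv layer (the sparsity level is an illustrative choice).
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        prune.l1_unstructured(m, name="weight", amount=0.3)
        prune.remove(m, "weight")  # bake the pruning mask into the weights

# 2) Post-training static quantization (FX graph mode) to int8 weights and
#    activations, roughly a 4x reduction over float32 storage.
example = torch.randn(1, 3, 256, 256)
prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), (example,))
with torch.no_grad():
    prepared(example)  # calibration pass; use representative images in practice
quantized = convert_fx(prepared)

torch.save(quantized.state_dict(), "tinyseg_int8.pt")  # check against 10 MB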
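Note that unstructured pruning alone leaves a dense tensor of unchanged size, so the int8 conversion is what directly shrinks the checkpoint; cashing in the pruned zeros additionally requires sparse or compressed storage.

The loss combination can likewise be sketched. The snippet below follows the standard Lovász-Softmax formulation of Berman et al. (2018) mixed with a vanilla focal term; the mixing weight `alpha` and focusing parameter `gamma` are illustrative choices, since the abstract does not report the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def lovasz_grad(gt_sorted: torch.Tensor) -> torch.Tensor:
    """Gradient of the Lovász extension of the Jaccard loss (Berman et al.)."""
    gts = gt_sorted.sum()
    intersection = gts - gt_sorted.cumsum(0)
    union = gts + (1.0 - gt_sorted).cumsum(0)
    jaccard = 1.0 - intersection / union
    if gt_sorted.numel() > 1:
        jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard

def lovasz_softmax(probs: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """probs: [P, C] softmax probabilities; labels: [P] integer class ids."""
    losses = []
    for c in range(probs.size(1)):
        fg = (labels == c).float()
        if fg.sum() == 0:  # skip classes absent from this batch
            continue
        errors = (fg - probs[:, c]).abs()
        errors_sorted, perm = torch.sort(errors, descending=True)
        losses.append(torch.dot(errors_sorted, lovasz_grad(fg[perm])))
    return torch.stack(losses).mean()

class FocalLovaszLoss(nn.Module):
    """Weighted sum of Focal and Lovász-Softmax losses for segmentation."""
    def __init__(self, gamma: float = 2.0, alpha: float = 0.5):
        super().__init__()
        self.gamma, self.alpha = gamma, alpha

    def forward(self, logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # logits: [B, C, H, W]; labels: [B, H, W]
        ce = F.cross_entropy(logits, labels, reduction="none")
        focal = ((1.0 - torch.exp(-ce)) ** self.gamma * ce).mean()
        probs = logits.softmax(dim=1)
        flat_probs = probs.permute(0, 2, 3, 1).reshape(-1, probs.size(1))
        lovasz = lovasz_softmax(flat_probs, labels.reshape(-1))
        return self.alpha * focal + (1.0 - self.alpha) * lovasz
```

Usage is the same as any segmentation criterion, e.g. `loss = FocalLovaszLoss()(model(images), masks)` with `masks` holding per-pixel integer class ids.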
Related papers
- COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes [9.970265640589966]
We introduce an efficacious segmentation network, named COSNet, that uses boundary cues along with multi-contextual information to accurately segment the objects in cluttered scenes.
Our COSNet achieves significant gains of 1.8% on the ZeroWaste-f dataset and 2.1% on the SpectralWaste dataset in terms of the mIoU metric.
arXiv Detail & Related papers (2024-10-31T17:03:38Z) - Q-VLM: Post-training Quantization for Large Vision-Language Models [73.19871905102545]
We propose a post-training quantization framework of large vision-language models (LVLMs) for efficient multi-modal inference.
We mine the cross-layer dependency that significantly influences discretization errors of the entire vision-language model, and embed this dependency into optimal quantization strategy.
Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x on the 13B LLaVA model without performance degradation.
arXiv Detail & Related papers (2024-10-10T17:02:48Z) - WasteGAN: Data Augmentation for Robotic Waste Sorting through Generative Adversarial Networks [7.775894876221921]
We introduce a data augmentation method based on a novel GAN architecture called wasteGAN.
The proposed method makes it possible to improve the performance of semantic segmentation models starting from a very limited set of labeled examples.
We then leverage the higher-quality segmentation masks predicted from models trained on the wasteGAN synthetic data to compute semantic-aware grasp poses.
arXiv Detail & Related papers (2024-09-25T15:04:21Z) - Anti-Collapse Loss for Deep Metric Learning Based on Coding Rate Metric [99.19559537966538]
Deep Metric Learning (DML) aims to learn a discriminative high-dimensional embedding space for downstream tasks like classification, clustering, and retrieval.
To maintain the structure of embedding space and avoid feature collapse, we propose a novel loss function called Anti-Collapse Loss.
Comprehensive experiments on benchmark datasets demonstrate that our proposed method outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2024-07-03T13:44:20Z) - LaCoOT: Layer Collapse through Optimal Transport [5.869633234882029]
We present an optimal transport method to reduce the depth of over-parametrized deep neural networks.
We show that minimizing this optimal transport distance enables the complete removal of intermediate layers from the network, with almost no performance loss and without requiring any finetuning.
arXiv Detail & Related papers (2024-06-13T09:03:53Z) - UniPTS: A Unified Framework for Proficient Post-Training Sparsity [67.16547529992928]
Post-training Sparsity (PTS) is a recently emerged approach that pursues efficient network sparsity using only limited data.
In this paper, we attempt to reconcile this disparity by transposing three cardinal factors that profoundly alter the performance of conventional sparsity into the context of PTS.
Our framework, termed UniPTS, is validated to be much superior to existing PTS methods across extensive benchmarks.
arXiv Detail & Related papers (2024-05-29T06:53:18Z) - Learnable Mixed-precision and Dimension Reduction Co-design for
Low-storage Activation [9.838135675969026]
Deep convolutional neural networks (CNNs) have achieved many eye-catching results.
However, deploying CNNs on resource-constrained edge devices is limited by the memory bandwidth required to transmit large intermediate data during inference.
We propose a learnable mixed-precision and dimension reduction co-design system, which separates channels into groups and allocates compression policies according to their importance.
arXiv Detail & Related papers (2022-07-16T12:53:52Z) - Efficient Micro-Structured Weight Unification and Pruning for Neural
Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially for resource limited devices.
Previous unstructured or structured weight pruning methods rarely deliver genuine inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z) - Latent-Optimized Adversarial Neural Transfer for Sarcasm Detection [50.29565896287595]
We apply transfer learning to exploit common datasets for sarcasm detection.
We propose a generalized latent optimization strategy that allows different losses to accommodate each other.
In particular, we achieve 10.02% absolute performance gain over the previous state of the art on the iSarcasm dataset.
arXiv Detail & Related papers (2021-04-19T13:07:52Z) - InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.