Dynamic Switch Layers For Unsupervised Learning
- URL: http://arxiv.org/abs/2404.04405v1
- Date: Fri, 5 Apr 2024 21:03:11 GMT
- Title: Dynamic Switch Layers For Unsupervised Learning
- Authors: Haiguang Li, Usama Pervaiz, Michał Matuszak, Robert Kamara, Gilles Roux, Trausti Thormundsson, Joseph Antognini
- Abstract summary: On-device machine learning (ODML) enables intelligent applications on resource-constrained devices.
Power consumption poses a major challenge, forcing a trade-off between model accuracy and power efficiency.
We introduce the Dynamic Switch Layer (DSL) to extend the benefits of GC layers to unsupervised learning scenarios.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: On-device machine learning (ODML) enables intelligent applications on resource-constrained devices. However, power consumption poses a major challenge, forcing a trade-off between model accuracy and power efficiency that often limits model complexity. The previously established Gated Compression (GC) layers offer a solution, enabling power efficiency without sacrificing model performance by selectively gating samples that lack signals of interest. However, their reliance on ground truth labels limits GC layers to supervised tasks. This work introduces the Dynamic Switch Layer (DSL), extending the benefits of GC layers to unsupervised learning scenarios, and maintaining power efficiency without the need for labeled data. The DSL builds upon the GC architecture, leveraging a dynamic pathway selection, and adapting model complexity in response to the innate structure of the data. We integrate the DSL into the SoundStream architecture and demonstrate that by routing up to 80% of samples through a lightweight pass we achieve a 12.3x reduction in the amount of computation performed and a 20.9x reduction in model size. This reduces the on-device inference latency by up to 26.5% and improves power efficiency by up to 21.4% without impacting model performance.
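The dynamic pathway selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the energy-based switch score, the two placeholder paths, and the names `DynamicSwitch`, `signal_energy`, and `expected_relative_cost` are all assumptions made for illustration. The sketch only shows the general idea of sending each sample to either a lightweight or a full path, and how the fraction of samples on the lightweight path affects average compute.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


def signal_energy(sample: Sequence[float]) -> float:
    """Mean squared amplitude of a non-empty sample.

    Hypothetical stand-in for a switch score; the paper does not
    specify this criterion.
    """
    return sum(v * v for v in sample) / len(sample)


def light_path(sample: Sequence[float]) -> List[float]:
    """Placeholder for a cheap encoder: returns a single summary value."""
    return [sum(sample) / len(sample)]


def full_path(sample: Sequence[float]) -> List[float]:
    """Placeholder for an expensive encoder: keeps every value."""
    return list(sample)


@dataclass
class DynamicSwitch:
    """Routes each sample to one of two paths based on a score threshold."""

    threshold: float
    light: Callable[[Sequence[float]], List[float]]
    full: Callable[[Sequence[float]], List[float]]

    def route(self, sample: Sequence[float]) -> Tuple[str, List[float]]:
        # Low-energy samples are assumed to lack signals of interest,
        # so they take the lightweight path.
        if signal_energy(sample) < self.threshold:
            return "light", self.light(sample)
        return "full", self.full(sample)


def expected_relative_cost(p_light: float, cost_light: float, cost_full: float) -> float:
    """Average per-sample cost relative to always using the full path."""
    return (p_light * cost_light + (1.0 - p_light) * cost_full) / cost_full


switch = DynamicSwitch(threshold=0.1, light=light_path, full=full_path)
quiet_route, _ = switch.route([0.0, 0.1])   # energy 0.005, below threshold
loud_route, _ = switch.route([1.0, -1.0])   # energy 1.0, above threshold
```

In this toy setting, `expected_relative_cost(0.8, 1.0, 10.0)` shows how routing 80% of samples through a path that costs one tenth as much lowers the average cost; the cost values are made up and are not the figures reported in the abstract.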
Related papers
- Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching [56.286064975443026]
We make an interesting and somewhat surprising observation: the computation of a large proportion of layers in the diffusion transformer can, through a caching mechanism, be readily removed even without updating the model parameters.
We introduce a novel scheme, named Learning-to-Cache (L2C), that learns to conduct caching in a dynamic manner for diffusion transformers.
Experimental results show that L2C largely outperforms samplers such as DDIM and DPM-Solver, alongside prior cache-based methods at the same inference speed.
arXiv Detail & Related papers (2024-06-03T18:49:57Z)
- A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies [51.7643024367548]
Stable Diffusion Model is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation.
This study focuses on reducing redundant computation in SDM and optimizing the model through both tuning and tuning-free methods.
arXiv Detail & Related papers (2024-05-31T21:47:05Z)
- Enhancing User Experience in On-Device Machine Learning with Gated Compression Layers [0.0]
On-device machine learning (ODML) enables powerful edge applications, but power consumption remains a key challenge for resource-constrained devices.
This work focuses on the use of Gated Compression (GC) layers to enhance ODML model performance while conserving power.
GC layers dynamically regulate data flow by selectively gating activations of neurons within the neural network and effectively filtering out non-essential inputs.
arXiv Detail & Related papers (2024-05-02T21:18:06Z)
- LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights [2.8461446020965435]
We introduce LD-Pruner, a novel performance-preserving structured pruning method for compressing Latent Diffusion Models.
We demonstrate the effectiveness of our approach on three different tasks: text-to-image (T2I) generation, Unconditional Image Generation (UIG), and Unconditional Audio Generation (UAG).
arXiv Detail & Related papers (2024-04-18T06:35:37Z)
- Radial Networks: Dynamic Layer Routing for High-Performance Large Language Models [9.637088945386227]
Large language models (LLMs) often struggle with strict memory, latency, and power demands.
Various forms of dynamic sparsity have been proposed that reduce compute on an input-by-input basis.
We propose Radial Networks, which perform token-level routing between layers guided by a trained router module.
arXiv Detail & Related papers (2024-04-07T09:52:31Z)
- Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search [55.41583104734349]
We propose to automatically remove structural redundancy in diffusion models with Diffusion Distillation-based Block-wise Neural Architecture Search (DiffNAS).
Given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher.
Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss.
arXiv Detail & Related papers (2023-11-08T12:56:59Z)
- Federated Learning for Energy-limited Wireless Networks: A Partial Model Aggregation Approach [79.59560136273917]
Limited communication resources (bandwidth and energy) and data heterogeneity across devices are the main bottlenecks for federated learning (FL).
We first devise a novel FL framework with partial model aggregation (PMA).
The proposed PMA-FL improves accuracy by 2.72% and 11.6% on two typical heterogeneous datasets.
arXiv Detail & Related papers (2022-04-20T19:09:52Z)
- Learning Robust and Lightweight Model through Separable Structured Transformations [13.208781763887947]
We propose a separable structural transformation of the fully-connected layer to reduce the parameters of convolutional neural networks.
We successfully reduce the amount of network parameters by 90%, while the robust accuracy loss is less than 1.5%.
We evaluate the proposed approach on datasets such as ImageNet, SVHN, CIFAR-100 and Vision Transformer.
arXiv Detail & Related papers (2021-12-27T07:25:26Z)
- Learning to Generate Content-Aware Dynamic Detectors [62.74209921174237]
We introduce a new perspective on designing efficient detectors: automatically generating sample-adaptive model architectures.
We introduce a coarse-to-fine strategy tailored for object detection to guide the learning of dynamic routing.
Experiments on the MS-COCO dataset demonstrate that CADDet achieves 1.8 higher mAP with 10% fewer FLOPs compared with vanilla routing.
arXiv Detail & Related papers (2020-12-08T08:05:20Z)
- Fully Dynamic Inference with Deep Neural Networks [19.833242253397206]
Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped.
On the CIFAR-10 dataset, LC-Net results in up to 11.9$\times$ fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods.
On the ImageNet dataset, LC-Net achieves up to 1.4$\times$ fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
arXiv Detail & Related papers (2020-07-29T23:17:48Z)
- Highly Efficient Salient Object Detection with 100K Parameters [137.74898755102387]
We propose a flexible convolutional module, namely generalized OctConv (gOctConv), to efficiently utilize both in-stage and cross-stages multi-scale features.
We build an extremely lightweight model, namely CSNet, which achieves performance comparable to large models with about 0.2% of their parameters (100k) on popular object detection benchmarks.
arXiv Detail & Related papers (2020-03-12T07:00:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.