Pool of Experts: Realtime Querying Specialized Knowledge in Massive
Neural Networks
- URL: http://arxiv.org/abs/2107.01354v1
- Date: Sat, 3 Jul 2021 06:31:54 GMT
- Authors: Hakbin Kim and Dong-Wan Choi
- Abstract summary: This paper proposes a framework, called Pool of Experts (PoE), that instantly builds a lightweight and task-specific model without any training process.
For a realtime model querying service, PoE first extracts a pool of primitive components, called experts, from a well-trained and sufficiently generic network.
PoE can build a fairly accurate yet compact model in a realtime manner, whereas it takes a few minutes per query for the other training methods to achieve a similar level of accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In spite of the great success of deep learning technologies, training and
delivery of a practically serviceable model is still a highly time-consuming
process. Furthermore, a resulting model is usually too generic and heavyweight,
and hence essentially goes through another expensive model compression phase to
fit in a resource-limited device like embedded systems. Inspired by the fact
that a machine learning task specifically requested by mobile users is often
much simpler than what a massive generic model supports, this paper
proposes a framework, called Pool of Experts (PoE), that instantly builds a
lightweight and task-specific model without any training process. For a
realtime model querying service, PoE first extracts a pool of primitive
components, called experts, from a well-trained and sufficiently generic
network by exploiting a novel conditional knowledge distillation method, and
then performs our train-free knowledge consolidation to quickly combine
necessary experts into a lightweight network for a target task. Thanks to this
train-free property, in our thorough empirical study, PoE can build a fairly
accurate yet compact model in a realtime manner, whereas it takes a few minutes
per query for the other training methods to achieve a similar level of
accuracy.
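The query path described above (pre-extracted experts, train-free consolidation at query time) can be illustrated with a minimal sketch. All names here are illustrative stand-ins, not the authors' API, and the random linear "experts" stand in for the distilled sub-networks the paper actually extracts:

```python
# Hypothetical sketch of a PoE-style realtime query path: experts are
# pre-extracted per class group offline; a query only selects and stacks
# them -- no gradient steps happen at query time.
import numpy as np

rng = np.random.default_rng(0)

# Offline phase: a "pool" of experts, here one tiny linear head per class
# group. In the paper these come from conditional knowledge distillation
# of a generic teacher network; random weights stand in for them here.
FEAT_DIM = 8
pool = {
    "cats": rng.standard_normal((FEAT_DIM, 1)),
    "dogs": rng.standard_normal((FEAT_DIM, 1)),
    "cars": rng.standard_normal((FEAT_DIM, 1)),
}

def consolidate(task_groups):
    """Train-free consolidation: stack the requested experts' heads
    into one compact classifier for the target task."""
    return np.concatenate([pool[g] for g in task_groups], axis=1)

def query_model(task_groups):
    weights = consolidate(task_groups)   # (FEAT_DIM, n_task_classes)
    def model(x):                        # x: (batch, FEAT_DIM) features
        return x @ weights               # logits over the task's classes
    return model

# Realtime query: a 2-class model is assembled instantly, no training loop.
model = query_model(["cats", "dogs"])
logits = model(rng.standard_normal((4, FEAT_DIM)))
print(logits.shape)  # (4, 2)
```

Because consolidation is just selection and stacking, query latency is dominated by a lookup rather than by optimization, which is what makes the realtime claim plausible.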
Related papers
- Accurate Neural Network Pruning Requires Rethinking Sparse Optimization [87.90654868505518]
We show the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks.
We provide new approaches for mitigating this issue for both sparse pre-training of vision models and sparse fine-tuning of language models.
arXiv Detail & Related papers (2023-08-03T21:49:14Z)
- Revisiting Single-gated Mixtures of Experts [13.591354795556972]
We propose to revisit the simple single-gate MoE, which allows for more practical training.
Key to our work is a base model branch that acts both as an early exit and as an ensembling regularization scheme.
We show experimentally that the proposed model obtains efficiency-to-accuracy trade-offs comparable with other more complex MoE.
arXiv Detail & Related papers (2023-04-11T21:07:59Z)
- BOLT: An Automated Deep Learning Framework for Training and Deploying Large-Scale Search and Recommendation Models on Commodity CPU Hardware [28.05159031634185]
BOLT is a sparse deep learning library for training large-scale search and recommendation models on standard CPU hardware.
We evaluate BOLT on a number of information retrieval tasks including product recommendations, text classification, graph neural networks, and personalization.
arXiv Detail & Related papers (2023-03-30T22:03:43Z)
- Prismer: A Vision-Language Model with Multi-Task Experts [119.82149763682156]
Prismer is a data- and parameter-efficient vision-language model that leverages an ensemble of task-specific experts.
By leveraging experts from a wide range of domains, we show Prismer can efficiently pool this expert knowledge and adapt it to various vision-language reasoning tasks.
In experiments, we show that Prismer achieves fine-tuned and few-shot learning performance competitive with the current state of the art.
arXiv Detail & Related papers (2023-03-04T21:22:47Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- EBJR: Energy-Based Joint Reasoning for Adaptive Inference [10.447353952054492]
State-of-the-art deep learning models have achieved significant performance levels on various benchmarks.
Light-weight architectures, on the other hand, achieve moderate accuracies, but at a much more desirable latency.
This paper presents a new method for jointly using large accurate models with small fast ones.
arXiv Detail & Related papers (2021-10-20T02:33:31Z)
- Distilling EEG Representations via Capsules for Affective Computing [14.67085109524245]
We propose a novel knowledge distillation pipeline to distill EEG representations via capsule-based architectures.
Our framework consistently enables student networks with different compression ratios to effectively learn from the teacher.
Our method achieves state-of-the-art results on one of the two datasets.
arXiv Detail & Related papers (2021-04-30T22:04:35Z)
- Improving the Accuracy of Early Exits in Multi-Exit Architectures via Curriculum Learning [88.17413955380262]
Multi-exit architectures allow deep neural networks to terminate their execution early in order to adhere to tight deadlines at the cost of accuracy.
We introduce a novel method called Multi-Exit Curriculum Learning that utilizes curriculum learning.
Our method consistently improves the accuracy of early exits compared to the standard training approach.
arXiv Detail & Related papers (2021-04-21T11:12:35Z)
- Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn an effective salient object detection model from manual annotations on only a few training images.
We name this task few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
- A General Machine Learning Framework for Survival Analysis [0.8029049649310213]
Many machine learning methods for survival analysis only consider the standard setting with right-censored data and the proportional hazards assumption.
We present a very general machine learning framework for time-to-event analysis that uses a data augmentation strategy to reduce complex survival tasks to standard Poisson regression tasks.
arXiv Detail & Related papers (2020-06-27T20:57:18Z)
- Knowledge Distillation: A Survey [87.51063304509067]
Deep neural networks have been successful in both industry and academia, especially for computer vision tasks.
It is a challenge to deploy these cumbersome deep models on devices with limited resources.
Knowledge distillation effectively learns a small student model from a large teacher model.
arXiv Detail & Related papers (2020-06-09T21:47:17Z)
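Knowledge distillation, surveyed above, is the mechanism PoE's expert extraction builds on. A minimal sketch of the standard (not the paper's conditional) distillation loss, in the temperature-scaled form popularized by Hinton et al., looks like this; all names here are illustrative:

```python
# Generic knowledge-distillation loss: the student matches the teacher's
# temperature-softened output distribution via cross-entropy, scaled by
# T^2 so gradients keep a comparable magnitude across temperatures.
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -(T * T) * np.mean(
        np.sum(p_teacher * np.log(p_student + 1e-12), axis=-1)
    )

teacher = np.array([[2.0, 0.5, -1.0]])
loss_same = distillation_loss(teacher.copy(), teacher)      # student agrees
loss_diff = distillation_loss(np.array([[-1.0, 0.5, 2.0]]), teacher)
print(loss_same < loss_diff)  # True: matching the teacher lowers the loss
```

PoE's contribution is to make this conditional, distilling class-group-specific experts from one generic teacher so they can later be consolidated without further training.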
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences of their use.