Pool of Experts: Realtime Querying Specialized Knowledge in Massive
Neural Networks
- URL: http://arxiv.org/abs/2107.01354v1
- Date: Sat, 3 Jul 2021 06:31:54 GMT
- Authors: Hakbin Kim and Dong-Wan Choi
- Abstract summary: This paper proposes a framework, called Pool of Experts (PoE), that instantly builds a lightweight and task-specific model without any training process.
For a realtime model querying service, PoE first extracts a pool of primitive components, called experts, from a well-trained and sufficiently generic network.
PoE can build a fairly accurate yet compact model in a realtime manner, whereas it takes a few minutes per query for the other training methods to achieve a similar level of accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In spite of the great success of deep learning technologies, training and
delivery of a practically serviceable model is still a highly time-consuming
process. Furthermore, a resulting model is usually too generic and heavyweight,
and hence essentially goes through another expensive model compression phase to
fit in a resource-limited device like embedded systems. Inspired by the fact
that a machine learning task specifically requested by mobile users is often
much simpler than what a massive generic model supports, this paper
proposes a framework, called Pool of Experts (PoE), that instantly builds a
lightweight and task-specific model without any training process. For a
realtime model querying service, PoE first extracts a pool of primitive
components, called experts, from a well-trained and sufficiently generic
network by exploiting a novel conditional knowledge distillation method, and
then performs our train-free knowledge consolidation to quickly combine
necessary experts into a lightweight network for a target task. Thanks to this
train-free property, in our thorough empirical study, PoE can build a fairly
accurate yet compact model in a realtime manner, whereas it takes a few minutes
per query for the other training methods to achieve a similar level of
accuracy.
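The query path described above (pre-extracted experts, train-free consolidation at query time) can be illustrated with a minimal sketch. All names here are illustrative stand-ins, not the authors' API, and the random linear "experts" stand in for the distilled sub-networks the paper actually extracts:

```python
# Hypothetical sketch of a PoE-style realtime query path: experts are
# pre-extracted per class group offline; a query only selects and stacks
# them -- no gradient steps happen at query time.
import numpy as np

rng = np.random.default_rng(0)

# Offline phase: a "pool" of experts, here one tiny linear head per class
# group. In the paper these come from conditional knowledge distillation
# of a generic teacher network; random weights stand in for them here.
FEAT_DIM = 8
pool = {
    "cats": rng.standard_normal((FEAT_DIM, 1)),
    "dogs": rng.standard_normal((FEAT_DIM, 1)),
    "cars": rng.standard_normal((FEAT_DIM, 1)),
}

def consolidate(task_groups):
    """Train-free consolidation: stack the requested experts' heads
    into one compact classifier for the target task."""
    return np.concatenate([pool[g] for g in task_groups], axis=1)

def query_model(task_groups):
    weights = consolidate(task_groups)   # (FEAT_DIM, n_task_classes)
    def model(x):                        # x: (batch, FEAT_DIM) features
        return x @ weights               # logits over the task's classes
    return model

# Realtime query: a 2-class model is assembled instantly, no training loop.
model = query_model(["cats", "dogs"])
logits = model(rng.standard_normal((4, FEAT_DIM)))
print(logits.shape)  # (4, 2)
```

Because consolidation is just selection and stacking, query latency is dominated by a lookup rather than by optimization, which is what makes the realtime claim plausible.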
Related papers
- Accurate Neural Network Pruning Requires Rethinking Sparse Optimization [87.90654868505518]
We show the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks.
We provide new approaches for mitigating this issue for both sparse pre-training of vision models and sparse fine-tuning of language models.
arXiv Detail & Related papers (2023-08-03T21:49:14Z)
- Revisiting Single-gated Mixtures of Experts [13.591354795556972]
We propose to revisit the simple single-gate MoE, which allows for more practical training.
Key to our work is a base model branch that acts both as an early exit and as an ensembling regularization scheme.
We show experimentally that the proposed model obtains efficiency-to-accuracy trade-offs comparable with other more complex MoE.
arXiv Detail & Related papers (2023-04-11T21:07:59Z)
- BOLT: An Automated Deep Learning Framework for Training and Deploying Large-Scale Search and Recommendation Models on Commodity CPU Hardware [28.05159031634185]
BOLT is a sparse deep learning library for training large-scale search and recommendation models on standard CPU hardware.
We evaluate BOLT on a number of information retrieval tasks including product recommendations, text classification, graph neural networks, and personalization.
arXiv Detail & Related papers (2023-03-30T22:03:43Z)
- Prismer: A Vision-Language Model with Multi-Task Experts [119.82149763682156]
Prismer is a data- and parameter-efficient vision-language model that leverages an ensemble of task-specific experts.
By leveraging experts from a wide range of domains, we show Prismer can efficiently pool this expert knowledge and adapt it to various vision-language reasoning tasks.
In experiments, we show that Prismer achieves fine-tuned and few-shot learning performance competitive with the current state of the art.
arXiv Detail & Related papers (2023-03-04T21:22:47Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- EBJR: Energy-Based Joint Reasoning for Adaptive Inference [10.447353952054492]
State-of-the-art deep learning models have achieved significant performance levels on various benchmarks.
Light-weight architectures, on the other hand, achieve moderate accuracies, but at a much more desirable latency.
This paper presents a new method for jointly using large accurate models with small fast ones.
arXiv Detail & Related papers (2021-10-20T02:33:31Z)
- Distilling EEG Representations via Capsules for Affective Computing [14.67085109524245]
We propose a novel knowledge distillation pipeline to distill EEG representations via capsule-based architectures.
Our framework consistently enables student networks with different compression ratios to effectively learn from the teacher.
Our method achieves state-of-the-art results on one of the two datasets.
arXiv Detail & Related papers (2021-04-30T22:04:35Z)
- Improving the Accuracy of Early Exits in Multi-Exit Architectures via Curriculum Learning [88.17413955380262]
Multi-exit architectures allow deep neural networks to terminate their execution early in order to adhere to tight deadlines at the cost of accuracy.
We introduce a novel method called Multi-Exit Curriculum Learning that utilizes curriculum learning.
Our method consistently improves the accuracy of early exits compared to the standard training approach.
arXiv Detail & Related papers (2021-04-21T11:12:35Z)
- Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn an effective salient object detection model from manual annotations on only a few training images.
We name this task few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
- A General Machine Learning Framework for Survival Analysis [0.8029049649310213]
Many machine learning methods for survival analysis only consider the standard setting with right-censored data and the proportional hazards assumption.
We present a very general machine learning framework for time-to-event analysis that uses a data augmentation strategy to reduce complex survival tasks to standard Poisson regression tasks.
arXiv Detail & Related papers (2020-06-27T20:57:18Z)
- Knowledge Distillation: A Survey [87.51063304509067]
Deep neural networks have been successful in both industry and academia, especially for computer vision tasks.
It is a challenge to deploy these cumbersome deep models on devices with limited resources.
Knowledge distillation effectively learns a small student model from a large teacher model.
arXiv Detail & Related papers (2020-06-09T21:47:17Z)
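Knowledge distillation, surveyed above, is the mechanism PoE's expert extraction builds on. A minimal sketch of the standard (not the paper's conditional) distillation loss, in the temperature-scaled form popularized by Hinton et al., looks like this; all names here are illustrative:

```python
# Generic knowledge-distillation loss: the student matches the teacher's
# temperature-softened output distribution via cross-entropy, scaled by
# T^2 so gradients keep a comparable magnitude across temperatures.
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -(T * T) * np.mean(
        np.sum(p_teacher * np.log(p_student + 1e-12), axis=-1)
    )

teacher = np.array([[2.0, 0.5, -1.0]])
loss_same = distillation_loss(teacher.copy(), teacher)      # student agrees
loss_diff = distillation_loss(np.array([[-1.0, 0.5, 2.0]]), teacher)
print(loss_same < loss_diff)  # True: matching the teacher lowers the loss
```

PoE's contribution is to make this conditional, distilling class-group-specific experts from one generic teacher so they can later be consolidated without further training.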
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences of their use.