LIDSNet: A Lightweight on-device Intent Detection model using Deep
Siamese Network
- URL: http://arxiv.org/abs/2110.15717v1
- Date: Wed, 6 Oct 2021 18:20:37 GMT
- Title: LIDSNet: A Lightweight on-device Intent Detection model using Deep
Siamese Network
- Authors: Vibhav Agarwal, Sudeep Deepak Shivnikar, Sourav Ghosh, Himanshu Arora,
Yashwant Saini
- Abstract summary: LIDSNet is a novel lightweight on-device intent detection model.
We show that our model is at least 41x lighter and 30x faster during inference than MobileBERT on a Samsung Galaxy S20 device.
- Score: 2.624902795082451
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Intent detection is a crucial task in any Natural Language Understanding
(NLU) system and forms the foundation of a task-oriented dialogue system. To
build high-quality real-world conversational solutions for edge devices, there
is a need to deploy the intent detection model on-device. This necessitates a
lightweight, fast, and accurate model that can perform efficiently in a
resource-constrained environment. To this end, we propose LIDSNet, a novel
lightweight on-device intent detection model, which accurately predicts the
message intent by utilizing a Deep Siamese Network for learning better sentence
representations. We use character-level features to enrich the sentence-level
representations and empirically demonstrate the advantage of transfer learning
by utilizing pre-trained embeddings. Furthermore, to investigate the efficacy
of the modules in our architecture, we conduct an ablation study and arrive at
our optimal model. Experimental results show that LIDSNet achieves
competitive, state-of-the-art accuracy of 98.00% and 95.97% on the SNIPS and
ATIS public datasets, respectively, with under 0.59M parameters. We further
benchmark LIDSNet against fine-tuned BERTs and show that our model is at least
41x lighter and 30x faster during inference than MobileBERT on a Samsung
Galaxy S20 device, justifying its efficiency on resource-constrained edge devices.
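The abstract describes the core recipe only at a high level: a weight-shared (Siamese) encoder maps sentences to fixed-size representations, enriched with character-level features, and similarity between representations drives intent prediction. As a rough illustration of that idea only — the hashed character-trigram encoder, dimensions, and cosine scoring below are hypothetical assumptions, not LIDSNet's actual architecture — a minimal pure-Python sketch might look like:

```python
import math
import random
import zlib

def char_ngrams(text, n=3):
    """Character n-grams stand in for the paper's character-level features."""
    padded = f"#{text}#"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

class SiameseIntentEncoder:
    """Toy shared encoder: mean of hashed character-trigram embeddings.

    Both the input sentence and each intent's example utterances pass
    through this same (weight-shared) encoder -- the defining trait of a
    Siamese network -- and cosine similarity scores candidate intents.
    """
    def __init__(self, dim=32, buckets=512, seed=0):
        rng = random.Random(seed)
        # Fixed random embedding table; a real model would learn these weights.
        self.emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(buckets)]
        self.buckets = buckets
        self.dim = dim

    def encode(self, sentence):
        vec = [0.0] * self.dim
        for g in char_ngrams(sentence.lower()):
            # crc32 gives a deterministic hash-bucket across runs.
            row = self.emb[zlib.crc32(g.encode()) % self.buckets]
            for i in range(self.dim):
                vec[i] += row[i]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]  # unit-normalized representation

def cosine(u, v):
    """Dot product of unit vectors equals cosine similarity."""
    return sum(a * b for a, b in zip(u, v))

def predict_intent(encoder, sentence, prototypes):
    """prototypes: dict mapping intent name -> list of example utterances."""
    q = encoder.encode(sentence)
    best_intent, best = None, -2.0
    for intent, examples in prototypes.items():
        score = max(cosine(q, encoder.encode(p)) for p in examples)
        if score > best:
            best_intent, best = intent, score
    return best_intent
```

The same `encode` function scores the query against every intent's examples, mirroring how a Siamese network reuses one encoder for both branches; the real model is trained so matching pairs score higher, whereas this sketch uses fixed random embeddings purely for illustration.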
Related papers
- SlimLM: An Efficient Small Language Model for On-Device Document Assistance [60.971107009492606]
We present SlimLM, a series of SLMs optimized for document assistance tasks on mobile devices.
SlimLM is pre-trained on SlimPajama-627B and fine-tuned on DocAssist.
We evaluate SlimLM against existing SLMs, showing comparable or superior performance.
arXiv Detail & Related papers (2024-11-15T04:44:34Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Efficient Federated Intrusion Detection in 5G ecosystem using optimized BERT-based model [0.7100520098029439]
5G offers advanced services, supporting applications such as intelligent transportation, connected healthcare, and smart cities within the Internet of Things (IoT).
These advancements introduce significant security challenges, with increasingly sophisticated cyber-attacks.
This paper proposes a robust intrusion detection system (IDS) using federated learning and large language models (LLMs).
arXiv Detail & Related papers (2024-09-28T15:56:28Z)
- MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z)
- Online Data Selection for Federated Learning with Limited Storage [53.46789303416799]
Federated Learning (FL) has been proposed to achieve distributed machine learning among networked devices.
The impact of on-device storage on the performance of FL remains unexplored.
In this work, we take the first step to consider the online data selection for FL with limited on-device storage.
arXiv Detail & Related papers (2022-09-01T03:27:33Z)
- ESAI: Efficient Split Artificial Intelligence via Early Exiting Using Neural Architecture Search [6.316693022958222]
Deep neural networks have been outperforming conventional machine learning algorithms in many computer vision-related tasks.
The majority of devices rely on the cloud computing methodology, in which powerful deep learning models analyze the data on the server.
In this paper, a new framework for deploying on IoT devices has been proposed which can take advantage of both the cloud and the on-device models.
arXiv Detail & Related papers (2021-06-21T04:47:53Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- A character representation enhanced on-device Intent Classification [2.2625832119364153]
We present a novel light-weight architecture for intent classification that can run efficiently on a device.
Our experiments show that our proposed model outperforms existing approaches and achieves state-of-the-art results on benchmark datasets.
Our model has a tiny memory footprint of 5 MB and a low inference time of 2 milliseconds, demonstrating its efficiency in a resource-constrained environment.
arXiv Detail & Related papers (2021-01-12T13:02:05Z)
- FRDet: Balanced and Lightweight Object Detector based on Fire-Residual Modules for Embedded Processor of Autonomous Driving [0.0]
We propose a lightweight one-stage object detector that is balanced to satisfy all the constraints of accuracy, model size, and real-time processing.
Our network aims to maximize the compression of the model while achieving or surpassing YOLOv3 level of accuracy.
arXiv Detail & Related papers (2020-11-16T16:15:43Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
- Self-Attention Networks for Intent Detection [0.9023847175654603]
We present a novel intent detection system based on a self-attention network and a Bi-LSTM.
Our approach shows improvement by using a transformer model and deep averaging network-based universal sentence encoder.
We evaluate the system on Snips, Smart Speaker, Smart Lights, and ATIS datasets by different evaluation metrics.
arXiv Detail & Related papers (2020-06-28T12:19:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.