LIDSNet: A Lightweight on-device Intent Detection model using Deep
Siamese Network
- URL: http://arxiv.org/abs/2110.15717v1
- Date: Wed, 6 Oct 2021 18:20:37 GMT
- Title: LIDSNet: A Lightweight on-device Intent Detection model using Deep
Siamese Network
- Authors: Vibhav Agarwal, Sudeep Deepak Shivnikar, Sourav Ghosh, Himanshu Arora,
Yashwant Saini
- Abstract summary: LIDSNet is a novel lightweight on-device intent detection model.
We show that our model is at least 41x lighter and 30x faster during inference than MobileBERT on a Samsung Galaxy S20 device.
- Score: 2.624902795082451
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Intent detection is a crucial task in any Natural Language Understanding
(NLU) system and forms the foundation of a task-oriented dialogue system. To
build high-quality real-world conversational solutions for edge devices, there
is a need to deploy the intent detection model on-device. This necessitates a
lightweight, fast, and accurate model that can perform efficiently in a
resource-constrained environment. To this end, we propose LIDSNet, a novel
lightweight on-device intent detection model, which accurately predicts the
message intent by utilizing a Deep Siamese Network for learning better sentence
representations. We use character-level features to enrich the sentence-level
representations and empirically demonstrate the advantage of transfer learning
by utilizing pre-trained embeddings. Furthermore, to investigate the efficacy
of the modules in our architecture, we conduct an ablation study and arrive at
our optimal model. Experimental results show that LIDSNet achieves
competitive, state-of-the-art accuracy of 98.00% and 95.97% on the SNIPS and
ATIS public datasets, respectively, with under 0.59M parameters. We further
benchmark LIDSNet against fine-tuned BERTs and show that our model is at least
41x lighter and 30x faster during inference than MobileBERT on a Samsung
Galaxy S20 device, justifying its efficiency on resource-constrained edge devices.
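The abstract describes the core recipe only at a high level: a weight-shared (Siamese) encoder maps sentences to fixed-size representations, enriched with character-level features, and similarity between representations drives intent prediction. As a rough illustration of that idea only — the hashed character-trigram encoder, dimensions, and cosine scoring below are hypothetical assumptions, not LIDSNet's actual architecture — a minimal pure-Python sketch might look like:

```python
import math
import random
import zlib

def char_ngrams(text, n=3):
    """Character n-grams stand in for the paper's character-level features."""
    padded = f"#{text}#"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

class SiameseIntentEncoder:
    """Toy shared encoder: mean of hashed character-trigram embeddings.

    Both the input sentence and each intent's example utterances pass
    through this same (weight-shared) encoder -- the defining trait of a
    Siamese network -- and cosine similarity scores candidate intents.
    """
    def __init__(self, dim=32, buckets=512, seed=0):
        rng = random.Random(seed)
        # Fixed random embedding table; a real model would learn these weights.
        self.emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(buckets)]
        self.buckets = buckets
        self.dim = dim

    def encode(self, sentence):
        vec = [0.0] * self.dim
        for g in char_ngrams(sentence.lower()):
            # crc32 gives a deterministic hash-bucket across runs.
            row = self.emb[zlib.crc32(g.encode()) % self.buckets]
            for i in range(self.dim):
                vec[i] += row[i]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]  # unit-normalized representation

def cosine(u, v):
    """Dot product of unit vectors equals cosine similarity."""
    return sum(a * b for a, b in zip(u, v))

def predict_intent(encoder, sentence, prototypes):
    """prototypes: dict mapping intent name -> list of example utterances."""
    q = encoder.encode(sentence)
    best_intent, best = None, -2.0
    for intent, examples in prototypes.items():
        score = max(cosine(q, encoder.encode(p)) for p in examples)
        if score > best:
            best_intent, best = intent, score
    return best_intent
```

The same `encode` function scores the query against every intent's examples, mirroring how a Siamese network reuses one encoder for both branches; the real model is trained so matching pairs score higher, whereas this sketch uses fixed random embeddings purely for illustration.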
Related papers
- SlimLM: An Efficient Small Language Model for On-Device Document Assistance [60.971107009492606]
We present SlimLM, a series of SLMs optimized for document assistance tasks on mobile devices.
SlimLM is pre-trained on SlimPajama-627B and fine-tuned on DocAssist.
We evaluate SlimLM against existing SLMs, showing comparable or superior performance.
arXiv Detail & Related papers (2024-11-15T04:44:34Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Efficient Federated Intrusion Detection in 5G ecosystem using optimized BERT-based model [0.7100520098029439]
5G offers advanced services, supporting applications such as intelligent transportation, connected healthcare, and smart cities within the Internet of Things (IoT).
These advancements introduce significant security challenges, with increasingly sophisticated cyber-attacks.
This paper proposes a robust intrusion detection system (IDS) using federated learning and large language models (LLMs).
arXiv Detail & Related papers (2024-09-28T15:56:28Z)
- MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z)
- Online Data Selection for Federated Learning with Limited Storage [53.46789303416799]
Federated Learning (FL) has been proposed to achieve distributed machine learning among networked devices.
The impact of on-device storage on the performance of FL remains unexplored.
In this work, we take the first step to consider the online data selection for FL with limited on-device storage.
arXiv Detail & Related papers (2022-09-01T03:27:33Z)
- ESAI: Efficient Split Artificial Intelligence via Early Exiting Using Neural Architecture Search [6.316693022958222]
Deep neural networks have been outperforming conventional machine learning algorithms in many computer vision-related tasks.
The majority of devices rely on the cloud computing methodology, in which powerful deep learning models analyze the data on the server.
In this paper, a new framework for deploying on IoT devices has been proposed which can take advantage of both the cloud and the on-device models.
arXiv Detail & Related papers (2021-06-21T04:47:53Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- A character representation enhanced on-device Intent Classification [2.2625832119364153]
We present a novel light-weight architecture for intent classification that can run efficiently on a device.
Our experiments show that our proposed model outperforms existing approaches and achieves state-of-the-art results on benchmark datasets.
Our model has a tiny memory footprint of 5 MB and a low inference time of 2 milliseconds, demonstrating its efficiency in a resource-constrained environment.
arXiv Detail & Related papers (2021-01-12T13:02:05Z)
- FRDet: Balanced and Lightweight Object Detector based on Fire-Residual Modules for Embedded Processor of Autonomous Driving [0.0]
We propose a lightweight one-stage object detector that is balanced to satisfy all the constraints of accuracy, model size, and real-time processing.
Our network aims to maximize the compression of the model while achieving or surpassing YOLOv3 level of accuracy.
arXiv Detail & Related papers (2020-11-16T16:15:43Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
- Self-Attention Networks for Intent Detection [0.9023847175654603]
We present a novel intent detection system based on a self-attention network and a Bi-LSTM.
Our approach shows improvement by using a transformer model and deep averaging network-based universal sentence encoder.
We evaluate the system on Snips, Smart Speaker, Smart Lights, and ATIS datasets by different evaluation metrics.
arXiv Detail & Related papers (2020-06-28T12:19:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.