Large Language Models for Expansion of Spoken Language Understanding Systems to New Languages
- URL: http://arxiv.org/abs/2404.02588v1
- Date: Wed, 3 Apr 2024 09:13:26 GMT
- Title: Large Language Models for Expansion of Spoken Language Understanding Systems to New Languages
- Authors: Jakub Hoscilowicz, Pawel Pawlowski, Marcin Skorupa, Marcin Sowański, Artur Janicki
- Abstract summary: Spoken Language Understanding (SLU) models are a core component of voice assistants (VA), such as Alexa, Bixby, and Google Assistant.
In this paper, we introduce a pipeline designed to extend SLU systems to new languages, utilizing Large Language Models (LLMs).
Our approach improved on the MultiATIS++ benchmark, a primary multi-language SLU dataset, in the cloud scenario using an mBERT model.
- Score: 0.20971479389679337
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spoken Language Understanding (SLU) models are a core component of voice assistants (VA), such as Alexa, Bixby, and Google Assistant. In this paper, we introduce a pipeline designed to extend SLU systems to new languages, utilizing Large Language Models (LLMs) that we fine-tune for machine translation of slot-annotated SLU training data. Our approach improved on the MultiATIS++ benchmark, a primary multi-language SLU dataset, in the cloud scenario using an mBERT model. Specifically, we saw an improvement in the Overall Accuracy metric, from 53% to 62.18%, compared to the existing state-of-the-art method, the Fine and Coarse-grained Multi-Task Learning Framework (FC-MTLF). In the on-device scenario (a tiny, non-pretrained SLU model), our method improved the Overall Accuracy from 5.31% to 22.06% over the baseline Global-Local Contrastive Learning Framework (GL-CLeF) method. Unlike both FC-MTLF and GL-CLeF, our LLM-based machine translation does not require changes in the production architecture of the SLU system. Additionally, our pipeline is slot-type independent: it does not require any slot definitions or examples.
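To make the pipeline concrete, below is a minimal sketch (not the authors' released code) of how a slot-annotated utterance could be round-tripped through an LLM translator while keeping its slot labels: slot spans are wrapped in inline markers before translation and unwrapped back into BIO labels afterwards. The bracket tagging scheme, the example utterance, and the `llm_translate` stand-in are illustrative assumptions.

```python
# Minimal illustrative sketch -- not the authors' released code.
# Slot spans are wrapped in inline markers (an assumed tagging scheme)
# so that a translation LLM can carry them into the target language,
# then unwrapped back into BIO labels after translation.
import re
from typing import Callable, List, Tuple

def wrap_slots(tokens: List[str], labels: List[str]) -> str:
    """Mark each slot span inline, e.g. '[fromloc.city_name boston]'."""
    out, i = [], 0
    while i < len(tokens):
        if labels[i].startswith("B-"):
            slot = labels[i][2:]
            span = [tokens[i]]
            i += 1
            while i < len(tokens) and labels[i] == f"I-{slot}":
                span.append(tokens[i])
                i += 1
            out.append(f"[{slot} {' '.join(span)}]")
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)

def unwrap_slots(text: str) -> Tuple[List[str], List[str]]:
    """Recover tokens and BIO slot labels from a marked-up translation."""
    tokens, labels = [], []
    for slot, span, plain in re.findall(r"\[([^\s\]]+) ([^\]]+)\]|(\S+)", text):
        if plain:
            tokens.append(plain)
            labels.append("O")
        else:
            words = span.split()
            tokens.extend(words)
            labels.extend([f"B-{slot}"] + [f"I-{slot}"] * (len(words) - 1))
    return tokens, labels

def translate_example(
    tokens: List[str],
    labels: List[str],
    llm_translate: Callable[[str], str],  # stand-in for the fine-tuned LLM
) -> Tuple[List[str], List[str]]:
    """Translate one slot-annotated utterance, preserving slot labels."""
    return unwrap_slots(llm_translate(wrap_slots(tokens, labels)))

# Example MultiATIS++-style input (English side):
#   tokens = ["flights", "from", "boston", "to", "denver"]
#   labels = ["O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]
```

Under this framing, the translator only has to learn to keep the inline markers intact, which is consistent with the abstract's claim of slot-type independence: no slot definitions or examples are needed beyond the markers themselves.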
Related papers
- Fine-tuning multilingual language models in Twitter/X sentiment analysis: a study on Eastern-European V4 languages [0.0]
We focus on ABSA subtasks based on Twitter/X data in underrepresented languages.
We fine-tune several LLMs for classification of sentiment towards Russia and Ukraine.
We document several interesting phenomena demonstrating, among others, that some models are much better fine-tunable on multilingual Twitter tasks than others.
arXiv Detail & Related papers (2024-08-04T14:35:30Z)
- Do Large Language Models Understand Multi-Intent Spoken Language? [5.494472119991781]
This research signifies a considerable breakthrough in leveraging Large Language Models (LLMs) for multi-intent spoken language understanding (SLU).
Our approach re-imagines the use of entity slots in multi-intent SLU applications.
We introduce the concept of Sub-Intent Instruction (SII) to amplify the analysis and interpretation of complex, multi-intent communications.
arXiv Detail & Related papers (2024-03-07T13:30:52Z)
- Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning [58.92843729869586]
Vision-language pre-trained models (VL-PTMs) have advanced multimodal research in recent years, but their mastery in a few languages like English restricts their applicability in broader communities.
We propose to extend VL-PTMs' language capacity by continual language learning (CLL), where a model needs to update its linguistic knowledge incrementally without suffering from catastrophic forgetting (CF).
We construct a CLL benchmark covering 36 languages based on MSCOCO and XM3600 datasets and then evaluate multilingual image-text retrieval performance.
arXiv Detail & Related papers (2024-01-30T17:14:05Z)
- GlotLID: Language Identification for Low-Resource Languages [51.38634652914054]
GlotLID-M is an LID model that satisfies the desiderata of wide coverage, reliability and efficiency.
It identifies 1665 languages, a large increase in coverage compared to prior work.
arXiv Detail & Related papers (2023-10-24T23:45:57Z)
- OpenSLU: A Unified, Modularized, and Extensible Toolkit for Spoken Language Understanding [57.48730496422474]
Spoken Language Understanding (SLU) is one of the core components of a task-oriented dialogue system.
OpenSLU is an open-source toolkit that provides a unified, modularized, and extensible framework for spoken language understanding.
arXiv Detail & Related papers (2023-05-17T14:12:29Z)
- Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models [69.35569554213679]
In Spoken Language Understanding (SLU), the task is to extract important information from audio commands.
This paper presents a simple method for embedding intents and entities into Finite State Transducers.
arXiv Detail & Related papers (2022-06-29T12:49:53Z)
- Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding [90.87454350016121]
We develop novel code-switching schemes to generate hard negative examples for contrastive learning at all levels.
We develop a label-aware joint model to leverage label semantics for cross-lingual knowledge transfer.
arXiv Detail & Related papers (2022-05-07T13:44:28Z)
- ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet [95.39817519115394]
ESPnet-SLU is a project inside the end-to-end speech processing toolkit ESPnet.
It is designed for the quick development of spoken language understanding systems in a single framework.
arXiv Detail & Related papers (2021-11-29T17:05:49Z)
- From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding [24.149299722716155]
We introduce xSID, a new benchmark for cross-lingual Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect.
We propose a joint learning approach, with English SLU training data and non-English auxiliary tasks from raw text, syntax and translation for transfer.
Our results show that jointly learning the main tasks with masked language modeling is effective for slots, while machine translation transfer works best for intent classification.
arXiv Detail & Related papers (2021-05-15T23:51:11Z)