LARA: Linguistic-Adaptive Retrieval-Augmented LLMs for Multi-Turn Intent Classification
- URL: http://arxiv.org/abs/2403.16504v1
- Date: Mon, 25 Mar 2024 07:38:40 GMT
- Title: LARA: Linguistic-Adaptive Retrieval-Augmented LLMs for Multi-Turn Intent Classification
- Authors: Liu Junhua, Tan Yong Keat, Fu Bin,
- Abstract summary: We introduce LARA (Linguistic-Adaptive Retrieval-Augmented Language Models), designed to enhance accuracy in multi-turn classification tasks across six languages.
LARA combines a fine-tuned smaller model with a retrieval-augmented mechanism, integrated within the architecture of large language models.
Comprehensive experiments demonstrate that LARA achieves state-of-the-art performance on multi-turn intent classification tasks, enhancing the average accuracy by 3.67% compared to existing methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Following the significant achievements of large language models (LLMs), researchers have employed in-context learning for text classification tasks. However, these studies focused on monolingual, single-turn classification tasks. In this paper, we introduce LARA (Linguistic-Adaptive Retrieval-Augmented Language Models), designed to enhance accuracy in multi-turn classification tasks across six languages, accommodating numerous intents in chatbot interactions. Multi-turn intent classification is notably challenging due to the complexity and evolving nature of conversational contexts. LARA tackles these issues by combining a fine-tuned smaller model with a retrieval-augmented mechanism, integrated within the architecture of LLMs. This integration allows LARA to dynamically utilize past dialogues and relevant intents, thereby improving the understanding of the context. Furthermore, our adaptive retrieval techniques bolster the cross-lingual capabilities of LLMs without extensive retraining and fine-tune. Comprehensive experiments demonstrate that LARA achieves state-of-the-art performance on multi-turn intent classification tasks, enhancing the average accuracy by 3.67% compared to existing methods.
Related papers
- 1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators? [46.43162333819418]
Large Language Models (LLMs) have garnered significant attention due to their remarkable ability to process information across various languages.
Despite their capabilities, they exhibit inconsistencies in handling identical queries in different languages, presenting challenges for further advancement.
This paper introduces a method to enhance the multilingual performance of LLMs by aggregating knowledge from diverse languages.
arXiv Detail & Related papers (2024-06-20T20:32:53Z) - MINERS: Multilingual Language Models as Semantic Retrievers [23.686762008696547]
This paper introduces the MINERS, a benchmark designed to evaluate the ability of multilingual language models in semantic retrieval tasks.
We create a comprehensive framework to assess the robustness of LMs in retrieving samples across over 200 diverse languages.
Our results demonstrate that by solely retrieving semantically similar embeddings yields performance competitive with state-of-the-art approaches.
arXiv Detail & Related papers (2024-06-11T16:26:18Z) - Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-ization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language
Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs)
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z) - From Classification to Generation: Insights into Crosslingual Retrieval
Augmented ICL [8.065775937617417]
We introduce a novel approach that leverages cross-lingual retrieval-augmented in-context learning (CREA-ICL)
By extracting semantically similar prompts from high-resource languages, we aim to improve the zero-shot performance of multilingual pre-trained language models (MPLMs)
Though our approach yields steady improvements in classification tasks, it faces challenges in generation tasks.
arXiv Detail & Related papers (2023-11-11T15:40:21Z) - Efficient Spoken Language Recognition via Multilabel Classification [53.662747523872305]
We show that our models obtain competitive results while being orders of magnitude smaller and faster than current state-of-the-art methods.
Our multilabel strategy is more robust to unseen non-target languages compared to multiclass classification.
arXiv Detail & Related papers (2023-06-02T23:04:19Z) - Efficiently Aligned Cross-Lingual Transfer Learning for Conversational
Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z) - A Survey of Large Language Models [81.06947636926638]
Language modeling has been widely studied for language understanding and generation in the past two decades.
Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora.
To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size.
arXiv Detail & Related papers (2023-03-31T17:28:46Z) - A Multi-level Supervised Contrastive Learning Framework for Low-Resource
Natural Language Inference [54.678516076366506]
Natural Language Inference (NLI) is a growingly essential task in natural language understanding.
Here we propose a multi-level supervised contrastive learning framework named MultiSCL for low-resource natural language inference.
arXiv Detail & Related papers (2022-05-31T05:54:18Z) - From Masked Language Modeling to Translation: Non-English Auxiliary
Tasks Improve Zero-shot Spoken Language Understanding [24.149299722716155]
We introduce xSID, a new benchmark for cross-lingual Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect.
We propose a joint learning approach, with English SLU training data and non-English auxiliary tasks from raw text, syntax and translation for transfer.
Our results show that jointly learning the main tasks with masked language modeling is effective for slots, while machine translation transfer works best for intent classification.
arXiv Detail & Related papers (2021-05-15T23:51:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.