Related papers: LANID: LLM-assisted New Intent Discovery

LANID: LLM-assisted New Intent Discovery

URL: http://arxiv.org/abs/2503.23740v1
Date: Mon, 31 Mar 2025 05:34:32 GMT
Title: LANID: LLM-assisted New Intent Discovery
Authors: Lu Fan, Jiashu Pu, Rongsheng Zhang, Xiao-Ming Wu,
Abstract summary: New Intent Discovery (NID) is a crucial task that aims to identify novel intents while maintaining the capability to recognize existing ones.<n>Previous efforts to adapt TODS to new intents have struggled with inadequate semantic representation.<n>We propose LANID, a framework that enhances the semantic representation of lightweight NID encoders with the guidance of Large Language Models.
Score: 18.15557766598695
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Task-oriented Dialogue Systems (TODS) often face the challenge of encountering new intents. New Intent Discovery (NID) is a crucial task that aims to identify these novel intents while maintaining the capability to recognize existing ones. Previous efforts to adapt TODS to new intents have struggled with inadequate semantic representation or have depended on external knowledge, which is often not scalable or flexible. Recently, Large Language Models (LLMs) have demonstrated strong zero-shot capabilities; however, their scale can be impractical for real-world applications that involve extensive queries. To address the limitations of existing NID methods by leveraging LLMs, we propose LANID, a framework that enhances the semantic representation of lightweight NID encoders with the guidance of LLMs. Specifically, LANID employs the $K$-nearest neighbors and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithms to sample selective utterance pairs from the training set. It then queries an LLM to ascertain the relationships between these pairs. The data produced from this process is utilized to design a contrastive fine-tuning task, which is then used to train a small encoder with a contrastive triplet loss. Our experimental results demonstrate the efficacy of the proposed method across three distinct NID datasets, surpassing strong baselines in both unsupervised and semi-supervised settings. Our code is available at https://github.com/floatSDSDS/LANID.

Related papers

Enhancing LLM-based Recommendation through Semantic-Aligned Collaborative Knowledge [25.757451106327167]
SeLLa-Rec focuses on achieving alignment between the semantic spaces of Collabs. and LLMs. This alignment fosters effective knowledge fusion, mitigating the influence of discriminative noise. Experiments conducted on two public benchmark datasets demonstrate that SeLLa-Rec achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-04-14T11:15:30Z)
Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks. However, they still struggle with problems requiring multi-step decision-making and environmental feedback. We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
Leveraging Large Language Models for Wireless Symbol Detection via In-Context Learning [29.28683810366379]
We propose to leverage the in-context learning ability (a.k.a. prompting) of large language models (LLMs) to solve wireless tasks in the low data regime without any training or fine-tuning. Our results reveal that using LLMs via ICL methods generally outperforms traditional DNNs on the symbol demodulation task.
arXiv Detail & Related papers (2024-08-28T17:19:20Z)
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts. We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM. We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
Get my drift? Catching LLM Task Drift with Activation Deltas [55.75645403965326]
Task drift allows attackers to exfiltrate data or influence the LLM's output for other users.<n>We show that a simple linear classifier can detect drift with near-perfect ROC AUC on an out-of-distribution test set.<n>We observe that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions.
arXiv Detail & Related papers (2024-06-02T16:53:21Z)
UNDIAL: Self-Distillation with Adjusted Logits for Robust Unlearning in Large Language Models [12.45822383965784]
We introduce UnDIAL (Unlearning via Self-Distillation on Adjusted Logits), a novel and robust unlearning method. Our approach leverages self-distillation to adjust logits and selectively reduce the influence of targeted tokens.
arXiv Detail & Related papers (2024-02-15T16:21:14Z)
RA-Rec: An Efficient ID Representation Alignment Framework for LLM-based Recommendation [9.606111709136675]
We present RA-Rec, an efficient ID representation framework for LLM-based recommendation. RA-Rec substantially outperforms current state-of-the-art methods, achieving up to 3.0% absolute HitRate@100 improvements.
arXiv Detail & Related papers (2024-02-07T02:14:58Z)
O3D: Offline Data-driven Discovery and Distillation for Sequential Decision-Making with Large Language Models [16.91329676173649]
Offline Data-driven Discovery and Distillation (O3D) is proposed to improve large language models (LLMs) O3D automatically discovers reusable skills and distills generalizable knowledge across multiple tasks based on offline interaction data. Empirical results under two interactive decision-making benchmarks (ALFWorld and WebShop) verify that O3D can notably enhance the decision-making capabilities of LLMs.
arXiv Detail & Related papers (2023-10-22T20:28:33Z)
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety. Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs. We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)
Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach [31.6589518077397]
Large language models (LLMs) encode a vast amount of world knowledge acquired from massive text datasets. LLMs can assist an embodied agent in solving complex sequential decision making tasks by providing high-level instructions. We propose When2Ask, a reinforcement learning based approach that learns when it is necessary to query LLMs for high-level instructions.
arXiv Detail & Related papers (2023-06-06T11:49:09Z)
Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones. We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
arXiv Detail & Related papers (2022-05-02T16:09:17Z)
CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions. We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD. Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.