Related papers: IDALC: A Semi-Supervised Framework for Intent Detection and Active Learning based Correction

URL: http://arxiv.org/abs/2511.05921v1
Date: Sat, 08 Nov 2025 08:32:59 GMT
Title: IDALC: A Semi-Supervised Framework for Intent Detection and Active Learning based Correction
Authors: Ankan Mullick, Sukannya Purkayastha, Saransh Sharma, Pawan Goyal, Niloy Ganguly
Abstract summary: IDALC is a semi-supervised framework designed to detect user intents and rectify system-rejected utterances. We maintain the overall annotation cost at just 6-10% of the unlabelled data available to the system.
Score: 29.961460339925424
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Voice-controlled dialog systems have become immensely popular due to their ability to perform a wide range of actions in response to diverse user queries. These agents possess a predefined set of skills or intents to fulfill specific user tasks. But every system has its own limitations. There are instances where, even for known intents, a model exhibits low confidence and rejects utterances, which then require manual annotation. Additionally, as time progresses, there may be a need to retrain these agents with new intents from the system-rejected queries to carry out additional tasks. Labeling all these emerging intents and rejected utterances over time is impractical, thus calling for an efficient mechanism to reduce annotation costs. In this paper, we introduce IDALC (Intent Detection and Active Learning based Correction), a semi-supervised framework designed to detect user intents and rectify system-rejected utterances while minimizing the need for human annotation. Empirical findings on various benchmark datasets demonstrate that our system surpasses baseline methods, achieving a 5-10% higher accuracy and a 4-8% improvement in macro-F1. Remarkably, we maintain the overall annotation cost at just 6-10% of the unlabelled data available to the system. The overall framework of IDALC is shown in Fig. 1.

Related papers

Scalable and Robust LLM Unlearning by Correcting Responses with Retrieved Exclusions [49.55618517046225]
Language models trained on web-scale corpora risk memorizing and exposing sensitive information. We propose Corrective Unlearning with Retrieved Exclusions (CURE), a novel unlearning framework. CURE verifies model outputs for leakage and revises them into safe responses.
arXiv Detail & Related papers (2025-09-30T09:07:45Z)
Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control [52.405085773954596]
Retrieval-Augmented Generation has emerged as a powerful approach to mitigate large language model hallucinations. Existing RAG frameworks often apply retrieval indiscriminately, leading to inefficiencies such as over-retrieving. We introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off.
arXiv Detail & Related papers (2025-02-17T18:56:20Z)
On the Necessity of World Knowledge for Mitigating Missing Labels in Extreme Classification [17.309987565818577]
Extreme Classification (XC) aims to map a query to the most relevant documents from a very large document set. We observe that systematic missing labels lead to missing knowledge, which is critical for accurately modelling relevance between queries and documents. We propose SKIM, an algorithm that leverages a combination of small LM and abundant unstructured meta-data to effectively mitigate the missing label problem.
arXiv Detail & Related papers (2024-08-18T20:08:42Z)
Improved Out-of-Scope Intent Classification with Dual Encoding and Threshold-based Re-Classification [6.975902383951604]
Current methodologies face difficulties with the unpredictable distribution of outliers. We present the Dual Encoder for Threshold-Based Re-Classification (DETER) to address these challenges. Our model outperforms previous benchmarks, with F1 score gains of up to 13% for known intents and 5% for unknown intents.
arXiv Detail & Related papers (2024-05-30T11:46:42Z)
Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge. We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks. Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals.
arXiv Detail & Related papers (2024-03-24T12:41:58Z)
One-bit Supervision for Image Classification: Problem, Solution, and Beyond [114.95815360508395]
This paper presents one-bit supervision, a novel setting of learning with fewer labels, for image classification. We propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm. In multiple benchmarks, the learning efficiency of the proposed approach surpasses that using full-bit, semi-supervised supervision.
arXiv Detail & Related papers (2023-11-26T07:39:00Z)
Template-based Approach to Zero-shot Intent Recognition [7.330908962006392]
In this paper, we explore the generalized zero-shot setup for intent recognition. Following best practices for zero-shot text classification, we treat the task with a sentence pair modeling approach. We outperform previous state-of-the-art f1-measure by up to 16% for unseen intents.
arXiv Detail & Related papers (2022-06-22T08:44:59Z)
A Framework to Generate High-Quality Datapoints for Multiple Novel Intent Detection [24.14668837496296]
MNID is a framework to detect multiple novel intents with budgeted human annotation cost. It outperforms the baseline methods in terms of accuracy and F1-score.
arXiv Detail & Related papers (2022-05-04T11:32:15Z)
FLAVA: Find, Localize, Adjust and Verify to Annotate LiDAR-Based Point Clouds [93.3595555830426]
We propose FLAVA, a systematic approach to minimizing human interaction in the annotation process. Specifically, we divide the annotation pipeline into four parts: find, localize, adjust and verify. Our system also greatly reduces the amount of interaction by introducing a light-weight yet effective mechanism to propagate the results.
arXiv Detail & Related papers (2020-11-20T02:22:36Z)
Intra-Camera Supervised Person Re-Identification [87.88852321309433]
We propose a novel person re-identification paradigm based on an idea of independent per-camera identity annotation. This eliminates the most time-consuming and tedious inter-camera identity labelling process. We formulate a Multi-tAsk mulTi-labEl (MATE) deep learning method for Intra-Camera Supervised (ICS) person re-id.
arXiv Detail & Related papers (2020-02-12T15:26:33Z)