Related papers: A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents

A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents

URL: http://arxiv.org/abs/2410.22476v1
Date: Tue, 29 Oct 2024 19:10:12 GMT
Title: A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents
Authors: Ankan Mullick, Sombit Bose, Abhilash Nandy, Gajula Sai Chaitanya, Pawan Goyal,
Abstract summary: This study addresses three critical tasks: extracting multiple intent spans from queries, detecting multiple intents, and developing a multi-lingual intent dataset. We introduce a novel multi-label multi-class intent detection dataset (MLMCID-dataset) curated from existing benchmark datasets. We also propose a pointer network-based architecture (MLMCID) to extract intent spans and detect multiple intents with coarse and fine-grained labels in the form of sextuplets.
Score: 12.62162175115002
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: In task-oriented dialogue systems, intent detection is crucial for interpreting user queries and providing appropriate responses. Existing research primarily addresses simple queries with a single intent, lacking effective systems for handling complex queries with multiple intents and extracting different intent spans. Additionally, there is a notable absence of multilingual, multi-intent datasets. This study addresses three critical tasks: extracting multiple intent spans from queries, detecting multiple intents, and developing a multi-lingual multi-label intent dataset. We introduce a novel multi-label multi-class intent detection dataset (MLMCID-dataset) curated from existing benchmark datasets. We also propose a pointer network-based architecture (MLMCID) to extract intent spans and detect multiple intents with coarse and fine-grained labels in the form of sextuplets. Comprehensive analysis demonstrates the superiority of our pointer network-based system over baseline approaches in terms of accuracy and F1-score across various datasets.

Related papers

A Generative Model for Joint Multiple Intent Detection and Slot Filling [3.060720241524644]
In task-oriented dialogue systems, spoken language understanding (SLU) is a critical component, which consists of two sub-tasks, intent detection and slot filling.<n>Most existing methods focus on the single-intent SLU, where each utterance only has one intent.<n>In this paper, we propose a generative framework to simultaneously address multiple intent detection and slot filling.
arXiv Detail & Related papers (2026-02-09T06:52:34Z)
A Multimodal Depth-Aware Method For Embodied Reference Understanding [56.30142869506262]
Embodied Reference Understanding requires identifying a target object in a visual scene based on both language instructions and pointing cues.<n>We propose a novel ERU framework that jointly leverages data augmentation, depth-map modality, and a depth-aware decision module.
arXiv Detail & Related papers (2025-10-09T14:32:21Z)
InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition [19.74617806521803]
InstructSAM is a training-free framework for instruction-driven object recognition.<n>We present EarthInstruct, the first InstructCDS benchmark for earth observation.
arXiv Detail & Related papers (2025-05-21T17:59:56Z)
Intent Representation Learning with Large Language Model for Recommendation [11.118517297006894]
We propose a model-agnostic framework, Intent Representation Learning with Large Language Model (IRLLRec), to construct multimodal intents and enhance recommendations. Specifically, IRLLRec employs a dual-tower architecture to learn multimodal intent representations. To better match textual and interaction-based intents, we employ momentum distillation to perform teacher-student learning on fused intent representations.
arXiv Detail & Related papers (2025-02-05T16:08:05Z)
Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification [6.459396785817196]
Chain-of-Intent generates intent-driven conversations through self-play. MINT-CL is a framework for multi-turn intent classification using multi-task contrastive learning. We release MINT-E, a multilingual, intent-aware multi-turn e-commerce dialogue corpus.
arXiv Detail & Related papers (2024-11-21T15:59:29Z)
A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames [30.200413352223347]
We first propose a Multi-Intent dataset which is collected from a realistic in-Vehicle dialogue System, called MIVS. The target semantic frame is organized in a 3-layer hierarchical structure to tackle the alignment and assignment problems in multi-intent cases. We devise a BiRGAT model to encode the hierarchy of items, the backbone of which is a dual relational graph attention network.
arXiv Detail & Related papers (2024-02-28T11:39:26Z)
IntenDD: A Unified Contrastive Learning Approach for Intent Detection and Discovery [12.905097743551774]
We propose IntenDD, a unified approach leveraging a shared utterance encoding backbone. IntenDD uses an entirely unsupervised contrastive learning strategy for representation learning. We find that our approach consistently outperforms competitive baselines across all three tasks.
arXiv Detail & Related papers (2023-10-25T16:50:24Z)
Multi-label affordance mapping from egocentric vision [3.683202928838613]
We present a new approach to affordance perception which enables accurate multi-label segmentation. Our approach can be used to automatically extract grounded affordances from first person videos. We show how our metric representation can be exploited for build a map of interaction hotspots.
arXiv Detail & Related papers (2023-09-05T10:56:23Z)
MIntRec: A New Dataset for Multimodal Intent Recognition [18.45381778273715]
Multimodal intent recognition is a significant task for understanding human language in real-world multimodal scenes. This paper introduces a novel dataset for multimodal intent recognition (MIntRec) to address this issue. It formulates coarse-grained and fine-grained intent based on the data collected from the TV series Superstore.
arXiv Detail & Related papers (2022-09-09T15:37:39Z)
Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding [137.3719377780593]
A new design (named Detection Hub) is dataset-aware and category-aligned. It mitigates the dataset inconsistency and provides coherent guidance for the detector to learn across multiple datasets. The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embedding.
arXiv Detail & Related papers (2022-06-07T17:59:44Z)
Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting [77.69172089359606]
We study multi-modal few-shot object detection (FSOD) in this paper, using both few-shot visual examples and class semantic information for detection. Our approach is motivated by the high-level conceptual similarity of (metric-based) meta-learning and prompt-based learning. We comprehensively evaluate the proposed multi-modal FSOD models on multiple few-shot object detection benchmarks, achieving promising results.
arXiv Detail & Related papers (2022-04-16T16:45:06Z)
Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets. We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy. Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)
Few-shot Learning for Multi-label Intent Detection [59.66787898744991]
State-of-the-art work estimates label-instance relevance scores and uses a threshold to select multiple associated intent labels. Experiments on two datasets show that the proposed model significantly outperforms strong baselines in both one-shot and five-shot settings.
arXiv Detail & Related papers (2020-10-11T14:42:18Z)
AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling [69.59096090788125]
In this paper, we propose an Adaptive Graph-Interactive Framework (AGIF) for joint multiple intent detection and slot filling. We introduce an intent-slot graph interaction layer to model the strong correlation between the slot and intents. Such an interaction layer is applied to each token adaptively, which has the advantage to automatically extract the relevant intents information.
arXiv Detail & Related papers (2020-04-21T15:07:34Z)
FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking [92.48078680697311]
Multi-object tracking (MOT) is an important problem in computer vision. We present a simple yet effective approach termed as FairMOT based on the anchor-free object detection architecture CenterNet. The approach achieves high accuracy for both detection and tracking.
arXiv Detail & Related papers (2020-04-04T08:18:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.