A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents
- URL: http://arxiv.org/abs/2410.22476v1
- Date: Tue, 29 Oct 2024 19:10:12 GMT
- Title: A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents
- Authors: Ankan Mullick, Sombit Bose, Abhilash Nandy, Gajula Sai Chaitanya, Pawan Goyal
- Abstract summary: This study addresses three critical tasks: extracting multiple intent spans from queries, detecting multiple intents, and developing a multi-lingual intent dataset.
We introduce a novel multi-label multi-class intent detection dataset (MLMCID-dataset) curated from existing benchmark datasets.
We also propose a pointer network-based architecture (MLMCID) to extract intent spans and detect multiple intents with coarse and fine-grained labels in the form of sextuplets.
- Score: 12.62162175115002
- Abstract: In task-oriented dialogue systems, intent detection is crucial for interpreting user queries and providing appropriate responses. Existing research primarily addresses simple queries with a single intent, lacking effective systems for handling complex queries with multiple intents and extracting different intent spans. Additionally, there is a notable absence of multilingual, multi-intent datasets. This study addresses three critical tasks: extracting multiple intent spans from queries, detecting multiple intents, and developing a multi-lingual multi-label intent dataset. We introduce a novel multi-label multi-class intent detection dataset (MLMCID-dataset) curated from existing benchmark datasets. We also propose a pointer network-based architecture (MLMCID) to extract intent spans and detect multiple intents with coarse and fine-grained labels in the form of sextuplets. Comprehensive analysis demonstrates the superiority of our pointer network-based system over baseline approaches in terms of accuracy and F1-score across various datasets.
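The abstract does not describe how the pointer network decodes spans, so the following is only a minimal sketch of the general pointer-style idea: given per-token start and end scores, choose the span with the highest combined score. The function name `best_span`, the `max_len` cap, and the toy scores are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple


def best_span(start_scores: List[float],
              end_scores: List[float],
              max_len: int = 8) -> Tuple[int, int]:
    """Return (start, end) token indices maximizing
    start_scores[start] + end_scores[end], with start <= end < start + max_len.

    Hypothetical helper: a generic pointer-style span selector,
    not the MLMCID architecture itself.
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")

    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start, within max_len tokens.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Toy query with made-up scores (assumed values, for illustration only).
tokens = ["book", "a", "flight", "and", "play", "some", "jazz"]
start = [2.0, 0.1, 0.3, -1.0, 1.5, 0.0, 0.2]
end = [0.1, 0.0, 2.5, -1.0, 0.2, 0.1, 1.8]

s, e = best_span(start, end, max_len=3)
span = tokens[s:e + 1]
print(span)
```

In a multi-intent setting, a selector like this would presumably be applied once per intent, with each extracted span then assigned coarse and fine-grained labels; how the paper actually does this is not stated in the abstract.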
Related papers
- A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames [30.200413352223347]
We first propose a Multi-Intent dataset which is collected from a realistic in-Vehicle dialogue System, called MIVS.
The target semantic frame is organized in a 3-layer hierarchical structure to tackle the alignment and assignment problems in multi-intent cases.
We devise a BiRGAT model to encode the hierarchy of items, the backbone of which is a dual relational graph attention network.
arXiv Detail & Related papers (2024-02-28T11:39:26Z)
- IntenDD: A Unified Contrastive Learning Approach for Intent Detection and Discovery [12.905097743551774]
We propose IntenDD, a unified approach leveraging a shared utterance encoding backbone.
IntenDD uses an entirely unsupervised contrastive learning strategy for representation learning.
We find that our approach consistently outperforms competitive baselines across all three tasks.
arXiv Detail & Related papers (2023-10-25T16:50:24Z)
- Multi-label affordance mapping from egocentric vision [3.683202928838613]
We present a new approach to affordance perception which enables accurate multi-label segmentation.
Our approach can be used to automatically extract grounded affordances from first person videos.
We show how our metric representation can be exploited to build a map of interaction hotspots.
arXiv Detail & Related papers (2023-09-05T10:56:23Z)
- MIntRec: A New Dataset for Multimodal Intent Recognition [18.45381778273715]
Multimodal intent recognition is a significant task for understanding human language in real-world multimodal scenes.
This paper introduces a novel dataset for multimodal intent recognition (MIntRec) to address this issue.
It formulates coarse-grained and fine-grained intent based on the data collected from the TV series Superstore.
arXiv Detail & Related papers (2022-09-09T15:37:39Z)
- Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding [137.3719377780593]
A new design (named Detection Hub) is dataset-aware and category-aligned.
It mitigates the dataset inconsistency and provides coherent guidance for the detector to learn across multiple datasets.
The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embeddings.
arXiv Detail & Related papers (2022-06-07T17:59:44Z)
- Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting [77.69172089359606]
We study multi-modal few-shot object detection (FSOD) in this paper, using both few-shot visual examples and class semantic information for detection.
Our approach is motivated by the high-level conceptual similarity of (metric-based) meta-learning and prompt-based learning.
We comprehensively evaluate the proposed multi-modal FSOD models on multiple few-shot object detection benchmarks, achieving promising results.
arXiv Detail & Related papers (2022-04-16T16:45:06Z)
- Exploring the Limits of Natural Language Inference Based Setup for Few-Shot Intent Detection [13.971616443394474]
Generalized few-shot intent detection is a more realistic but challenging setup.
We employ a simple and effective method based on Natural Language Inference.
Our method achieves state-of-the-art results on 1-shot and 5-shot intent detection tasks.
arXiv Detail & Related papers (2021-12-14T14:47:23Z)
- Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets.
We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy.
Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)
- Few-shot Learning for Multi-label Intent Detection [59.66787898744991]
State-of-the-art work estimates label-instance relevance scores and uses a threshold to select multiple associated intent labels.
Experiments on two datasets show that the proposed model significantly outperforms strong baselines in both one-shot and five-shot settings.
arXiv Detail & Related papers (2020-10-11T14:42:18Z)
- AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling [69.59096090788125]
In this paper, we propose an Adaptive Graph-Interactive Framework (AGIF) for joint multiple intent detection and slot filling.
We introduce an intent-slot graph interaction layer to model the strong correlation between the slot and intents.
Such an interaction layer is applied to each token adaptively, which has the advantage of automatically extracting the relevant intent information.
arXiv Detail & Related papers (2020-04-21T15:07:34Z)
- FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking [92.48078680697311]
Multi-object tracking (MOT) is an important problem in computer vision.
We present a simple yet effective approach termed FairMOT, based on the anchor-free object detection architecture CenterNet.
The approach achieves high accuracy for both detection and tracking.
arXiv Detail & Related papers (2020-04-04T08:18:00Z)