A BiRGAT Model for Multi-intent Spoken Language Understanding with
Hierarchical Semantic Frames
- URL: http://arxiv.org/abs/2402.18258v1
- Date: Wed, 28 Feb 2024 11:39:26 GMT
- Title: A BiRGAT Model for Multi-intent Spoken Language Understanding with
Hierarchical Semantic Frames
- Authors: Hongshen Xu, Ruisheng Cao, Su Zhu, Sheng Jiang, Hanchong Zhang, Lu
Chen and Kai Yu
- Abstract summary: We first propose MIVS, a Multi-Intent dataset collected from a realistic in-Vehicle dialogue System.
The target semantic frame is organized in a 3-layer hierarchical structure to tackle the alignment and assignment problems in multi-intent cases.
We devise a BiRGAT model to encode the hierarchy of ontology items, the backbone of which is a dual relational graph attention network.
- Score: 30.200413352223347
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work on spoken language understanding (SLU) mainly focuses on
single-intent settings, where each input utterance merely contains one user
intent. This configuration significantly limits the surface form of user
utterances and the capacity of output semantics. In this work, we first propose
MIVS, a Multi-Intent dataset collected from a realistic in-Vehicle dialogue
System. The target semantic frame is organized in a 3-layer
hierarchical structure to tackle the alignment and assignment problems in
multi-intent cases. Accordingly, we devise a BiRGAT model to encode the
hierarchy of ontology items, the backbone of which is a dual relational graph
attention network. Coupled with the 3-way pointer-generator decoder, our method
outperforms traditional sequence labeling and classification-based schemes by a
large margin.
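To make the 3-layer output structure concrete, the sketch below shows how a multi-intent utterance could be mapped to a domain-intent-slot frame and flattened into a target sequence for a pointer-generator style decoder. The schema, field names, and example utterance are illustrative assumptions, not the actual MIVS ontology.

```python
# Minimal sketch of a 3-layer hierarchical semantic frame (domain -> intent -> slot).
# The schema and example are hypothetical, not the actual MIVS ontology.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    name: str
    value: str          # span copied from the input utterance


@dataclass
class Intent:
    name: str
    slots: List[Slot] = field(default_factory=list)


@dataclass
class Frame:
    domain: str
    intents: List[Intent] = field(default_factory=list)


# "Turn on the air conditioner and navigate to the nearest gas station."
frames = [
    Frame("vehicle_control", [Intent("turn_on", [Slot("device", "air conditioner")])]),
    Frame("navigation", [Intent("navigate", [Slot("destination", "nearest gas station")])]),
]


def flatten(frames: List[Frame]) -> List[str]:
    """Serialize the hierarchy into the token sequence a decoder would emit.
    Domains, intents, and slot names come from the ontology; slot values are
    copied from the input utterance."""
    tokens: List[str] = []
    for frame in frames:
        tokens.append(frame.domain)
        for intent in frame.intents:
            tokens.append(intent.name)
            for slot in intent.slots:
                tokens += [slot.name, "=", slot.value, ";"]
    return tokens


print(flatten(frames))
# ['vehicle_control', 'turn_on', 'device', '=', 'air conditioner', ';',
#  'navigation', 'navigate', 'destination', '=', 'nearest gas station', ';']
```

In this setting, a 3-way pointer-generator decoder would choose at each step between emitting a structural token, pointing to an ontology item (domain, intent, or slot name), or copying a value span from the input utterance; the flattening above only fixes what the target sequence looks like.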
Related papers
- A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents [12.62162175115002]
This study addresses three critical tasks: extracting multiple intent spans from queries, detecting multiple intents, and developing a multi-lingual intent dataset.
We introduce a novel multi-label multi-class intent detection dataset (MLMCID-dataset) curated from existing benchmark datasets.
We also propose a pointer network-based architecture (MLMCID) to extract intent spans and detect multiple intents with coarse and fine-grained labels in the form of sextuplets.
arXiv Detail & Related papers (2024-10-29T19:10:12Z)
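As a rough illustration of pointer-based span extraction (a simplified stand-in, not the actual MLMCID architecture), the snippet below scores start and end positions over contextual token representations with two pointer heads; module names and dimensions are assumptions.

```python
# Rough sketch of pointer-style intent-span extraction (illustrative only):
# two heads score start/end positions over the encoder states of the query.
import torch
import torch.nn as nn


class SpanPointer(nn.Module):
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.start_head = nn.Linear(hidden, 1)
        self.end_head = nn.Linear(hidden, 1)

    def forward(self, enc: torch.Tensor):
        # enc: (batch, seq_len, hidden) contextual token representations
        start_logits = self.start_head(enc).squeeze(-1)   # (batch, seq_len)
        end_logits = self.end_head(enc).squeeze(-1)       # (batch, seq_len)
        return start_logits, end_logits


enc = torch.randn(1, 12, 256)                 # fake encoder output for a 12-token query
start_logits, end_logits = SpanPointer()(enc)
start = start_logits.argmax(-1).item()        # predicted span start
end = end_logits.argmax(-1).item()            # predicted span end (no ordering constraint here)
print(start, end)
```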
- Segment Any 3D Object with Language [58.471327490684295]
We introduce Segment any 3D Object with LanguagE (SOLE), a semantic- and geometric-aware visual-language learning framework with strong generalizability.
Specifically, we propose a multimodal fusion network to incorporate multimodal semantics in both backbone and decoder.
Our SOLE outperforms previous methods by a large margin on ScanNetv2, ScanNet200, and Replica benchmarks.
arXiv Detail & Related papers (2024-04-02T17:59:10Z)
- Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment [53.2701026843921]
Large-scale pre-trained Vision Language Models (VLMs) have proven effective for zero-shot classification.
In this paper, we aim at a more challenging setting, Realistic Zero-Shot Classification, which assumes no annotation but instead a broad vocabulary.
We propose the Self Structural Semantic Alignment (S3A) framework, which extracts structural semantic information from unlabeled data while simultaneously self-learning.
arXiv Detail & Related papers (2023-08-24T17:56:46Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to incorporate the current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- Dialogue Meaning Representation for Task-Oriented Dialogue Systems [51.91615150842267]
We propose Dialogue Meaning Representation (DMR), a flexible and easily extendable representation for task-oriented dialogue.
Our representation contains a set of nodes and edges with an inheritance hierarchy to represent rich compositional semantics and task-specific concepts.
We propose two evaluation tasks to evaluate different machine learning based dialogue models, and further propose a novel coreference resolution model GNNCoref for the graph-based coreference resolution task.
arXiv Detail & Related papers (2022-04-23T04:17:55Z)
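A minimal sketch of a graph-style meaning representation with typed nodes and labeled edges, where coreference is just another edge a graph model can predict; the node and edge inventory here is invented for illustration and is not the DMR specification.

```python
# Toy graph-style meaning representation: typed nodes plus labeled edges.
# Node/edge types here are made up; DMR defines its own inventory with an
# inheritance hierarchy over types.
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Node:
    node_id: str
    node_type: str          # e.g. an intent, entity, or operator type
    text: str = ""


graph_nodes: Dict[str, Node] = {
    "n0": Node("n0", "intent", "book_restaurant"),
    "n1": Node("n1", "entity", "table for two"),
    "n2": Node("n2", "entity", "it"),          # anaphoric mention
}

# (source, label, target) triples; a coreference link is just another labeled
# edge, which is what a GNN-based coreference model would be asked to predict.
graph_edges: List[Tuple[str, str, str]] = [
    ("n0", "arg:item", "n1"),
    ("n2", "coref", "n1"),
]

for src, label, tgt in graph_edges:
    print(f"{graph_nodes[src].text} --{label}--> {graph_nodes[tgt].text}")
```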
- A Template-guided Hybrid Pointer Network for Knowledge-based Task-oriented Dialogue Systems [15.654119998970499]
We propose a template-guided hybrid pointer network for the knowledge-based task-oriented dialogue system.
We design a memory pointer network model with a gating mechanism to fully exploit the semantic correlation between the retrieved answers and the ground-truth response.
arXiv Detail & Related papers (2021-06-10T15:49:26Z)
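A hedged sketch of the gating idea: a scalar gate learned from the decoder state decides how much probability mass comes from copying the retrieved answer versus generating from the vocabulary. Names, shapes, and the exact mixing rule are assumptions rather than the paper's formulation.

```python
# Sketch of a gating mechanism that mixes a copy distribution over a retrieved
# answer with a generation distribution over the vocabulary (illustrative only).
import torch
import torch.nn as nn


class GatedMixer(nn.Module):
    def __init__(self, hidden: int, vocab: int):
        super().__init__()
        self.gate = nn.Linear(hidden, 1)
        self.generator = nn.Linear(hidden, vocab)

    def forward(self, dec_state: torch.Tensor, copy_dist: torch.Tensor):
        # dec_state: (batch, hidden)  decoder state at the current step
        # copy_dist: (batch, vocab)   probability mass scattered onto tokens of the retrieved answer
        g = torch.sigmoid(self.gate(dec_state))                  # (batch, 1), how much to copy
        gen_dist = torch.softmax(self.generator(dec_state), -1)  # (batch, vocab)
        return g * copy_dist + (1 - g) * gen_dist                # mixed output distribution


mixer = GatedMixer(hidden=128, vocab=50)
dec_state = torch.randn(2, 128)
copy_dist = torch.softmax(torch.randn(2, 50), -1)
print(mixer(dec_state, copy_dist).shape)   # torch.Size([2, 50])
```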
- Recurrent Neural Networks with Mixed Hierarchical Structures for Natural Language Processing [13.960152426268767]
Hierarchical structures exist in both linguistics and Natural Language Processing (NLP) tasks.
How to design RNNs to learn hierarchical representations of natural languages remains a long-standing challenge.
In this paper, we define two different types of boundaries referred to as static and dynamic boundaries, respectively, and then use them to construct a multi-layer hierarchical structure for document classification tasks.
arXiv Detail & Related papers (2021-06-04T15:50:42Z)
- Automatic Intent-Slot Induction for Dialogue Systems [5.6195418981579435]
We propose a new task of automatic intent-slot induction together with a novel domain-independent tool.
That is, we design a coarse-to-fine three-step procedure comprising Role-labeling, Concept-mining, And Pattern-mining (RCAP).
We show that our RCAP can generate satisfactory SLU schema and outperforms the state-of-the-art supervised learning method.
arXiv Detail & Related papers (2021-03-16T07:21:31Z)
- AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling [69.59096090788125]
In this paper, we propose an Adaptive Graph-Interactive Framework (AGIF) for joint multiple intent detection and slot filling.
We introduce an intent-slot graph interaction layer to model the strong correlation between the slot and intents.
Such an interaction layer is applied to each token adaptively, which has the advantage of automatically extracting the relevant intent information.
arXiv Detail & Related papers (2020-04-21T15:07:34Z)
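To give a flavor of token-level intent-slot interaction (a simplified stand-in, not AGIF's exact layer), the sketch below lets each token's slot-filling state attend over the embeddings of the predicted intents and fuses the result back into the token representation.

```python
# Simplified stand-in for a token-level intent-slot interaction layer (not
# AGIF's exact formulation): each token's hidden state attends over the
# embeddings of the intents predicted for the utterance.
import torch
import torch.nn as nn


class IntentSlotInteraction(nn.Module):
    def __init__(self, hidden: int):
        super().__init__()
        self.query = nn.Linear(hidden, hidden)
        self.fuse = nn.Linear(2 * hidden, hidden)

    def forward(self, tokens: torch.Tensor, intents: torch.Tensor):
        # tokens:  (batch, seq_len, hidden)   slot-filling hidden states
        # intents: (batch, n_intents, hidden) embeddings of predicted intents
        scores = self.query(tokens) @ intents.transpose(1, 2)      # (batch, seq_len, n_intents)
        attn = torch.softmax(scores, dim=-1)
        intent_ctx = attn @ intents                                 # (batch, seq_len, hidden)
        return self.fuse(torch.cat([tokens, intent_ctx], dim=-1))   # updated token states


layer = IntentSlotInteraction(hidden=64)
out = layer(torch.randn(2, 10, 64), torch.randn(2, 3, 64))
print(out.shape)   # torch.Size([2, 10, 64])
```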
- MA-DST: Multi-Attention Based Scalable Dialog State Tracking [13.358314140896937]
Task-oriented dialog agents provide a natural language interface for users to complete their goals, and Dialog State Tracking (DST) keeps track of those goals throughout the conversation.
To enable accurate multi-domain DST, the model needs to encode dependencies between past utterances and slot semantics.
We introduce a novel architecture for this task to encode the conversation history and slot semantics.
arXiv Detail & Related papers (2020-02-07T05:34:58Z)
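As a loose illustration of conditioning on both conversation history and slot semantics (not MA-DST's actual multi-attention stack), the sketch below uses an encoding of the slot name as the query for attention over the encoded history and scores candidate values against the resulting summary.

```python
# Loose illustration (not the actual MA-DST architecture): an encoding of the
# slot name attends over the encoded conversation history, and candidate slot
# values are scored against the resulting summary.
import torch
import torch.nn as nn


class SlotHistoryAttention(nn.Module):
    def __init__(self, hidden: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.scorer = nn.Linear(hidden, hidden)

    def forward(self, slot_query: torch.Tensor, history: torch.Tensor, values: torch.Tensor):
        # slot_query: (batch, 1, hidden)         encoded slot name, e.g. "restaurant-area"
        # history:    (batch, hist_len, hidden)  encoded dialog history tokens
        # values:     (batch, n_values, hidden)  encoded candidate slot values
        summary, _ = self.attn(slot_query, history, history)       # (batch, 1, hidden)
        logits = values @ self.scorer(summary).transpose(1, 2)     # (batch, n_values, 1)
        return logits.squeeze(-1)                                  # score per candidate value


model = SlotHistoryAttention(hidden=64)
scores = model(torch.randn(2, 1, 64), torch.randn(2, 30, 64), torch.randn(2, 5, 64))
print(scores.shape)   # torch.Size([2, 5])
```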
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.