Building the Intent Landscape of Real-World Conversational Corpora with
Extractive Question-Answering Transformers
- URL: http://arxiv.org/abs/2208.12886v2
- Date: Tue, 30 Aug 2022 16:03:38 GMT
- Title: Building the Intent Landscape of Real-World Conversational Corpora with
Extractive Question-Answering Transformers
- Authors: Jean-Philippe Corbeil, Mia Taige Li, Hadi Abdi Ghavidel
- Abstract summary: We propose an unsupervised pipeline that extracts intents and the taxonomy of intents from real-world dialogues.
Our results demonstrate the generalization ability of an ELECTRA large model fine-tuned on the SQuAD2 dataset to understand dialogues.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: For companies with customer service, mapping intents inside their
conversational data is crucial in building applications based on natural
language understanding (NLU). Nevertheless, there is no established automated
technique to gather the intents from noisy online chats or voice transcripts.
Simple clustering approaches are not suited to intent-sparse dialogues. To
solve this intent-landscape task, we propose an unsupervised pipeline that
extracts the intents and the taxonomy of intents from real-world dialogues. Our
pipeline mines intent-span candidates with an extractive Question-Answering
Electra model and leverages sentence embeddings to apply a low-level density
clustering followed by a top-level hierarchical clustering. Our results
demonstrate the generalization ability of an ELECTRA large model fine-tuned on
the SQuAD2 dataset to understand dialogues. With the right prompting question,
this model achieves a rate of linguistic validation on intent spans beyond 85%.
We furthermore reconstructed the intent schemes of five domains from the
MultiDoGo dataset with an average recall of 94.3%.
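As a rough illustration of the pipeline the abstract describes, the sketch below wires together off-the-shelf components: a Hugging Face question-answering pipeline standing in for the SQuAD2-fine-tuned ELECTRA span extractor, a sentence-transformers encoder, HDBSCAN for the low-level density clustering, and scikit-learn agglomerative clustering for the top-level hierarchical step. This is a minimal sketch under assumptions, not the authors' released code: the QA checkpoint name, the prompting question, the sentence encoder, and all clustering hyper-parameters are placeholders.

```python
"""Minimal sketch of an intent-landscape pipeline (illustrative only)."""
import numpy as np
import hdbscan
from transformers import pipeline
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering

# 1) Mine intent-span candidates with an extractive QA model.
#    Placeholder checkpoint: swap in an ELECTRA-large model fine-tuned on SQuAD2.
qa = pipeline("question-answering", model="deepset/electra-base-squad2")

PROMPT = "What does the customer want?"  # assumed prompting question, not the paper's

def mine_intent_spans(dialogues, min_score=0.1):
    """Extract one intent-span candidate per dialogue, keeping confident answers."""
    spans = []
    for dialogue in dialogues:
        answer = qa(question=PROMPT, context=dialogue)
        if answer["score"] >= min_score:
            spans.append(answer["answer"])
    return spans

# 2) Embed the candidate spans, then cluster at two levels:
#    low-level density clustering (HDBSCAN here), followed by a top-level
#    hierarchical clustering over the low-level cluster centroids.
def build_intent_landscape(spans, n_top_level=5):
    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed sentence encoder
    embeddings = encoder.encode(spans)

    low_level = hdbscan.HDBSCAN(min_cluster_size=5).fit(embeddings)
    labels = low_level.labels_

    # Centroid of each low-level cluster (HDBSCAN noise points, label -1, are skipped).
    cluster_ids = sorted(set(labels) - {-1})
    centroids = np.stack(
        [embeddings[labels == cid].mean(axis=0) for cid in cluster_ids]
    )

    top_level = AgglomerativeClustering(n_clusters=n_top_level).fit(centroids)
    # Map each low-level intent cluster to a node of the top-level taxonomy.
    return {cid: int(top) for cid, top in zip(cluster_ids, top_level.labels_)}
```

As the abstract notes, the quality of the mined spans hinges on the prompting question, so in practice one would validate candidate questions on a small labelled sample before running the clustering stages.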
Related papers
- Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification [6.459396785817196]
Chain-of-Intent generates intent-driven conversations through self-play.
MINT-CL is a framework for multi-turn intent classification using multi-task contrastive learning.
We release MINT-E, a multilingual, intent-aware multi-turn e-commerce dialogue corpus.
arXiv Detail & Related papers (2024-11-21T15:59:29Z)
- Improved intent classification based on context information using a windows-based approach [0.0]
The intent classification task aims at identifying what a user is attempting to achieve from an utterance.
Previous works use only the current utterance to predict the intent of a given query.
We propose several approaches to investigate the role of contextual information for the intent classification task.
arXiv Detail & Related papers (2024-11-09T00:56:02Z)
- Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model [57.78191634042409]
We propose Pseudo-Word HuBERT (PW-HuBERT), a framework that integrates pseudo word-level targets into the training process.
Our experimental results on four spoken language understanding (SLU) benchmarks suggest the superiority of our model in capturing semantic information.
arXiv Detail & Related papers (2024-02-08T16:55:21Z)
- Tri-level Joint Natural Language Understanding for Multi-turn Conversational Datasets [5.3361357265365035]
We present a novel tri-level joint natural language understanding approach that adds a domain level and explicitly exchanges semantic information between all levels.
We evaluate our model on two multi-turn datasets for which we are the first to conduct joint slot-filling and intent detection.
arXiv Detail & Related papers (2023-05-28T13:59:58Z)
- Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings [41.79937481022846]
We introduce the task of learning unsupervised dialogue embeddings.
Trivial approaches such as combining pre-trained word or sentence embeddings and encoding through pre-trained language models have been shown to be feasible.
We propose a self-guided contrastive learning approach named dial2vec.
arXiv Detail & Related papers (2022-10-27T11:14:06Z)
- Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding [103.94325597273316]
We present a novel approach that iterates on augmentation quality by applying weakly-supervised filters.
We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue.
For DailyDialog specifically, using 10% of the ground truth data we outperform the current state-of-the-art model which uses 100% of the data.
arXiv Detail & Related papers (2022-10-25T17:01:30Z)
- End-to-end Spoken Conversational Question Answering: Task, Dataset and Model [92.18621726802726]
In spoken question answering, the systems are designed to answer questions from contiguous text spans within the related speech transcripts.
We propose a new Spoken Conversational Question Answering task (SCQA), aiming at enabling the systems to model complex dialogue flows.
Our main objective is to build a system that handles conversational questions over audio recordings and to explore the plausibility of providing additional cues from different modalities to aid information gathering.
arXiv Detail & Related papers (2022-04-29T17:56:59Z)
- Structure Extraction in Task-Oriented Dialogues with Slot Clustering [94.27806592467537]
In task-oriented dialogues, dialogue structure has often been considered as transition graphs among dialogue states.
We propose a simple yet effective approach for structure extraction in task-oriented dialogues.
arXiv Detail & Related papers (2022-02-28T20:18:12Z)
- Speaker-Conditioned Hierarchical Modeling for Automated Speech Scoring [60.55025339250815]
We propose a novel deep learning technique for non-native ASS, called speaker-conditioned hierarchical modeling.
In our technique, we take advantage of the fact that oral proficiency tests rate multiple responses for a candidate. We extract context from these responses and feed it as additional speaker-specific context to our network to score a particular response.
arXiv Detail & Related papers (2021-08-30T07:00:28Z)
- Intent Mining from past conversations for conversational agent [1.9754522186574608]
Bots are increasingly being deployed to provide round-the-clock support and to increase customer engagement.
Many of the commercial bot building frameworks follow a standard approach that requires one to build and train an intent model to recognize a user input.
We have introduced a novel density-based clustering algorithm ITERDB-LabelSCAN for unbalanced data clustering.
arXiv Detail & Related papers (2020-05-22T05:29:13Z)
- IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems [80.0781718687327]
We analyze user intent patterns in information-seeking conversations and propose an intent-aware neural response ranking model, "IART".
IART is built on top of the integration of user intent modeling and language representation learning with the Transformer architecture.
arXiv Detail & Related papers (2020-02-03T05:59:52Z)