Related papers: Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction

Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction

URL: http://arxiv.org/abs/2410.18481v2
Date: Tue, 05 Nov 2024 11:40:07 GMT
Title: Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction
Authors: Sergio Burdisso, Srikanth Madikeri, Petr Motlicek,
Abstract summary: We introduce Dialog2Flow embeddings, which group dialogs according to their communicative and informative functions. By clustering D2F embeddings, the latent space is quantized, and dialogs can be converted into sequences of region/action IDs. We show that D2F yields superior qualitative and quantitative results across diverse domains.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Efficiently deriving structured workflows from unannotated dialogs remains an underexplored and formidable challenge in computational linguistics. Automating this process could significantly accelerate the manual design of workflows in new domains and enable the grounding of large language models in domain-specific flowcharts, enhancing transparency and controllability. In this paper, we introduce Dialog2Flow (D2F) embeddings, which differ from conventional sentence embeddings by mapping utterances to a latent space where they are grouped according to their communicative and informative functions (i.e., the actions they represent). D2F allows for modeling dialogs as continuous trajectories in a latent space with distinct action-related regions. By clustering D2F embeddings, the latent space is quantized, and dialogs can be converted into sequences of region/action IDs, facilitating the extraction of the underlying workflow. To pre-train D2F, we build a comprehensive dataset by unifying twenty task-oriented dialog datasets with normalized per-turn action annotations. We also introduce a novel soft contrastive loss that leverages the semantic information of these actions to guide the representation learning process, showing superior performance compared to standard supervised contrastive loss. Evaluation against various sentence embeddings, including dialog-specific ones, demonstrates that D2F yields superior qualitative and quantitative results across diverse domains.

Related papers

On Mitigating Data Sparsity in Conversational Recommender Systems [69.70761335240738]
Conversational recommender systems (CRSs) capture user preference through textual information in dialogues.<n>They suffer from data sparsity on two fronts: the dialogue space is vast and linguistically diverse, while the item space exhibits long-tail and sparse distributions.<n>Existing methods struggle with (1) generalizing to varied dialogue expressions due to underutilization of rich textual cues, and (2) learning informative item representations under severe sparsity.
arXiv Detail & Related papers (2025-07-01T06:54:51Z)
Semantic-Space-Intervened Diffusive Alignment for Visual Classification [11.621655970763467]
Cross-modal alignment is an effective approach to improving visual classification.<n>This paper proposes a novel Semantic-Space-Intervened Diffusive Alignment method, termed SeDA.<n> Experimental results show that SeDA achieves stronger cross-modal feature alignment, leading to superior performance over existing methods.
arXiv Detail & Related papers (2025-05-09T01:41:23Z)
EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting [108.15136508964011]
EgoSplat is a language-embedded 3D Gaussian Splatting framework for open-vocabulary egocentric scene understanding. EgoSplat achieves state-of-the-art performance in both localization and segmentation tasks on two datasets.
arXiv Detail & Related papers (2025-03-14T12:21:26Z)
Lifting Scheme-Based Implicit Disentanglement of Emotion-Related Facial Dynamics in the Wild [3.3905929183808796]
In-the-wild dynamic facial expression recognition (DFER) encounters a significant challenge in recognizing emotion-related expressions. We propose a novel Implicit Facial Dynamics Disentanglement framework (IFDD) IFDD disentangles emotion-related dynamic information from emotion-irrelevant global context in an implicit manner.
arXiv Detail & Related papers (2024-12-17T18:45:53Z)
Towards Automatic Evaluation of Task-Oriented Dialogue Flows [5.146847146797646]
We introduce FuDGE (Fuzzy Dialogue-Graph Edit Distance), a novel metric for evaluating the quality of dialogue flows. FuDGE measures how well individual conversations align with a flow and, consequently, how well a set of conversations is represented by the flow overall. By standardizing and optimizing dialogue flows, FuDGE enables conversational designers and automated techniques to achieve higher levels of efficiency and automation.
arXiv Detail & Related papers (2024-11-15T18:35:00Z)
Unsupervised Flow Discovery from Task-oriented Dialogues [0.988655456942026]
We propose an approach for the unsupervised discovery of flows from dialogue history. We present concrete examples of flows, discovered from MultiWOZ, a public TOD dataset.
arXiv Detail & Related papers (2024-05-02T15:54:36Z)
TOD-Flow: Modeling the Structure of Task-Oriented Dialogues [77.15457469745364]
We propose a novel approach focusing on inferring the TOD-Flow graph from dialogue data annotated with dialog acts. The inferred TOD-Flow graph can be easily integrated with any dialogue model to improve its prediction performance, transparency, and controllability.
arXiv Detail & Related papers (2023-12-07T20:06:23Z)
Zero-Shot Generalizable End-to-End Task-Oriented Dialog System using Context Summarization and Domain Schema [2.7178968279054936]
State-of-the-art approaches in task-oriented dialog systems formulate the problem as a conditional sequence generation task. This requires labeled training data for each new domain or task. We introduce a novel Zero-Shot generalizable end-to-end Task-oriented Dialog system, ZS-ToD.
arXiv Detail & Related papers (2023-03-28T18:56:31Z)
SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora. Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
Manual-Guided Dialogue for Flexible Conversational Agents [84.46598430403886]
How to build and use dialogue data efficiently, and how to deploy models in different domains at scale can be critical issues in building a task-oriented dialogue system. We propose a novel manual-guided dialogue scheme, where the agent learns the tasks from both dialogue and manuals. Our proposed scheme reduces the dependence of dialogue models on fine-grained domain ontology, and makes them more flexible to adapt to various domains.
arXiv Detail & Related papers (2022-08-16T08:21:12Z)
DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act Recognition and Sentiment Classification [77.59549450705384]
In dialog system, dialog act recognition and sentiment classification are two correlative tasks. Most of the existing systems either treat them as separate tasks or just jointly model the two tasks. We propose a Deep Co-Interactive Relation Network (DCR-Net) to explicitly consider the cross-impact and model the interaction between the two tasks.
arXiv Detail & Related papers (2020-08-16T14:13:32Z)
Video-Grounded Dialogues with Pretrained Generation Language Models [88.15419265622748]
We leverage the power of pre-trained language models for improving video-grounded dialogue. We propose a framework by formulating sequence-to-grounded dialogue tasks as a sequence-to-grounded task. Our framework allows fine-tuning language models to capture dependencies across multiple modalities.
arXiv Detail & Related papers (2020-06-27T08:24:26Z)
Non-Autoregressive Dialog State Tracking [122.2328875457225]
We propose a novel framework of Non-Autoregressive Dialog State Tracking (NADST) NADST can factor in potential dependencies among domains and slots to optimize the models towards better prediction of dialogue states as a complete set rather than separate slots. Our results show that our model achieves the state-of-the-art joint accuracy across all domains on the MultiWOZ 2.1 corpus.
arXiv Detail & Related papers (2020-02-19T06:39:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.