Multi-View Zero-Shot Open Intent Induction from Dialogues: Multi Domain Batch and Proxy Gradient Transfer
- URL: http://arxiv.org/abs/2303.13099v3
- Date: Sun, 13 Aug 2023 15:06:43 GMT
- Title: Multi-View Zero-Shot Open Intent Induction from Dialogues: Multi Domain Batch and Proxy Gradient Transfer
- Authors: Hyukhun Koh, Haesung Pyun, Nakyeong Yang, Kyomin Jung
- Abstract summary: In a Task-Oriented Dialogue (TOD) system, detecting and inducing new intents are two main challenges for applying the system in the real world.
We propose a semantic multi-view model to resolve these two challenges.
We introduce a novel method, PGT, which employs a Siamese network to fine-tune the model directly with a clustering method.
- Score: 16.804434185847363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a Task-Oriented Dialogue (TOD) system, detecting and inducing new
intents are two main challenges for applying the system in the real world. In
this paper, we propose a semantic multi-view model to resolve these two
challenges: (1) SBERT for General Embedding (GE), (2) Multi Domain Batch (MDB)
for dialogue domain knowledge, and (3) Proxy Gradient Transfer (PGT) for
cluster-specialized semantics. MDB feeds diverse dialogue datasets to the model
at once, tackling the multi-domain problem by learning knowledge from multiple
domains. We introduce a novel method, PGT, which employs a Siamese network to
fine-tune the model directly with a clustering method. Using PGT, our model
learns how to cluster dialogue utterances. Experimental results demonstrate
that our multi-view model with MDB and PGT significantly improves Open Intent
Induction performance compared to baseline systems.
Related papers
- SLIDE: A Framework Integrating Small and Large Language Models for Open-Domain Dialogues Evaluation [23.203761925540736]
We propose a novel framework, SLIDE (Small and Large Integrated for Dialogue Evaluation).
Our approach achieves state-of-the-art performance on both the classification and evaluation tasks, and SLIDE additionally exhibits better correlation with human evaluators.
arXiv Detail & Related papers (2024-05-24T20:32:49Z)
- ChatterBox: Multi-round Multimodal Referring and Grounding [108.9673313949746]
We present a new benchmark and an efficient vision-language model for this purpose.
The proposed model, named ChatterBox, utilizes a two-branch architecture to collaboratively handle vision and language tasks.
Experiments show that ChatterBox outperforms existing models in MRG both quantitatively and qualitatively.
arXiv Detail & Related papers (2024-01-24T09:02:00Z)
- DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever [83.33209603041013]
We propose a parameter-efficient prompt-tuning method named DialCLIP for multi-modal dialog retrieval.
Our approach introduces a multi-modal context generator to learn context features which are distilled into prompts within the pre-trained vision-language model CLIP.
To facilitate various types of retrieval, we also design multiple experts to learn mappings from CLIP outputs to multi-modal representation space.
arXiv Detail & Related papers (2024-01-02T07:40:12Z)
- Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation [53.87485260058957]
We study video-grounded dialogue generation, where a response is generated based on the dialogue context and the associated video.
The primary challenge of this task lies in the difficulty of integrating video data into pre-trained language models (PLMs).
We propose a multi-agent reinforcement learning method to collaboratively perform reasoning on different modalities.
arXiv Detail & Related papers (2022-10-22T14:45:29Z)
- GRASP: Guiding model with RelAtional Semantics using Prompt [3.1275060062551208]
We propose a Guiding model with RelAtional Semantics using Prompt (GRASP).
We adopt a prompt-based fine-tuning approach and capture relational semantic clues of a given dialogue with an argument-aware prompt marker strategy.
In experiments, GRASP achieves state-of-the-art performance in terms of both F1 and F1c scores on the DialogRE dataset.
arXiv Detail & Related papers (2022-08-26T08:19:28Z)
- Manual-Guided Dialogue for Flexible Conversational Agents [84.46598430403886]
How to build and use dialogue data efficiently, and how to deploy models across different domains at scale, are critical issues in building a task-oriented dialogue system.
We propose a novel manual-guided dialogue scheme, where the agent learns the tasks from both dialogue and manuals.
Our proposed scheme reduces the dependence of dialogue models on fine-grained domain ontologies and makes them more flexible in adapting to various domains.
arXiv Detail & Related papers (2022-08-16T08:21:12Z)
- Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition [63.07844685982738]
This paper presents a new model, the Gated Bidirectional Alignment Network (GBAN), which consists of an attention-based bidirectional alignment network over LSTM hidden states with gated fusion of the aligned representations (a generic gated-fusion sketch follows this list).
We empirically show that the attention-aligned representations significantly outperform the last hidden states of the LSTM.
The proposed GBAN model outperforms existing state-of-the-art multimodal approaches on the IEMOCAP dataset.
arXiv Detail & Related papers (2022-01-17T09:46:59Z)
- Multi-Task Learning for Situated Multi-Domain End-to-End Dialogue Systems [21.55075825370981]
We leverage multi-task learning techniques to train a GPT-2-based model on a more challenging dataset.
Our method achieves better performance on all sub-tasks, across domains, compared to task and domain-specific models.
arXiv Detail & Related papers (2021-10-11T12:36:30Z)
- Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System [26.837972034630003]
PPTOD is a unified plug-and-play model for task-oriented dialogue.
We extensively test our model on three benchmark TOD tasks, including end-to-end dialogue modelling, dialogue state tracking, and intent classification.
arXiv Detail & Related papers (2021-09-29T22:02:18Z)
- Meta Dialogue Policy Learning [58.045067703675095]
We propose Deep Transferable Q-Network (DTQN) to utilize shareable low-level signals between domains.
We decompose the state and action representation space into feature subspaces corresponding to these low-level components.
In experiments, our model outperforms baseline models in terms of both success rate and dialogue efficiency.
arXiv Detail & Related papers (2020-06-03T23:53:06Z)
- Multi-Domain Dialogue Acts and Response Co-Generation [34.27525685962274]
We propose a neural co-generation model that generates dialogue acts and responses concurrently.
Our model achieves favorable improvements over several state-of-the-art models in both automatic and human evaluations.
arXiv Detail & Related papers (2020-04-26T12:21:17Z)