Which Kind Is Better in Open-domain Multi-turn Dialog, Hierarchical or Non-hierarchical Models? An Empirical Study
- URL: http://arxiv.org/abs/2008.02964v1
- Date: Fri, 7 Aug 2020 02:54:55 GMT
- Title: Which Kind Is Better in Open-domain Multi-turn Dialog, Hierarchical or Non-hierarchical Models? An Empirical Study
- Authors: Tian Lan, Xian-Ling Mao, Wei Wei, Heyan Huang
- Abstract summary: There are two kinds of models for open-domain multi-turn dialog generation: hierarchical and non-hierarchical models.
In this paper, we systematically evaluate nearly all representative hierarchical and non-hierarchical models under the same experimental settings to determine which kind is better.
The excellent performance of HRAN mainly depends on its word-level attention mechanism.
- Score: 52.66393833841219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Currently, open-domain generative dialog systems have attracted considerable
attention in academia and industry. Despite the success of single-turn dialog generation, multi-turn dialog generation remains a major challenge. So far,
there are two kinds of models for open-domain multi-turn dialog generation:
hierarchical and non-hierarchical models. Recently, some works have shown that
the hierarchical models are better than non-hierarchical models under their
experimental settings; meanwhile, some works also demonstrate the opposite
conclusion. Due to the lack of adequate comparisons, it is not clear which kind of model is better in open-domain multi-turn dialog generation. Thus, in this paper, we systematically evaluate nearly all representative hierarchical and non-hierarchical models under the same experimental settings to determine which kind is better. Through extensive experiments, we reach the following three
important conclusions: (1) Nearly all hierarchical models are worse than
non-hierarchical models in open-domain multi-turn dialog generation, except for
the HRAN model. Further analysis shows that the excellent performance of HRAN mainly depends on its word-level attention mechanism; (2) The performance of the other hierarchical models also improves greatly when the word-level attention mechanism is integrated into them; the modified hierarchical models even significantly outperform the non-hierarchical models; (3) The word-level attention mechanism is so powerful for hierarchical models because it leverages context information more effectively, especially fine-grained information. In addition, we have implemented all of the models and released the code.
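To make conclusion (3) concrete, the following is a minimal PyTorch sketch of a word-level attention layer over the dialog context, in the spirit of HRAN. The class name, tensor shapes, and additive scoring function are illustrative assumptions rather than the authors' released implementation, which additionally combines word-level with utterance-level attention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordLevelContextAttention(nn.Module):
    """Illustrative sketch of word-level attention over the dialog context.

    Given the decoder's current hidden state, attend over the word-level hidden
    states of every context utterance, so the decoder can use fine-grained
    context information instead of only utterance-level summaries.
    Names and shapes are assumptions, not the authors' released code.
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.query_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.key_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.score = nn.Linear(hidden_size, 1, bias=False)

    def forward(self, decoder_state, word_states, word_mask):
        # decoder_state: (batch, hidden)                 current decoder hidden state
        # word_states:   (batch, turns, words, hidden)   word-level encoder states per utterance
        # word_mask:     (batch, turns, words)           1 for real tokens, 0 for padding
        b, t, w, h = word_states.shape
        flat = word_states.view(b, t * w, h)                        # flatten turns x words
        query = self.query_proj(decoder_state).unsqueeze(1)         # (b, 1, h)
        energy = self.score(torch.tanh(query + self.key_proj(flat))).squeeze(-1)  # (b, t*w)
        energy = energy.masked_fill(word_mask.view(b, t * w) == 0, float("-inf"))
        weights = F.softmax(energy, dim=-1)                         # attention over every context word
        context = torch.bmm(weights.unsqueeze(1), flat).squeeze(1)  # (b, h) fine-grained context vector
        return context, weights
```

At each decoding step, such a layer would be queried with the current decoder state, and the returned context vector concatenated with the decoder input, which is how the fine-grained context information reaches generation.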
Related papers
- HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models [28.993221775758702]
Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability.
This paper marks a significant advance toward more flexible and comprehensive model merging techniques.
We train policy and value networks using offline sampling of weight vectors, which are then employed for the online optimization of merging strategies.
arXiv Detail & Related papers (2024-09-27T16:31:31Z)
- Using Game Play to Investigate Multimodal and Conversational Grounding in Large Multimodal Models [14.878276985702685]
In this paper, we bring a recently developed evaluation paradigm from text models to multimodal models.
We define games that challenge a model's capability to represent a situation from visual information and align such representations through dialogue.
We find that the largest closed models perform rather well on the games that we define, while even the best open-weight models struggle with them.
arXiv Detail & Related papers (2024-06-20T06:56:19Z)
- Language Models are General-Purpose Interfaces [109.45478241369655]
We propose to use language models as a general-purpose interface to various foundation models.
A collection of pretrained encoders perceives diverse modalities (such as vision and language).
We propose a semi-causal language modeling objective to jointly pretrain the interface and the modular encoders.
arXiv Detail & Related papers (2022-06-13T17:34:22Z)
- Modeling Heterogeneous Hierarchies with Relation-specific Hyperbolic Cones [64.75766944882389]
We present ConE (Cone Embedding), a KG embedding model that is able to simultaneously model multiple hierarchical as well as non-hierarchical relations in a knowledge graph.
In particular, ConE uses cone containment constraints in different subspaces of the hyperbolic embedding space to capture multiple heterogeneous hierarchies.
Our approach yields a new state-of-the-art Hits@1 of 45.3% on WN18RR and 16.1% on DDB14 (0.231 MRR).
arXiv Detail & Related papers (2021-10-28T07:16:08Z)
- Hierarchical Modeling for Out-of-Scope Domain and Intent Classification [55.23920796595698]
This paper focuses on out-of-scope intent classification in dialog systems.
We propose a hierarchical multi-task learning approach based on a joint model to classify domain and intent simultaneously.
Experiments show that the model outperforms existing methods in terms of accuracy, out-of-scope recall and F1.
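As a rough illustration of such a joint model, the sketch below uses a shared utterance encoder with separate domain and intent heads trained with a summed cross-entropy loss; the names, sizes, and treatment of out-of-scope as an extra label are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class JointDomainIntentClassifier(nn.Module):
    # Shared utterance encoder with two task-specific heads (multi-task learning).
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_domains, num_intents):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.domain_head = nn.Linear(2 * hidden_dim, num_domains)  # may include an out-of-scope label
        self.intent_head = nn.Linear(2 * hidden_dim, num_intents)  # may include an out-of-scope label

    def forward(self, token_ids):
        _, h = self.encoder(self.embed(token_ids))           # h: (2, batch, hidden_dim)
        utterance = torch.cat([h[0], h[1]], dim=-1)           # concat forward/backward final states
        return self.domain_head(utterance), self.intent_head(utterance)

# Joint training: sum the two cross-entropy losses so the shared encoder
# learns from both the domain signal and the intent signal.
model = JointDomainIntentClassifier(vocab_size=10000, embed_dim=128,
                                    hidden_dim=256, num_domains=8, num_intents=51)
criterion = nn.CrossEntropyLoss()
tokens = torch.randint(1, 10000, (4, 20))        # dummy batch of 4 utterances
domain_gold = torch.randint(0, 8, (4,))
intent_gold = torch.randint(0, 51, (4,))
domain_logits, intent_logits = model(tokens)
loss = criterion(domain_logits, domain_gold) + criterion(intent_logits, intent_gold)
loss.backward()
```

Sharing the encoder lets the two classification signals regularize each other, which is the usual motivation for multi-task training in this setting.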
arXiv Detail & Related papers (2021-04-30T06:38:23Z)
- Polynomial Networks in Deep Classifiers [55.90321402256631]
We cast the study of deep neural networks under a unifying framework.
Our framework provides insights on the inductive biases of each model.
The efficacy of the proposed models is evaluated on standard image and audio classification benchmarks.
arXiv Detail & Related papers (2021-04-16T06:41:20Z)
- Controlling Style in Generated Dialogue [13.445455480452484]
We adapt three previously proposed controllable generation architectures to open-domain dialogue generation.
We control the style of the generated dialogue to match one of about 200 possible styles.
We show how they can be used to provide insights into existing conversational datasets.
arXiv Detail & Related papers (2020-09-22T23:21:04Z)
- Ranking Enhanced Dialogue Generation [77.8321855074999]
How to effectively utilize the dialogue history is a crucial problem in multi-turn dialogue generation.
Previous works usually employ various neural network architectures to model the history.
This paper proposes a Ranking Enhanced Dialogue generation framework.
arXiv Detail & Related papers (2020-08-13T01:49:56Z)