Learning Reasoning Paths over Semantic Graphs for Video-grounded
Dialogues
- URL: http://arxiv.org/abs/2103.00820v1
- Date: Mon, 1 Mar 2021 07:39:26 GMT
- Title: Learning Reasoning Paths over Semantic Graphs for Video-grounded
Dialogues
- Authors: Hung Le, Nancy F. Chen, Steven C.H. Hoi
- Abstract summary: We propose a novel framework of Reasoning Paths in Dialogue Context (PDC).
The PDC model discovers information flows among dialogue turns through a semantic graph constructed from lexical components in each question and answer.
Our model sequentially processes both visual and textual information through this reasoning path, and the propagated features are used to generate the answer.
- Score: 73.04906599884868
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compared to traditional visual question answering, video-grounded dialogues
require additional reasoning over dialogue context to answer questions in a
multi-turn setting. Previous approaches to video-grounded dialogues mostly use
dialogue context as a simple text input without modelling the inherent
information flows at the turn level. In this paper, we propose a novel
framework of Reasoning Paths in Dialogue Context (PDC). The PDC model discovers
information flows among dialogue turns through a semantic graph constructed
from lexical components in each question and answer. The PDC model then learns
to predict reasoning paths over this semantic graph. Our path prediction model
predicts a path from the current turn through past dialogue turns that contain
additional visual cues to answer the current question. Our reasoning model
sequentially processes both visual and textual information through this
reasoning path, and the propagated features are used to generate the answer. Our
experimental results demonstrate the effectiveness of our method and provide
additional insights on how models use semantic dependencies in a dialogue
context to retrieve visual cues.
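To make the core idea concrete, below is a minimal Python sketch of how a turn-level semantic graph and a reasoning path over it might look. It uses a simple keyword-overlap heuristic as a stand-in for the paper's lexical-component extraction and learned path-prediction model, and it omits the visual feature propagation; the function names and stopword list are illustrative assumptions, not the authors' code.
```python
# Minimal illustrative sketch (not the authors' implementation): build a
# semantic graph over dialogue turns from shared lexical components, then
# walk a reasoning path from the current turn back through past turns.
# The keyword-overlap heuristic stands in for the learned path predictor
# described in the paper; visual feature propagation is omitted.

from typing import Dict, List, Set

STOPWORDS = {"q", "a", "an", "the", "is", "are", "what", "in", "of", "it",
             "yes", "there"}


def lexical_components(turn: str) -> Set[str]:
    """Crude stand-in for the paper's lexical extraction (e.g., nouns/entities)."""
    return {tok.lower().strip("?.,!:") for tok in turn.split()} - STOPWORDS


def build_semantic_graph(turns: List[str]) -> Dict[int, List[int]]:
    """Link turn i to every earlier turn j that shares a lexical component."""
    comps = [lexical_components(t) for t in turns]
    return {i: [j for j in range(i) if comps[i] & comps[j]]
            for i in range(len(turns))}


def reasoning_path(graph: Dict[int, List[int]], current: int) -> List[int]:
    """Greedily walk from the current turn back through linked past turns."""
    path, node = [current], current
    while graph[node]:
        node = max(graph[node])  # prefer the most recent linked turn
        if node in path:
            break
        path.append(node)
    return path


if __name__ == "__main__":
    dialogue = [
        "Q: What is the man holding? A: He is holding a red cup.",
        "Q: Is there anyone else in the room? A: Yes, a woman near the window.",
        "Q: What does he do with the cup after drinking?",
    ]
    graph = build_semantic_graph(dialogue)
    print(reasoning_path(graph, current=2))  # -> [2, 0]
```
In the toy dialogue, the third question links back to the first turn through the shared component "cup"; that earlier turn is the kind of past turn carrying the additional visual cue that the paper's path predictor is trained to find.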
Related papers
- Unsupervised Extraction of Dialogue Policies from Conversations [3.102576158218633]
We show how Large Language Models can be instrumental in extracting dialogue policies from datasets.
We then propose a novel method for generating dialogue policies utilizing a controllable and interpretable graph-based methodology.
arXiv Detail & Related papers (2024-06-21T14:57:25Z) - Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way.
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
arXiv Detail & Related papers (2023-09-18T11:03:55Z) - PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue
Model [79.64376762489164]
PK-Chat is a Pointer network guided generative dialogue model, incorporating a unified pretrained language model and a pointer network over knowledge graphs.
The words generated by PK-Chat are derived both from predictions over word lists and from direct predictions over the external knowledge graph.
Based on PK-Chat, a dialogue system is built for academic scenarios in the geosciences.
arXiv Detail & Related papers (2023-04-02T18:23:13Z) - CTRLStruct: Dialogue Structure Learning for Open-Domain Response
Generation [38.60073402817218]
Well-structured topic flow can leverage background information and predict future topics to help generate controllable and explainable responses.
We present a new framework for dialogue structure learning to effectively explore topic-level dialogue clusters as well as their transitions with unlabelled information.
Experiments on two popular open-domain dialogue datasets show our model can generate more coherent responses compared to some excellent dialogue models.
arXiv Detail & Related papers (2023-03-02T09:27:11Z) - VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution [79.05412803762528]
The visual dialog task requires an AI agent to interact with humans in multi-round dialogs based on a visual environment.
We propose VD-PCR, a novel framework to improve Visual Dialog understanding with Pronoun Coreference Resolution.
With the proposed implicit and explicit methods, VD-PCR achieves state-of-the-art experimental results on the VisDial dataset.
arXiv Detail & Related papers (2022-05-29T15:29:50Z) - Back to the Future: Bidirectional Information Decoupling Network for
Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z) - Reasoning with Multi-Structure Commonsense Knowledge in Visual Dialog [12.034554338597067]
We propose a novel model by Reasoning with Multi-structure Commonsense Knowledge (RMK)
In our model, the external knowledge is represented with sentence-level facts and graph-level facts.
On top of these multi-structure representations, our model can capture relevant knowledge and incorporate it into the visual and semantic features.
arXiv Detail & Related papers (2022-04-10T13:12:10Z) - Graph Based Network with Contextualized Representations of Turns in
Dialogue [0.0]
Dialogue-based relation extraction (RE) aims to extract relation(s) between two arguments that appear in a dialogue.
We propose the TUrn COntext awaRE Graph Convolutional Network (TUCORE-GCN), designed to reflect the way people understand dialogues.
arXiv Detail & Related papers (2021-09-09T03:09:08Z) - ORD: Object Relationship Discovery for Visual Dialogue Generation [60.471670447176656]
We propose an object relationship discovery (ORD) framework to preserve the object interactions for visual dialogue generation.
A hierarchical graph convolutional network (HierGCN) is proposed to retain the object nodes and neighbour relationships locally, and then refines the object-object connections globally.
Experiments show that the proposed method can significantly improve dialogue quality by utilising the contextual information of visual relationships.
arXiv Detail & Related papers (2020-06-15T12:25:40Z) - Local Contextual Attention with Hierarchical Structure for Dialogue Act
Recognition [14.81680798372891]
We design a hierarchical model based on self-attention to capture intra-sentence and inter-sentence information.
Based on the finding that dialogue length affects performance, we introduce a new dialogue segmentation mechanism.
arXiv Detail & Related papers (2020-03-12T22:26:11Z)