Related papers: Uncertainty-based Visual Question Answering: Estimating Semantic Inconsistency between Image and Knowledge Base

URL: http://arxiv.org/abs/2207.13242v1
Date: Wed, 27 Jul 2022 01:58:29 GMT
Title: Uncertainty-based Visual Question Answering: Estimating Semantic Inconsistency between Image and Knowledge Base
Authors: Jinyeong Chae and Jihie Kim
Abstract summary: The knowledge-based visual question answering (KVQA) task aims to answer questions that require external knowledge in addition to an understanding of images and questions. Recent KVQA studies inject external knowledge in multi-modal form, but as more knowledge is used, irrelevant information may be added and can confuse question answering.
Score: 0.7081604594416336
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The knowledge-based visual question answering (KVQA) task aims to answer questions that require external knowledge in addition to an understanding of images and questions. Recent KVQA studies inject external knowledge in multi-modal form, but as more knowledge is used, irrelevant information may be added and can confuse question answering. To use the knowledge properly, this study proposes the following: 1) we introduce a novel semantic inconsistency measure computed from caption uncertainty and semantic similarity; 2) we suggest a new external knowledge assimilation method based on the semantic inconsistency measure and apply it to integrate explicit knowledge and implicit knowledge for KVQA; 3) the proposed method is evaluated on the OK-VQA dataset and achieves state-of-the-art performance.

Related papers

Knowledge Condensation and Reasoning for Knowledge-based VQA [20.808840633377343]
Recent studies retrieve the knowledge passages from external knowledge bases and then use them to answer questions. We propose two synergistic models: Knowledge Condensation model and Knowledge Reasoning model. Our method achieves state-of-the-art performance on knowledge-based VQA datasets.
arXiv Detail & Related papers (2024-03-15T06:06:06Z)
Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering [61.53454387743701]
We propose CPACE, a concept-centric Prompt-bAsed Contrastive Explanation Generation model. CPACE converts obtained symbolic knowledge into a contrastive explanation for better distinguishing the differences among given candidates. We conduct a series of experiments on three widely-used question-answering datasets: CSQA, QASC, and OBQA.
arXiv Detail & Related papers (2023-05-14T12:12:24Z)
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering [74.90418840431425]
We present Rainier, or Reinforced Knowledge Introspector, that learns to generate contextually relevant knowledge in response to given questions. Our approach starts by imitating knowledge generated by GPT-3, then learns to generate its own knowledge via reinforcement learning. Our work is the first to report that knowledge generated by models that are orders of magnitude smaller than GPT-3, even without direct supervision on the knowledge itself, can exceed the quality of knowledge elicited from GPT-3 for commonsense QA.
arXiv Detail & Related papers (2022-10-06T17:34:06Z)
A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA [67.75989848202343]
This paper presents a unified end-to-end retriever-reader framework for knowledge-based VQA. We shed light on the multi-modal implicit knowledge in vision-language pre-training models to mine its potential for knowledge reasoning. Our scheme is able not only to provide guidance for knowledge retrieval, but also to drop instances that are potentially error-prone for question answering.
arXiv Detail & Related papers (2022-06-30T02:35:04Z)
Coarse-to-Careful: Seeking Semantic-related Knowledge for Open-domain Commonsense Question Answering [12.406729445165857]
It is common to use external knowledge to help machines answer questions that need background commonsense. We propose a semantic-driven knowledge-aware QA framework, which controls the knowledge injection in a coarse-to-careful fashion.
arXiv Detail & Related papers (2021-07-04T10:56:36Z)
Contextualized Knowledge-aware Attentive Neural Network: Enhancing Answer Selection with Knowledge [77.77684299758494]
We extensively investigate approaches to enhancing the answer selection model with external knowledge from a knowledge graph (KG). First, we present a context-knowledge interaction learning framework, Knowledge-aware Neural Network (KNN), which learns the QA sentence representations by considering a tight interaction with the external knowledge from KG and the textual information. To handle the diversity and complexity of KG information, we propose a Contextualized Knowledge-aware Attentive Neural Network (CKANN), which improves the knowledge representation learning with structure information via a customized Graph Convolutional Network (GCN) and comprehensively learns context-based and knowledge-based sentence representation via
arXiv Detail & Related papers (2021-04-12T05:52:20Z)
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA [107.7091094498848]
One of the most challenging question types in VQA is when answering the question requires outside knowledge not present in the image. In this work we study open-domain knowledge, the setting in which the knowledge required to answer a question is not given or annotated at either training or test time. We tap into two types of knowledge representations and reasoning. First, implicit knowledge, which can be learned effectively from unsupervised language pre-training and supervised training data with transformer-based models.
arXiv Detail & Related papers (2020-12-20T20:13:02Z)
Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding [140.5911760063681]
We propose a novel dataset named Knowledge-Routed Visual Question Reasoning for VQA model evaluation. We generate the question-answer pairs based on both the Visual Genome scene graph and an external knowledge base with controlled programs.
arXiv Detail & Related papers (2020-12-14T00:33:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.