Continual VQA for Disaster Response Systems
- URL: http://arxiv.org/abs/2209.10320v1
- Date: Wed, 21 Sep 2022 12:45:51 GMT
- Title: Continual VQA for Disaster Response Systems
- Authors: Aditya Kane, V Manushree, Sahil Khose
- Abstract summary: Visual Question Answering (VQA) is a multi-modal task that involves answering questions about an input image.
The main challenge is the delay caused by generating labels when assessing the affected areas.
We deploy a pre-trained CLIP model, which is trained on image-text pairs.
We surpass previous state-of-the-art results on the FloodNet dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual Question Answering (VQA) is a multi-modal task that involves answering
questions about an input image, semantically understanding the contents of the
image and answering in natural language. Using VQA for disaster management
is an important line of research due to the scope of problems a VQA system
can answer. However, the main challenge is the delay caused by the
generation of labels in the assessment of the affected areas. To tackle this,
we deployed a pre-trained CLIP model, which is trained on image-text pairs.
However, we empirically observe that the model has poor zero-shot performance.
Thus, we instead use this model's pre-trained text and image embeddings
for our supervised training and surpass previous state-of-the-art results on
the FloodNet dataset. We extend this to a continual setting, which is a more
realistic scenario, and tackle the problem of catastrophic forgetting using
various experience replay methods. Our training runs are available at:
https://wandb.ai/compyle/continual_vqa_final
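A minimal sketch of the pipeline the abstract describes: frozen CLIP embeddings feeding a small supervised answer head, with an experience-replay buffer to mitigate catastrophic forgetting. It assumes the Hugging Face transformers CLIP interface; NUM_ANSWERS, the buffer size, the head architecture, and the replay policy are illustrative placeholders, not the paper's configuration.

```python
# Sketch: frozen CLIP image/text embeddings + supervised answer head,
# with a simple experience-replay buffer. Hyperparameters are placeholders.
import random
import torch
import torch.nn as nn
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device).eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

NUM_ANSWERS = 57  # placeholder answer-vocabulary size

@torch.no_grad()
def embed(images, questions):
    """Concatenate frozen CLIP image and text embeddings (512 + 512 dims)."""
    inputs = processor(text=questions, images=images,
                       return_tensors="pt", padding=True).to(device)
    img = clip_model.get_image_features(pixel_values=inputs["pixel_values"])
    txt = clip_model.get_text_features(input_ids=inputs["input_ids"],
                                       attention_mask=inputs["attention_mask"])
    return torch.cat([img, txt], dim=-1)

head = nn.Sequential(nn.Linear(1024, 512), nn.ReLU(),
                     nn.Linear(512, NUM_ANSWERS)).to(device)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
replay_buffer, BUFFER_SIZE = [], 500

def train_step(images, questions, answers):
    feats, labels = embed(images, questions), answers.to(device)
    if replay_buffer:  # experience replay: mix in one stored past batch
        old_feats, old_labels = random.choice(replay_buffer)
        feats = torch.cat([feats, old_feats])
        labels = torch.cat([labels, old_labels])
    loss = loss_fn(head(feats), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    if len(replay_buffer) < BUFFER_SIZE:  # store the current batch for later
        replay_buffer.append((feats[: len(answers)].detach(),
                              labels[: len(answers)]))
    return loss.item()
```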
Related papers
- VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models [58.21452697997078]
We propose a novel VQAttack model, which can generate both image and text perturbations with the designed modules.
Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQAttack.
arXiv Detail & Related papers (2024-02-16T21:17:42Z)
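The entry above does not spell out VQAttack's perturbation modules; as a generic stand-in, the sketch below applies a single FGSM step to the image input of a differentiable VQA model. The model signature is assumed, and this is the standard adversarial baseline, not VQAttack's actual method.

```python
# Generic one-step FGSM image perturbation against a differentiable VQA
# model. A standard baseline for illustration, not VQAttack's modules.
import torch

def fgsm_image_attack(model, image, question_ids, label, eps=8 / 255):
    """Perturb the image along the sign of the loss gradient."""
    image = image.clone().detach().requires_grad_(True)
    logits = model(image, question_ids)  # assumed model signature
    loss = torch.nn.functional.cross_entropy(logits, label)
    loss.backward()
    # step in the direction that increases the loss, then clip to valid range
    adv = image + eps * image.grad.sign()
    return adv.clamp(0, 1).detach()
```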
- Unleashing the Potential of Large Language Model: Zero-shot VQA for Flood Disaster Scenario [6.820160182829294]
We propose a zero-shot VQA model named Zero-shot VQA for Flood Disaster Damage Assessment (ZFDDA).
With flood disaster as the main research object, we build a Freestyle Flood Disaster Image Question Answering dataset (FFD-IQA).
This new dataset expands the question types to include free-form, multiple-choice, and yes-no questions.
Our model uses well-designed chain of thought (CoT) demonstrations to unlock the potential of the large language model.
arXiv Detail & Related papers (2023-12-04T13:25:16Z)
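As a hypothetical illustration of chain-of-thought prompting for zero-shot flood VQA, the sketch below assembles demonstrations into a prompt; the demonstration wording and the caption interface are invented, not taken from ZFDDA.

```python
# Illustrative chain-of-thought prompt assembly for zero-shot flood VQA.
# Demonstration text is invented; ZFDDA's actual prompts may differ.
COT_DEMOS = [
    ("Is the road passable?",
     "The road surface is fully covered by water, so vehicles cannot use it. "
     "Answer: no"),
    ("How many buildings are flooded?",
     "Water reaches the walls of three of the five visible buildings. "
     "Answer: 3"),
]

def build_prompt(caption: str, question: str) -> str:
    """Prefix worked reasoning examples, then pose the new question."""
    demos = "\n\n".join(f"Q: {q}\nReasoning: {r}" for q, r in COT_DEMOS)
    return (f"{demos}\n\nImage description: {caption}\n"
            f"Q: {question}\nReasoning:")

print(build_prompt("An aerial view of a flooded neighborhood.",
                   "Is the entire area flooded?"))
```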
- UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z)
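A common generic probe of abstention ability is max-softmax thresholding, sketched below; the threshold and function name are illustrative and not UNK-VQA's actual evaluation protocol.

```python
# Generic abstention probe: answer only when max softmax confidence clears
# a threshold. Illustrative only; UNK-VQA's protocol may differ.
import torch

def answer_or_abstain(logits: torch.Tensor, threshold: float = 0.5):
    probs = torch.softmax(logits, dim=-1)
    conf, idx = probs.max(dim=-1)
    # below-threshold confidence is treated as "I don't know"
    return [i.item() if c >= threshold else None
            for c, i in zip(conf, idx)]
```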
- NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario [77.14723238359318]
NuScenes-QA is the first benchmark for VQA in the autonomous driving scenario, encompassing 34K visual scenes and 460K question-answer pairs.
We leverage existing 3D detection annotations to generate scene graphs and design question templates manually.
We develop a series of baselines that employ advanced 3D detection and VQA techniques.
arXiv Detail & Related papers (2023-05-24T07:40:50Z)
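Template-based question generation from detection annotations can be illustrated as below; the annotation schema and templates are simplified stand-ins for NuScenes-QA's actual pipeline.

```python
# Simplified template-based QA generation from detection annotations.
# Schema and templates are stand-ins, not NuScenes-QA's own.
from collections import Counter

annotations = [  # hypothetical per-frame detections: (category, status)
    ("car", "moving"), ("car", "parked"), ("pedestrian", "moving"),
]

def generate_qa(dets):
    counts = Counter(cat for cat, _ in dets)
    qa = []
    for cat, n in counts.items():
        qa.append((f"How many {cat}s are there?", str(n)))
        qa.append((f"Is there any {cat} in the scene?", "yes"))
    return qa

for q, a in generate_qa(annotations):
    print(q, "->", a)
```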
- Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task [12.74065821307626]
VQA is an ambitious task aiming to answer any image-related question.
It is hard to build such a system once and for all, since the needs of users are continuously updated.
We propose a real-data-free replay-based method tailored for CL on VQA, named Scene Graph as Prompt for Replay.
arXiv Detail & Related papers (2022-08-24T12:00:02Z)
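A real-data-free replay buffer can store symbolic scene descriptions instead of images; the sketch below keeps scene-graph strings and replays them as text prompts. Class and method names are illustrative, not the paper's implementation.

```python
# Real-data-free replay: store compact scene-graph strings from past tasks
# and replay them as text prompts instead of raw images. Illustrative only.
import random

class SceneGraphReplay:
    def __init__(self, capacity=200):
        self.capacity, self.buffer = capacity, []

    def add(self, scene_graph: str, question: str, answer: str):
        if len(self.buffer) >= self.capacity:  # evict a random old entry
            self.buffer.pop(random.randrange(len(self.buffer)))
        self.buffer.append((scene_graph, question, answer))

    def sample_prompts(self, k=4):
        batch = random.sample(self.buffer, min(k, len(self.buffer)))
        return [f"Scene: {g}\nQ: {q}\nA: {a}" for g, q, a in batch]

replay = SceneGraphReplay()
replay.add("(house, flooded-by, water)", "Is the house flooded?", "yes")
print(replay.sample_prompts(1))
```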
- VQA-Aid: Visual Question Answering for Post-Disaster Damage Assessment and Analysis [0.7614628596146599]
A Visual Question Answering system integrated with an Unmanned Aerial Vehicle (UAV) has great potential to advance post-disaster damage assessment.
We present our recently developed VQA dataset, HurMic-VQA, collected during Hurricane Michael.
arXiv Detail & Related papers (2021-06-19T18:28:16Z)
- Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules [85.98177341704675]
The problem of grounding VQA tasks has recently seen increased attention in the research community.
We propose a visual capsule module with a query-based selection mechanism of capsule features.
We show that integrating the proposed capsule module in existing VQA systems significantly improves their performance on the weakly supervised grounding task.
arXiv Detail & Related papers (2021-05-11T07:45:32Z)
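Query-based selection of visual features can be approximated with plain dot-product attention; the sketch below scores feature groups ("capsules") against a question embedding and returns their soft-weighted sum, a generic stand-in rather than the paper's capsule module.

```python
# Generic query-based selection over visual feature groups ("capsules"):
# attention weights derived from a question embedding pick relevant groups.
# A stand-in for illustration, not the paper's capsule mechanism.
import torch
import torch.nn.functional as F

def select_capsules(capsules: torch.Tensor, query: torch.Tensor):
    """capsules: (batch, n_capsules, dim); query: (batch, dim)."""
    scores = torch.einsum("bnd,bd->bn", capsules, query)  # dot-product match
    weights = F.softmax(scores, dim=-1)
    selected = torch.einsum("bn,bnd->bd", weights, capsules)
    return selected, weights

caps = torch.randn(2, 8, 64)
q = torch.randn(2, 64)
feat, attn = select_capsules(caps, q)
print(feat.shape, attn.shape)  # torch.Size([2, 64]) torch.Size([2, 8])
```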
- Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View [129.392671317356]
We propose to interpret the language prior problem in VQA from a class-imbalance view.
It explicitly reveals why the VQA model tends to produce a frequent yet obviously wrong answer.
We also justify the validity of the class imbalance interpretation scheme on other computer vision tasks, such as face recognition and image classification.
arXiv Detail & Related papers (2020-10-30T00:57:17Z)
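The standard remedy suggested by the class-imbalance view is to re-scale the loss by inverse answer frequency; a minimal sketch follows, though the paper's exact re-scaling scheme may differ.

```python
# Inverse-frequency loss re-scaling for imbalanced answer distributions.
# Minimal illustration; the paper's exact re-scaling scheme may differ.
import torch
import torch.nn as nn

answer_counts = torch.tensor([9000., 500., 50.])  # hypothetical frequencies
weights = answer_counts.sum() / (len(answer_counts) * answer_counts)
loss_fn = nn.CrossEntropyLoss(weight=weights)  # rare answers weigh more

logits = torch.randn(4, 3)
labels = torch.tensor([0, 1, 2, 2])
print(loss_fn(logits, labels))
```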
- C3VQG: Category Consistent Cyclic Visual Question Generation [51.339348810676896]
Visual Question Generation (VQG) is the task of generating natural questions based on an image.
In this paper, we try to exploit the different visual cues and concepts in an image to generate questions using a variational autoencoder (VAE) without ground-truth answers.
Our approach addresses two major shortcomings of existing VQG systems: (i) it minimizes the level of supervision and (ii) it replaces generic questions with category-relevant generations.
arXiv Detail & Related papers (2020-05-15T20:25:03Z)
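The reparameterization step at the core of any VAE-based generator can be sketched as below; the encoder, the question decoder, and C3VQG's category-consistency objective are omitted, and all dimensions are illustrative.

```python
# Core VAE reparameterization step used by VAE-based question generators.
# Encoder/decoder and C3VQG's category-consistency losses are omitted.
import torch
import torch.nn as nn

class LatentHead(nn.Module):
    def __init__(self, in_dim=512, z_dim=64):
        super().__init__()
        self.mu = nn.Linear(in_dim, z_dim)
        self.logvar = nn.Linear(in_dim, z_dim)

    def forward(self, h):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl  # z feeds a question decoder; kl joins the training loss

z, kl = LatentHead()(torch.randn(2, 512))
print(z.shape, kl.item())
```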