Dynamic Task and Weight Prioritization Curriculum Learning for
Multimodal Imagery
- URL: http://arxiv.org/abs/2310.19109v2
- Date: Tue, 7 Nov 2023 14:59:17 GMT
- Title: Dynamic Task and Weight Prioritization Curriculum Learning for
Multimodal Imagery
- Authors: Huseyin Fuat Alsan, Taner Arsan
- Abstract summary: This paper explores post-disaster analytics using multimodal deep learning models trained with a curriculum learning method.
Curriculum learning emulates the progressive learning sequence in human education by training deep learning models on increasingly complex data.
- Score: 0.5439020425819
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores post-disaster analytics using multimodal deep learning
models trained with a curriculum learning method. Studying post-disaster
analytics is important as it plays a crucial role in mitigating the impact of
disasters by providing timely and accurate insights into the extent of damage
and the allocation of resources. We propose a curriculum learning strategy to
enhance the performance of multimodal deep learning models. Curriculum learning
emulates the progressive learning sequence in human education by training deep
learning models on increasingly complex data. Our primary objective is to
develop a curriculum-trained multimodal deep learning model, with a particular
focus on visual question answering (VQA) capable of jointly processing image
and text data, in conjunction with semantic segmentation for disaster analytics
using the FloodNet dataset
(https://github.com/BinaLab/FloodNet-Challenge-EARTHVISION2021). To achieve
this, a U-Net model is used for semantic segmentation and image encoding. A
custom-built text classifier is used for visual question
answering. Existing curriculum learning methods rely on manually defined
difficulty functions. We introduce a novel curriculum learning approach termed
Dynamic Task and Weight Prioritization (DATWEP), which leverages a
gradient-based method to automatically determine task difficulty during
curriculum learning, thereby eliminating the need for explicit difficulty
computation. Integrating DATWEP into our multimodal model improves VQA
performance. Source code is available at
https://github.com/fualsan/DATWEP.
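The abstract does not spell out how the gradient-based prioritization works, so the following is only a minimal sketch of one generic way to let gradients set task weights. It is not the DATWEP implementation. It assumes two tasks (segmentation and VQA), softmax-normalized task weights, fixed hypothetical loss values, and an easy-first rule in which gradient descent on the weighted loss moves weight toward the task with the lower loss. All function names, the learning rate, and the loss values are illustrative assumptions.

```python
import math


def softmax(logits):
    """Turn unconstrained logits into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_loss(logits, task_losses):
    """Total loss as a softmax-weighted sum of per-task losses."""
    weights = softmax(logits)
    return sum(w * l for w, l in zip(weights, task_losses))


def weight_logit_grads(logits, task_losses):
    """Analytic gradient of weighted_loss with respect to each logit.

    For w = softmax(logits): d/d logit_j sum_i w_i * L_i = w_j * (L_j - sum_i w_i * L_i).
    """
    weights = softmax(logits)
    mean_loss = sum(w * l for w, l in zip(weights, task_losses))
    return [w * (l - mean_loss) for w, l in zip(weights, task_losses)]


def update_logits(logits, task_losses, lr=0.5):
    """One gradient-descent step on the logits (assumed easy-first rule)."""
    grads = weight_logit_grads(logits, task_losses)
    return [x - lr * g for x, g in zip(logits, grads)]


# Hypothetical setup: index 0 = segmentation, index 1 = VQA.
logits = [0.0, 0.0]   # start with equal task weights
losses = [0.4, 1.2]   # made-up loss values, held fixed for illustration

for _ in range(20):
    logits = update_logits(logits, losses)

weights = softmax(logits)
print("task weights (segmentation, VQA):", weights)
```

In real training the per-task losses would change at every step and the weights would be updated alongside the model parameters. Whether a curriculum should favor the easier or the harder task first is a design choice that this sketch fixes arbitrarily.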
Related papers
- Unlearnable Algorithms for In-context Learning [36.895152458323764]
In this paper, we focus on efficient unlearning methods for the task adaptation phase of a pretrained large language model.
We observe that an LLM's ability to do in-context learning for task adaptation allows for efficient exact unlearning of task adaptation training data.
We propose a new holistic measure of unlearning cost which accounts for varying inference costs.
arXiv Detail & Related papers (2024-02-01T16:43:04Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Self-Supervised Learning of Multi-Object Keypoints for Robotic
Manipulation [8.939008609565368]
In this paper, we demonstrate the efficacy of learning image keypoints via the Dense Correspondence pretext task for downstream policy learning.
We evaluate our approach on diverse robot manipulation tasks, compare it to other visual representation learning approaches, and demonstrate its flexibility and effectiveness for sample-efficient policy learning.
arXiv Detail & Related papers (2022-05-17T13:15:07Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying
Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn an effective salient object detection model from manual annotations on only a few training images.
We name this task few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
- Statistical Measures For Defining Curriculum Scoring Function [5.328970912536596]
We show improvements in performance with convolutional and fully-connected neural networks on real image datasets.
Motivated by our insights from implicit curriculum ordering, we introduce a simple curriculum learning strategy.
We also propose and study the performance of a dynamic curriculum learning algorithm.
arXiv Detail & Related papers (2021-02-27T07:25:49Z)
- Curriculum Learning: A Survey [65.31516318260759]
Curriculum learning strategies have been successfully employed in all areas of machine learning.
We construct a taxonomy of curriculum learning approaches by hand, considering various classification criteria.
We build a hierarchical tree of curriculum learning methods using an agglomerative clustering algorithm.
arXiv Detail & Related papers (2021-01-25T20:08:32Z)
- Video Understanding as Machine Translation [53.59298393079866]
We tackle a wide variety of downstream video understanding tasks by means of a single unified framework.
We report performance gains over the state-of-the-art on several downstream tasks, including video classification (EPIC-Kitchens), question answering (TVQA), and captioning (TVC, YouCook2, and MSR-VTT).
arXiv Detail & Related papers (2020-06-12T14:07:04Z)
- Reducing Overlearning through Disentangled Representations by
Suppressing Unknown Tasks [8.517620051440005]
Existing deep learning approaches for learning visual features tend to overlearn and extract more information than what is required for the task at hand.
From a privacy preservation perspective, the input visual information is not protected from the model.
We propose a model-agnostic solution for reducing model overlearning by suppressing all the unknown tasks.
arXiv Detail & Related papers (2020-05-20T17:31:44Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm which directly optimizes a model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)