Analysis of Social Media Data using Multimodal Deep Learning for
Disaster Response
- URL: http://arxiv.org/abs/2004.11838v1
- Date: Tue, 14 Apr 2020 19:36:11 GMT
- Title: Analysis of Social Media Data using Multimodal Deep Learning for
Disaster Response
- Authors: Ferda Ofli, Firoj Alam and Muhammad Imran
- Abstract summary: We propose to use both text and image modalities of social media data to learn a joint representation using state-of-the-art deep learning techniques.
Experiments on real-world disaster datasets show that the proposed multimodal architecture yields better performance than models trained using a single modality.
- Score: 6.8889797054846795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimedia content in social media platforms provides significant information
during disaster events. The types of information shared include reports of
injured or deceased people, infrastructure damage, and missing or found people,
among others. Although many studies have shown the usefulness of both text and
image content for disaster response purposes, past research has mostly focused
on analyzing the text modality alone. In this paper, we
propose to use both text and image modalities of social media data to learn a
joint representation using state-of-the-art deep learning techniques.
Specifically, we utilize convolutional neural networks to define a multimodal
deep learning architecture with a modality-agnostic shared representation.
Extensive experiments on real-world disaster datasets show that the proposed
multimodal architecture yields better performance than models trained using a
single modality (e.g., either text or image).
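As a concrete illustration of this kind of two-branch design, the sketch below builds a small PyTorch classifier with a CNN image branch, a convolutional text branch over word embeddings, and a fused shared layer feeding a single classifier. Layer sizes, class count, and the backbones are illustrative placeholders, not the paper's exact architecture.

```python
# Minimal sketch of a two-branch multimodal classifier in PyTorch.
# All layer sizes and the number of classes are hypothetical; the paper's
# specific CNN backbones are not reproduced here.
import torch
import torch.nn as nn

class MultimodalClassifier(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=100, num_classes=4):
        super().__init__()
        # Image branch: a small CNN standing in for a deeper backbone.
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 256), nn.ReLU(),
        )
        # Text branch: 1D convolutions over word embeddings (text-CNN style).
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.text_conv = nn.Conv1d(embed_dim, 128, kernel_size=3, padding=1)
        self.text_fc = nn.Linear(128, 256)
        # Modality-agnostic shared representation: both 256-d branch outputs
        # are fused and projected into a joint space before classification.
        self.shared = nn.Sequential(nn.Linear(512, 256), nn.ReLU())
        self.classifier = nn.Linear(256, num_classes)

    def forward(self, image, token_ids):
        img_feat = self.image_branch(image)                      # (B, 256)
        emb = self.embedding(token_ids).transpose(1, 2)          # (B, E, T)
        txt_feat = torch.relu(self.text_conv(emb)).max(dim=2).values
        txt_feat = torch.relu(self.text_fc(txt_feat))            # (B, 256)
        joint = self.shared(torch.cat([img_feat, txt_feat], 1))  # (B, 256)
        return self.classifier(joint)

model = MultimodalClassifier()
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 20000, (2, 30)))
print(logits.shape)  # torch.Size([2, 4])
```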
Related papers
- Advanced Multimodal Deep Learning Architecture for Image-Text Matching [33.8315200009152]
Image-text matching is a key multimodal task that aims to model the semantic association between images and text as a matching relationship.
We introduce an advanced multimodal deep learning architecture, which combines the high-level abstract representation ability of deep neural networks for visual information with the advantages of natural language processing models for text semantic understanding.
Experiments show that, compared with existing image-text matching models, the optimized new model significantly improves performance on a series of benchmark datasets.
arXiv Detail & Related papers (2024-06-13T08:32:24Z)
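A common way to realize image-text matching is to project both modalities into a joint embedding space and score pairs by cosine similarity. The sketch below shows that general pattern under assumed feature dimensions (2048-d image features, 768-d text features); it is not the cited paper's specific architecture.

```python
# Hedged sketch of image-text matching via a shared embedding space and
# cosine similarity; encoder dimensions are placeholder assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatchingModel(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, joint_dim=256):
        super().__init__()
        # Project precomputed image and text features into one joint space.
        self.img_proj = nn.Linear(img_dim, joint_dim)
        self.txt_proj = nn.Linear(txt_dim, joint_dim)

    def forward(self, img_feats, txt_feats):
        img = F.normalize(self.img_proj(img_feats), dim=-1)
        txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return img @ txt.t()  # (num_images, num_texts) similarity matrix

model = MatchingModel()
sims = model(torch.randn(4, 2048), torch.randn(4, 768))
print(sims.argmax(dim=1))  # best-matching text index for each image
```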
- Iterative Adversarial Attack on Image-guided Story Ending Generation [37.42908817585858]
Multimodal learning involves developing models that can integrate information from various sources like images and texts.
Deep neural networks, which are the backbone of recent IgSEG models, are vulnerable to adversarial samples.
We propose an iterative adversarial attack method (Iterative-attack) that fuses image and text modality attacks.
arXiv Detail & Related papers (2023-05-16T06:19:03Z)
- Harnessing the Power of Text-image Contrastive Models for Automatic Detection of Online Misinformation [50.46219766161111]
We develop a self-learning model to explore contrastive learning for misinformation identification.
Our model shows superior performance in detecting non-matched image-text pairs when the training data is insufficient.
arXiv Detail & Related papers (2023-04-19T02:53:59Z)
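Contrastive training of this kind typically pulls matched image-text pairs together and pushes mismatched ones apart; at inference, a low similarity score can flag a suspect pair. The sketch below uses a symmetric InfoNCE loss as a stand-in; the temperature and embedding sizes are assumptions, not values from the paper.

```python
# Sketch of contrastive training over matched (image, text) pairs, in the
# spirit of (but not identical to) the cited self-learning model.
import torch
import torch.nn.functional as F

def info_nce(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE: each image's positive is its own paired text."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(len(logits))           # diagonal = matched pairs
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# At inference, low cosine similarity flags a suspect (non-matched) pair.
img, txt = torch.randn(8, 256), torch.randn(8, 256)
print(info_nce(img, txt))
```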
- Learning to Model Multimodal Semantic Alignment for Story Visualization [58.16484259508973]
Story visualization aims to generate a sequence of images to narrate each sentence in a multi-sentence story.
Current works suffer from semantic misalignment because of their fixed architectures and the diversity of input modalities.
We explore the semantic alignment between text and image representations by learning to match their semantic levels in the GAN-based generative model.
arXiv Detail & Related papers (2022-11-14T11:41:44Z)
- MEDIC: A Multi-Task Learning Dataset for Disaster Image Classification [6.167082944123002]
We propose MEDIC, the largest social media image classification dataset for humanitarian response.
MEDIC consists of 71,198 images to address four different tasks in a multi-task learning setup.
It is the first dataset of its kind, bringing together social media imagery, disaster response, and multi-task learning research.
arXiv Detail & Related papers (2021-08-29T11:55:50Z)
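A typical multi-task setup for a dataset like this shares one image backbone across several classification heads and sums the per-task losses. In the sketch below, the four task names follow MEDIC, but the class counts, backbone, and loss weighting are illustrative assumptions rather than values from the dataset paper.

```python
# Minimal multi-task sketch: one shared image backbone, four task heads,
# summed per-task losses. Task names follow MEDIC; class counts and the
# backbone are illustrative placeholders.
import torch
import torch.nn as nn

TASKS = {"disaster_types": 7, "informativeness": 2,
         "humanitarian": 4, "damage_severity": 3}  # counts are assumptions

class MultiTaskNet(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for a deeper CNN backbone
            nn.Conv2d(3, 32, 3, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        self.heads = nn.ModuleDict(
            {task: nn.Linear(feat_dim, n) for task, n in TASKS.items()})

    def forward(self, x):
        feats = self.backbone(x)  # shared features feed every task head
        return {task: head(feats) for task, head in self.heads.items()}

model = MultiTaskNet()
outputs = model(torch.randn(2, 3, 224, 224))
labels = {t: torch.randint(0, n, (2,)) for t, n in TASKS.items()}
loss = sum(nn.functional.cross_entropy(outputs[t], labels[t]) for t in TASKS)
print(loss)
```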
- Enhancing Social Relation Inference with Concise Interaction Graph and Discriminative Scene Representation [56.25878966006678]
We propose an approach of PRactical Inference in Social rElation (PRISE).
It concisely learns interactive features of persons and discriminative features of holistic scenes.
PRISE achieves a 6.8% improvement for domain classification on the PIPA dataset.
arXiv Detail & Related papers (2021-07-30T04:20:13Z)
- Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
- Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings [63.79979145520512]
We explore the joint effects of texts and images in predicting the keyphrases for a multimedia post.
We propose a novel Multi-Modality Multi-Head Attention (M3H-Att) to capture the intricate cross-media interactions.
Our model significantly outperforms the previous state of the art based on traditional attention networks.
arXiv Detail & Related papers (2020-11-03T08:44:18Z)
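Cross-media attention of this general flavor can be expressed with standard multi-head attention, letting text tokens attend over image-region features. The sketch below uses PyTorch's built-in nn.MultiheadAttention with assumed dimensions; it illustrates the mechanism, not the exact M3H-Att design.

```python
# Sketch of cross-media attention: text tokens attend over image-region
# features via standard multi-head attention. Dimensions are assumptions.
import torch
import torch.nn as nn

d_model, n_heads = 256, 4
cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

text = torch.randn(2, 20, d_model)     # 20 text tokens per post
regions = torch.randn(2, 49, d_model)  # 7x7 grid of image-region features
fused, weights = cross_attn(query=text, key=regions, value=regions)
print(fused.shape, weights.shape)  # (2, 20, 256), (2, 20, 49)
```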
- CommuNety: A Deep Learning System for the Prediction of Cohesive Social Communities [14.839117147209603]
We propose CommuNety, a deep learning system for the prediction of cohesive social networks using images.
The proposed model consists of a hierarchical CNN architecture that learns descriptive features for each cohesive network.
The paper also proposes a novel Face Co-occurrence Frequency algorithm to quantify the presence of people in images, and a novel photo ranking method to analyze the strength of relationships between individuals in a predicted social network.
arXiv Detail & Related papers (2020-07-29T11:03:22Z)
- Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [41.505920288928365]
The abundance of multimodal data has inspired interest in cross-modal retrieval methods.
We propose novel within-modality losses which encourage semantic coherency in both the text and image subspaces.
Our method ensures that not only are paired images and texts close, but the expected image-image and text-text relationships are also observed.
arXiv Detail & Related papers (2020-07-16T20:32:54Z)
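One way to encourage within-modality coherence alongside cross-modal alignment is to add image-image and text-text triplet terms to the usual cross-modal triplet loss. The sketch below does exactly that; the margin and the 0.5 weighting are arbitrary illustrative choices, not the paper's values.

```python
# Sketch of adding within-modality terms to a cross-modal triplet loss, so
# that image-image and text-text neighbors also stay close in the embedding.
import torch
import torch.nn.functional as F

def retrieval_loss(img, txt, img_pos, txt_pos, img_neg, txt_neg, margin=0.2):
    def triplet(a, p, n):
        return F.triplet_margin_loss(a, p, n, margin=margin)
    # Cross-modal: each image should sit near its paired text and vice versa.
    cross = triplet(img, txt, txt_neg) + triplet(txt, img, img_neg)
    # Within-modality: semantic neighbors inside each modality stay close.
    within = triplet(img, img_pos, img_neg) + triplet(txt, txt_pos, txt_neg)
    return cross + 0.5 * within  # 0.5 is an illustrative weight

emb = lambda: torch.randn(8, 256)
print(retrieval_loss(emb(), emb(), emb(), emb(), emb(), emb()))
```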
- Image Segmentation Using Deep Learning: A Survey [58.37211170954998]
Image segmentation is a key topic in image processing and computer vision.
A substantial body of work has aimed at developing image segmentation approaches using deep learning models.
arXiv Detail & Related papers (2020-01-15T21:37:47Z)