AMUSED: An Annotation Framework of Multi-modal Social Media Data
- URL: http://arxiv.org/abs/2010.00502v2
- Date: Tue, 10 Aug 2021 11:41:16 GMT
- Title: AMUSED: An Annotation Framework of Multi-modal Social Media Data
- Authors: Gautam Kishore Shahi
- Abstract summary: The framework is designed to mitigate the issues of collecting and annotating social media data.
AMUSED can be applied in multiple application domains, as a use case, we have implemented the framework for collecting COVID-19 misinformation data.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a semi-automated framework called AMUSED for
gathering multi-modal annotated data from the multiple social media platforms.
The framework is designed to mitigate the issues of collecting and annotating
social media data by cohesively combining machine and human in the data
collection process. From a given list of the articles from professional news
media or blog, AMUSED detects links to the social media posts from news
articles and then downloads contents of the same post from the respective
social media platform to gather details about that specific post. The framework
is capable of fetching the annotated data from multiple platforms like Twitter,
YouTube, Reddit. The framework aims to reduce the workload and problems behind
the data annotation from the social media platforms. AMUSED can be applied in
multiple application domains, as a use case, we have implemented the framework
for collecting COVID-19 misinformation data from different social media
platforms.
Related papers
- SocialQuotes: Learning Contextual Roles of Social Media Quotes on the Web [9.130915550141337]
We liken social media embeddings to quotes, formalize the page context as structured natural language signals, and identify a taxonomy of roles for quotes within the page context.
We release SocialQuotes, a new data set built from the Common Crawl of over 32 million social quotes, 8.3k of them with crowdsourced quote annotations.
arXiv Detail & Related papers (2024-07-22T19:21:01Z) - Robust Domain Misinformation Detection via Multi-modal Feature Alignment [49.89164555394584]
We propose a robust domain and cross-modal approach for multi-modal misinformation detection.
It reduces the domain shift by aligning the joint distribution of textual and visual modalities.
We also propose a framework that simultaneously considers application scenarios of domain generalization.
arXiv Detail & Related papers (2023-11-24T07:06:16Z) - ManiTweet: A New Benchmark for Identifying Manipulation of News on Social Media [74.93847489218008]
We present a novel task, identifying manipulation of news on social media, which aims to detect manipulation in social media posts and identify manipulated or inserted information.
To study this task, we have proposed a data collection schema and curated a dataset called ManiTweet, consisting of 3.6K pairs of tweets and corresponding articles.
Our analysis demonstrates that this task is highly challenging, with large language models (LLMs) yielding unsatisfactory performance.
arXiv Detail & Related papers (2023-05-23T16:40:07Z) - Decay No More: A Persistent Twitter Dataset for Learning Social Meaning [10.227026799075215]
We propose a new persistent English Twitter dataset for social meaning (PTSM)
PTSM consists of $17$ social meaning datasets in $10$ categories of tasks.
We experiment with two SOTA pre-trained language models and show that our PTSM can substitute the actual tweets with paraphrases with marginal performance loss.
arXiv Detail & Related papers (2022-04-10T06:07:54Z) - Exploiting BERT For Multimodal Target SentimentClassification Through
Input Space Translation [75.82110684355979]
We introduce a two-stream model that translates images in input space using an object-aware transformer.
We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model.
We achieve state-of-the-art performance on two multimodal Twitter datasets.
arXiv Detail & Related papers (2021-08-03T18:02:38Z) - Content-based Analysis of the Cultural Differences between TikTok and
Douyin [95.32409577885645]
Short-form video social media shifts away from the traditional media paradigm by telling the audience a dynamic story to attract their attention.
In particular, different combinations of everyday objects can be employed to represent a unique scene that is both interesting and understandable.
Offered by the same company, TikTok and Douyin are popular examples of such new media that has become popular in recent years.
The hypothesis that they express cultural differences together with media fashion and social idiosyncrasy is the primary target of our research.
arXiv Detail & Related papers (2020-11-03T01:47:49Z) - I-AID: Identifying Actionable Information from Disaster-related Tweets [0.0]
Social media plays a significant role in disaster management by providing valuable data about affected people, donations and help requests.
We propose I-AID, a multimodel approach to automatically categorize tweets into multi-label information types.
Our results indicate that I-AID outperforms state-of-the-art approaches in terms of weighted average F1 score by +6% and +4% on the TREC-IS dataset and COVID-19 Tweets, respectively.
arXiv Detail & Related papers (2020-08-04T19:07:50Z) - Sentiment Analysis on Social Media Content [0.0]
The aim of this paper is to present a model that can perform sentiment analysis of real data collected from Twitter.
Data in Twitter is highly unstructured which makes it difficult to analyze.
Our proposed model is different from prior work in this field because it combined the use of supervised and unsupervised machine learning algorithms.
arXiv Detail & Related papers (2020-07-04T17:03:30Z) - Reliable and Efficient Long-Term Social Media Monitoring [4.389610557232119]
This technical report presents a cloud-based data collection, pre-processing, and archiving infrastructure.
We show how this approach works in different cloud computing architectures, and how to adapt the method to collect streaming data from other social media platforms.
arXiv Detail & Related papers (2020-05-05T19:04:56Z) - Echo Chambers on Social Media: A comparative analysis [64.2256216637683]
We introduce an operational definition of echo chambers and perform a massive comparative analysis on 1B pieces of contents produced by 1M users on four social media platforms.
We infer the leaning of users about controversial topics and reconstruct their interaction networks by analyzing different features.
We find support for the hypothesis that platforms implementing news feed algorithms like Facebook may elicit the emergence of echo-chambers.
arXiv Detail & Related papers (2020-04-20T20:00:27Z) - I Know Where You Are Coming From: On the Impact of Social Media Sources
on AI Model Performance [79.05613148641018]
We will study the performance of different machine learning models when being learned on multi-modal data from different social networks.
Our initial experimental results reveal that social network choice impacts the performance.
arXiv Detail & Related papers (2020-02-05T11:10:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.