Brazilian Social Media Anti-vaccine Information Disorder Dataset -- Telegram (2020-2025)
- URL: http://arxiv.org/abs/2601.18622v1
- Date: Mon, 26 Jan 2026 15:59:28 GMT
- Title: Brazilian Social Media Anti-vaccine Information Disorder Dataset -- Telegram (2020-2025)
- Authors: João Phillipe Cardenuto, Ana Carolina Monari, Michelle Diniz Lopes, Leopoldo Lusquino Filho, Anderson Rocha
- Abstract summary: Brazil has experienced a decline in vaccination coverage, reversing decades of public health progress achieved through the National Immunization Program (PNI). Growing evidence points to the widespread circulation of vaccine-related misinformation on social media platforms. This data paper introduces a curated dataset of about four million Telegram posts collected from 119 prominent Brazilian anti-vaccine channels between 2020 and 2025.
- Score: 3.8775488670058222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past decade, Brazil has experienced a decline in vaccination coverage, reversing decades of public health progress achieved through the National Immunization Program (PNI). Growing evidence points to the widespread circulation of vaccine-related misinformation -- particularly on social media platforms -- as a key factor driving this decline. Among these platforms, Telegram remains the only major platform permitting accessible and ethical data collection, offering insight into public channels where vaccine misinformation circulates extensively. This data paper introduces a curated dataset of about four million Telegram posts collected from 119 prominent Brazilian anti-vaccine channels between 2020 and 2025. The dataset includes message content, metadata, associated media, and classification related to vaccine posts, enabling researchers to examine how false or misleading information spreads, evolves, and influences public sentiment. By providing this resource, our aim is to support the scientific and public health community in developing evidence-based strategies to counter misinformation, promote trust in vaccination, and engage compassionately with individuals and communities affected by false narratives. The dataset and documentation are openly available for non-commercial research, under strict ethical and privacy guidelines at https://doi.org/10.25824/redu/5JIVDT
Related papers
- The Shifting Landscape of Vaccine Discourse: Insights From a Decade of Pre- to Post-COVID-19 Vaccine Posts on Social Media [61.575555311964706]
We analyze how English speakers talk about vaccines on social media to understand the evolving narrative around vaccines in social media posts. We first introduce a novel dataset comprising 18.7 million curated posts on vaccine discourse from 2013 to 2022. Our analysis shows that the COVID-19 pandemic led to complex shifts in X users' sentiment and discourse around vaccines.
arXiv Detail & Related papers (2025-11-20T22:28:59Z)
- Post-Vaccination COVID-19 Data Analysis: Privacy and Ethics [1.7975345722349878]
The COVID-19 pandemic has severely affected the world in terms of health, economy and peace. The vaccination process raises several questions about citizen privacy and misuse of personal data. This paper introduces a blockchain-based application for verification and analysis of vaccination data.
arXiv Detail & Related papers (2024-12-01T11:41:32Z)
- Vax-Culture: A Dataset for Studying Vaccine Discourse on Twitter [3.768191396638854]
Vaccine hesitancy continues to be a main challenge for public health officials during the COVID-19 pandemic.
We present Vax-Culture, a novel Twitter COVID-19 dataset consisting of 6373 vaccine-related tweets.
We hope this can lead to effective and targeted public health communication strategies for reaching individuals with anti-vaccine beliefs.
arXiv Detail & Related papers (2023-04-13T23:04:30Z)
- Global misinformation spillovers in the online vaccination debate before and during COVID-19 [5.1598868036106085]
Anti-vaccination views pervade online social media, fueling distrust in scientific expertise and increasing vaccine-hesitant individuals.
Here, we leverage 316 million vaccine-related Twitter messages in 18 languages to quantify misinformation flows between users exposed to anti-vaccination (no-vax) content.
We find that, during the pandemic, no-vax communities became more central in the country-specific debates and their cross-border connections strengthened, revealing a global Twitter anti-vaccination network.
arXiv Detail & Related papers (2022-11-21T14:32:37Z)
- Dynamics and triggers of misinformation on vaccines [0.552480439325792]
We analyze 6 years of Italian vaccine debate across diverse social media platforms (Facebook, Instagram, Twitter, YouTube).
We first use the symbolic transfer entropy analysis of news production time-series to determine which category of sources, questionable or reliable, causally drives the agenda on vaccines.
We then leverage deep learning models capable of accurately classifying vaccine-related content based on the conveyed stance and discussed topic.
arXiv Detail & Related papers (2022-07-25T15:35:48Z)
- A Multilingual Dataset of COVID-19 Vaccination Attitudes on Twitter [4.696697601424039]
We describe the collection and release of a dataset of tweets related to COVID-19 vaccines.
This dataset consists of the IDs of 2,198,090 tweets collected from Western Europe, 17,934 of which are annotated with the originators' vaccination stances.
arXiv Detail & Related papers (2022-06-27T13:44:48Z)
- "COVID-19 was a FIFA conspiracy #curropt": An Investigation into the Viral Spread of COVID-19 Misinformation [60.268682953952506]
We estimate the extent to which misinformation has influenced the course of the COVID-19 pandemic using natural language processing models.
We provide a strategy to combat social media posts that are likely to cause widespread harm.
arXiv Detail & Related papers (2022-06-12T19:41:01Z)
- Cross-lingual COVID-19 Fake News Detection [54.125563009333995]
We make the first attempt to detect COVID-19 misinformation in a low-resource language (Chinese) using only the fact-checked news in a high-resource language (English).
We propose a deep learning framework named CrossFake to jointly encode the cross-lingual news body texts and capture the news content.
Empirical results on our dataset demonstrate the effectiveness of CrossFake under the cross-lingual setting.
arXiv Detail & Related papers (2021-10-13T04:44:02Z)
- Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment [90.12602012910465]
We train on Italy's early COVID-19 outbreak through Twitter and transfer to several other countries.
Our experiments show strong results with up to 0.85 Spearman correlation in cross-country predictions.
arXiv Detail & Related papers (2020-06-05T02:04:25Z)
- Digital Ariadne: Citizen Empowerment for Epidemic Control [55.41644538483948]
The COVID-19 crisis represents the most dangerous threat to public health since the H1N1 pandemic of 1918.
Technology-assisted location and contact tracing, if broadly adopted, may help limit the spread of infectious diseases.
We present a tool, called 'diAry' or 'digital Ariadne', based on voluntary location and Bluetooth tracking on personal devices.
arXiv Detail & Related papers (2020-04-16T15:53:42Z)
- Falling into the Echo Chamber: the Italian Vaccination Debate on Twitter [65.7192861893042]
We examine the extent to which the vaccination debate on Twitter is conducive to potential outreach to the vaccination-hesitant.
We discover that the vaccination skeptics, as well as the advocates, reside in their own distinct "echo chambers".
At the center of these echo chambers we find the ardent supporters, for whom we build highly accurate network- and content-based classifiers.
arXiv Detail & Related papers (2020-03-26T13:55:50Z)