TWIGMA: A dataset of AI-Generated Images with Metadata From Twitter
- URL: http://arxiv.org/abs/2306.08310v2
- Date: Tue, 5 Dec 2023 00:17:22 GMT
- Title: TWIGMA: A dataset of AI-Generated Images with Metadata From Twitter
- Authors: Yiqun Chen, James Zou
- Abstract summary: We introduce TWIGMA, a dataset encompassing over 800,000 gen-AI images collected from Jan 2021 to March 2023 on Twitter.
We find that gen-AI images possess distinctive characteristics and exhibit, on average, lower variability when compared to their non-gen-AI counterparts.
We observe a longitudinal shift in the themes of AI-generated images on Twitter, with users increasingly sharing artistically sophisticated content.
- Score: 29.77283532841167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent progress in generative artificial intelligence (gen-AI) has enabled
the generation of photo-realistic and artistically-inspiring photos at a single
click, catering to millions of users online. To explore how people use gen-AI
models such as DALLE and StableDiffusion, it is critical to understand the
themes, contents, and variations present in the AI-generated photos. In this
work, we introduce TWIGMA (TWItter Generative-ai images with MetadatA), a
comprehensive dataset encompassing over 800,000 gen-AI images collected from
Jan 2021 to March 2023 on Twitter, with associated metadata (e.g., tweet text,
creation date, number of likes), available at
https://zenodo.org/records/8031785. Through a comparative analysis of TWIGMA
with natural images and human artwork, we find that gen-AI images possess
distinctive characteristics and exhibit, on average, lower variability when
compared to their non-gen-AI counterparts. Additionally, we find that the
similarity between a gen-AI image and natural images is inversely correlated
with the number of likes. Finally, we observe a longitudinal shift in the
themes of AI-generated images on Twitter, with users increasingly sharing
artistically sophisticated content such as intricate human portraits, whereas
their interest in simple subjects such as natural scenes and animals has
decreased. Our findings underscore the significance of TWIGMA as a unique data
resource for studying AI-generated images.
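As a minimal sketch of how the reported inverse relationship between similarity to natural images and number of likes could be checked, the snippet below computes a Spearman rank correlation in plain Python. The field names `similarity_to_natural` and `like_count` are hypothetical, the toy records are invented for illustration and are not taken from TWIGMA, and the choice of Spearman correlation is an assumption rather than the authors' confirmed method.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy records (illustrative values only, not real TWIGMA data).
records = [
    {"similarity_to_natural": 0.91, "like_count": 3},
    {"similarity_to_natural": 0.75, "like_count": 10},
    {"similarity_to_natural": 0.60, "like_count": 8},
    {"similarity_to_natural": 0.42, "like_count": 40},
    {"similarity_to_natural": 0.30, "like_count": 55},
]

similarity = [r["similarity_to_natural"] for r in records]
likes = [r["like_count"] for r in records]
rho = spearman(similarity, likes)
print(f"Spearman rho = {rho:.3f}")  # a negative value indicates an inverse relationship
```

A rank-based measure is used here because like counts on social media tend to be heavily skewed, so a few viral posts would otherwise dominate a linear correlation.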
Related papers
- Could AI Trace and Explain the Origins of AI-Generated Images and Text? [53.11173194293537]
AI-generated content is increasingly prevalent in the real world.
Adversaries might exploit large multimodal models to create images that violate ethical or legal standards.
Paper reviewers may misuse large language models to generate reviews without genuine intellectual effort.
arXiv Detail & Related papers (2025-04-05T20:51:54Z) - D-Judge: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance [19.760989919485894]
We introduce an AI-Natural Image Discrepancy accessing benchmark (D-Judge).
We construct D-ANI, a dataset with 5,000 natural images and over 440,000 AIGIs generated by nine models using Text-to-Image (T2I), Image-to-Image (I2I), and Text and Image-to-Image (TI2I) prompts.
Our framework evaluates the discrepancy across five dimensions: naive image quality, semantic alignment, aesthetic appeal, downstream applicability, and human validation.
arXiv Detail & Related papers (2024-12-23T15:08:08Z) - MiRAGeNews: Multimodal Realistic AI-Generated News Detection [45.067211436589126]
We propose the MiRAGeNews dataset to combat the spread of AI-generated fake news.
Our dataset poses a significant challenge to humans.
We train a multi-modal detector that improves by +5.1% F-1 over state-of-the-art baselines.
arXiv Detail & Related papers (2024-10-11T17:58:02Z) - A Sanity Check for AI-generated Image Detection [49.08585395873425]
We present a sanity check on whether the task of AI-generated image detection has been solved.
To quantify the generalization of existing methods, we evaluate 9 off-the-shelf AI-generated image detectors on the Chameleon dataset.
We propose AIDE (AI-generated Image DEtector with Hybrid Features), which leverages multiple experts to simultaneously extract visual artifacts and noise patterns.
arXiv Detail & Related papers (2024-06-27T17:59:49Z) - Development of a Dual-Input Neural Model for Detecting AI-Generated Imagery [0.0]
It is important to develop tools that are able to detect AI-generated images.
This paper proposes a dual-branch neural network architecture that takes both images and their Fourier frequency decomposition as inputs.
Our proposed model achieves an accuracy of 94% on the CIFAKE dataset, which significantly outperforms classic ML methods and CNNs.
arXiv Detail & Related papers (2024-06-19T16:42:04Z) - AI-Generated Faces in the Real World: A Large-Scale Case Study of Twitter Profile Images [26.891299948581782]
We conduct the first large-scale investigation of the prevalence of AI-generated profile pictures on Twitter.
Our analysis of nearly 15 million Twitter profile pictures shows that 0.052% were artificially generated, confirming their notable presence on the platform.
The results also reveal several motives, including spamming and political amplification campaigns.
arXiv Detail & Related papers (2024-04-22T14:57:17Z) - AIGCOIQA2024: Perceptual Quality Assessment of AI Generated Omnidirectional Images [70.42666704072964]
We establish a large-scale AI generated omnidirectional image IQA database named AIGCOIQA2024.
A subjective IQA experiment is conducted to assess human visual preferences from three perspectives.
We conduct a benchmark experiment to evaluate the performance of state-of-the-art IQA models on our database.
arXiv Detail & Related papers (2024-04-01T10:08:23Z) - AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z) - Creating Image Datasets in Agricultural Environments using DALL.E: Generative AI-Powered Large Language Model [0.4143603294943439]
The study used both approaches to image generation: text-to-image and image-to-image (variation).
Images generated using the image-to-image method were more realistic than those generated with the text-to-image approach.
arXiv Detail & Related papers (2023-07-17T19:17:10Z) - ArtWhisperer: A Dataset for Characterizing Human-AI Interactions in Artistic Creations [26.4215586218117]
This work investigates how people use text-to-image models to generate desired target images.
We created ArtWhisperer, an online game where users are given a target image and are tasked with iteratively finding a prompt that creates an image similar to the target.
We recorded over 50,000 human-AI interactions; each interaction corresponds to one text prompt created by a user and the corresponding generated image.
arXiv Detail & Related papers (2023-06-13T21:10:45Z) - DeepfakeArt Challenge: A Benchmark Dataset for Generative AI Art Forgery and Data Poisoning Detection [57.51313366337142]
There has been growing concern over the use of generative AI for malicious purposes.
In the realm of visual content synthesis using generative AI, key areas of significant concern have been image forgery and data poisoning.
We introduce the DeepfakeArt Challenge, a large-scale challenge benchmark dataset designed specifically to aid in the building of machine learning algorithms for generative AI art forgery and data poisoning detection.
arXiv Detail & Related papers (2023-06-02T05:11:27Z) - Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images [66.20578637253831]
There is a growing concern that the advancement of artificial intelligence (AI) technology may produce fake photos.
This study aims to comprehensively evaluate agents for distinguishing state-of-the-art AI-generated visual content.
arXiv Detail & Related papers (2023-04-25T17:51:59Z)