Combining Generative Artificial Intelligence (AI) and the Internet:
Heading towards Evolution or Degradation?
- URL: http://arxiv.org/abs/2303.01255v1
- Date: Fri, 17 Feb 2023 17:39:41 GMT
- Title: Combining Generative Artificial Intelligence (AI) and the Internet:
Heading towards Evolution or Degradation?
- Authors: Gonzalo Martínez, Lauren Watson, Pedro Reviriego, José Alberto
Hernández, Marc Juarez, Rik Sarkar
- Abstract summary: Generative AI tools that can generate realistic images or text have taken the Internet by storm.
Future versions of generative AI tools will be trained with Internet data that is a mix of original and AI-generated data.
This raises a few intriguing questions: how will future versions of generative AI tools behave when trained on a mixture of real and AI-generated data?
- Score: 6.62688326060372
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the span of a few months, generative Artificial Intelligence (AI) tools
that can generate realistic images or text have taken the Internet by storm,
making them one of the technologies with the fastest adoption ever. Some of these
generative AI tools such as DALL-E, MidJourney, or ChatGPT have gained wide
public notoriety. Interestingly, these tools are possible because of the
massive amount of data (text and images) available on the Internet. The tools
are trained on massive data sets that are scraped from Internet sites. And now,
these generative AI tools are creating massive amounts of new data that are
being fed into the Internet. Therefore, future versions of generative AI tools
will be trained with Internet data that is a mix of original and AI-generated
data. As time goes on, a mixture of original data and data generated by
different versions of AI tools will populate the Internet. This raises a few
intriguing questions: how will future versions of generative AI tools behave
when trained on a mixture of real and AI-generated data? Will they evolve with
the new data sets or degenerate? Will evolution introduce biases in subsequent
generations of generative AI tools? In this document, we explore these
questions and report some very initial simulation results using a simple
image-generation AI tool. These results suggest that the quality of the
generated images degrades as more AI-generated data is used for training, thus
suggesting that generative AI may degenerate. Although these results are
preliminary and cannot be generalised without further study, they serve to
illustrate the potential issues of the interaction between generative AI and
the Internet.
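The feedback loop described in the abstract can be sketched with a toy model. This is not the authors' experiment, which uses an image-generation tool; it is a minimal illustration that assumes a one-dimensional Gaussian as a stand-in "generator". The function name `simulate_generations` and the parameter `ai_fraction` are hypothetical and introduced only for this sketch. Each generation refits the Gaussian on a training set that mixes original samples with samples drawn from the previous generation's fit, and the fitted spread is recorded so that any drift across generations can be inspected.

```python
import random
import statistics


def simulate_generations(real_data, n_generations, ai_fraction, seed=0):
    """Refit a toy Gaussian 'generator' on a mix of real and generated data.

    real_data     : list of floats standing in for original (human-made) data
    n_generations : number of retraining rounds
    ai_fraction   : share of each training set drawn from the previous model
                    (hypothetical knob; 0.0 = only real data, 1.0 = only generated)
    Returns the fitted standard deviation per generation; index 0 is the
    fit on real data alone.
    """
    rng = random.Random(seed)
    n = len(real_data)
    mu = statistics.fmean(real_data)
    sigma = statistics.pstdev(real_data)
    history = [sigma]
    for _ in range(n_generations):
        n_ai = round(ai_fraction * n)
        # Samples produced by the previous generation's model.
        generated = [rng.gauss(mu, sigma) for _ in range(n_ai)]
        # The rest of the training set is drawn from the original data.
        kept_real = rng.sample(real_data, n - n_ai)
        training_set = generated + kept_real
        # "Train" the next generation by refitting mean and spread.
        mu = statistics.fmean(training_set)
        sigma = statistics.pstdev(training_set)
        history.append(sigma)
    return history


if __name__ == "__main__":
    data_rng = random.Random(1)
    real = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
    for fraction in (0.0, 0.5, 1.0):
        spreads = simulate_generations(real, n_generations=20, ai_fraction=fraction)
        print(fraction, round(spreads[0], 3), round(spreads[-1], 3))
```

Comparing the first and last recorded spread for different values of `ai_fraction` gives a rough feel for how a model's output distribution can drift when it is repeatedly retrained on its own samples. A toy Gaussian says nothing about image quality, so the output should be read as an illustration of the setup rather than as evidence for the paper's findings.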
Related papers
- Measuring Human Contribution in AI-Assisted Content Generation [68.03658922067487]
This study raises the research question of measuring human contribution in AI-assisted content generation.
By calculating mutual information between human input and AI-assisted output relative to self-information of AI-assisted output, we quantify the proportional information contribution of humans in content generation.
arXiv Detail & Related papers (2024-08-27T05:56:04Z)
- Synthetic data: How could it be used for infectious disease research? [0.16752458252726457]
Concerns have been raised about potential negative factors associated with the possibilities of artificial dataset generation.
These include the potential misuse of generative artificial intelligence in fields such as cybercrime.
Synthetic data offers significant benefits, particularly in data privacy, research, in balancing datasets and reducing bias in machine learning models.
arXiv Detail & Related papers (2024-07-03T17:13:04Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- AI for the Generation and Testing of Ideas Towards an AI Supported Knowledge Development Environment [2.0305676256390934]
We discuss how generative AI can boost idea generation by eliminating human bias.
We also describe how search can verify facts, logic, and context.
This paper introduces a system for knowledge workers, Generate And Search Test, enabling individuals to efficiently create solutions.
arXiv Detail & Related papers (2023-07-17T22:17:40Z)
- Towards Understanding the Interplay of Generative Artificial Intelligence and the Internet [6.62688326060372]
Generative AI tools such as DALL-E, MidJourney, or ChatGPT can generate realistic images or text.
These tools are possible due to the massive amount of data (text and images) that is publicly available through the Internet.
Future versions of generative AI tools will be trained with a mix of human-created and AI-generated content.
This interaction raises many questions: how will future versions of generative AI tools behave when trained on a mixture of real and AI-generated data?
arXiv Detail & Related papers (2023-06-08T11:14:51Z)
- Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era [4.304168813971867]
Large AI tools, such as large language models, always require more and better quality data to continuously improve.
Current copyright laws limit their access to various types of data.
A completely new revenue-sharing business model, which must be almost independent of AI tools, needs to establish a prompt-based scoring system to measure data engagement.
arXiv Detail & Related papers (2023-05-04T05:21:09Z)
- Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images [66.20578637253831]
There is a growing concern that the advancement of artificial intelligence (AI) technology may produce fake photos.
This study aims to comprehensively evaluate agents for distinguishing state-of-the-art AI-generated visual content.
arXiv Detail & Related papers (2023-04-25T17:51:59Z)
- Data-centric Artificial Intelligence: A Survey [47.24049907785989]
Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI.
In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals.
We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle.
arXiv Detail & Related papers (2023-03-17T17:44:56Z)
- A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT [63.58711128819828]
ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC).
The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.
arXiv Detail & Related papers (2023-03-07T20:36:13Z)
- Empowering Things with Intelligence: A Survey of the Progress, Challenges, and Opportunities in Artificial Intelligence of Things [98.10037444792444]
We show how AI can empower the IoT to make it faster, smarter, greener, and safer.
First, we present progress in AI research for IoT from four perspectives: perceiving, learning, reasoning, and behaving.
Finally, we summarize some promising applications of AIoT that are likely to profoundly reshape our world.
arXiv Detail & Related papers (2020-11-17T13:14:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.