Identifying and Mitigating the Security Risks of Generative AI
- URL: http://arxiv.org/abs/2308.14840v4
- Date: Fri, 29 Dec 2023 00:30:34 GMT
- Title: Identifying and Mitigating the Security Risks of Generative AI
- Authors: Clark Barrett, Brad Boyd, Elie Bursztein, Nicholas Carlini, Brad Chen,
Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil
Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha,
Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar
Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang
- Abstract summary: This paper reports the findings of a workshop held at Google on the dual-use dilemma posed by GenAI.
GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks.
We discuss short-term and long-term goals for the community on this topic.
- Score: 179.2384121957896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Every major technical invention resurfaces the dual-use dilemma -- the new
technology has the potential to be used for good as well as for harm.
Generative AI (GenAI) techniques, such as large language models (LLMs) and
diffusion models, have shown remarkable capabilities (e.g., in-context
learning, code-completion, and text-to-image generation and editing). However,
GenAI can be used just as well by attackers to generate new attacks and
increase the velocity and efficacy of existing attacks.
This paper reports the findings of a workshop held at Google (co-organized by
Stanford University and the University of Wisconsin-Madison) on the dual-use
dilemma posed by GenAI. This paper is not meant to be comprehensive, but is
rather an attempt to synthesize some of the interesting findings from the
workshop. We discuss short-term and long-term goals for the community on this
topic. We hope this paper provides both a launching point for a discussion on
this important topic as well as interesting problems that the research
community can work to address.
Related papers
- Generative artificial intelligence in dentistry: Current approaches and future challenges [0.0]
Generative AI (GenAI) models bridge the usability gap of AI by providing a natural language interface to interact with complex models.
In dental education, students now have the opportunity to answer a plethora of questions simply by prompting a GenAI model.
GenAI can also be used in dental research, where the applications range from new drug discovery to assistance in academic writing.
arXiv Detail & Related papers (2024-07-24T03:33:47Z)
- Securing the Future of GenAI: Policy and Technology [50.586585729683776]
Governments globally are grappling with the challenge of regulating GenAI, balancing innovation against safety.
A workshop co-organized by Google, the University of Wisconsin-Madison, and Stanford University aimed to bridge this gap between GenAI policy and technology.
This paper summarizes the workshop discussions, which addressed questions such as: how can regulation be designed without hindering technological progress?
arXiv Detail & Related papers (2024-05-21T20:30:01Z)
- Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda [1.8592384822257952]
We elaborate on why XAI has gained importance with the rise of GenAI and its challenges for explainability research.
We also unveil novel and emerging desiderata that explanations should fulfill, covering aspects such as verifiability, interactivity, security, and cost.
arXiv Detail & Related papers (2024-04-15T08:18:16Z)
- The World of Generative AI: Deepfakes and Large Language Models [2.216882190540723]
Deepfakes and Large Language Models (LLMs) are two examples of Generative Artificial Intelligence (GenAI).
Deepfakes, in particular, pose an alarming threat to society as they are capable of spreading misinformation and distorting the truth.
This article examines the interrelationship between them.
arXiv Detail & Related papers (2024-02-06T20:18:32Z)
- Prompt Smells: An Omen for Undesirable Generative AI Outputs [4.105236597768038]
We propose two new concepts that will aid the research community in addressing limitations associated with the application of GenAI models.
First, we propose a definition for the "desirability" of GenAI outputs and three factors which are observed to influence it.
Second, drawing inspiration from Martin Fowler's code smells, we propose the concept of "prompt smells" and the adverse effects they are observed to have on the desirability of GenAI outputs.
arXiv Detail & Related papers (2024-01-23T10:10:01Z)
- Towards more Practical Threat Models in Artificial Intelligence Security [66.67624011455423]
Recent works have identified a gap between research and practice in artificial intelligence security.
We revisit the threat models of the six most studied attacks in AI security research and match them to AI usage in practice.
arXiv Detail & Related papers (2023-11-16T16:09:44Z)
- GenAI Against Humanity: Nefarious Applications of Generative Artificial Intelligence and Large Language Models [11.323961700172175]
This article serves as a synthesis of rigorous research presented on the risks of GenAI and misuse of LLMs.
We'll uncover the societal implications that ripple through the GenAI revolution we are witnessing.
The lines between the virtual and the real worlds are blurring, and the consequences of GenAI's potential nefarious applications affect us all.
arXiv Detail & Related papers (2023-10-01T17:25:56Z)
- A LLM Assisted Exploitation of AI-Guardian [57.572998144258705]
We evaluate the robustness of AI-Guardian, a recent defense to adversarial examples published at IEEE S&P 2023.
We write none of the code to attack this model, and instead prompt GPT-4 to implement all attack algorithms following our instructions and guidance.
This process was surprisingly effective and efficient, with the language model at times producing code from ambiguous instructions faster than the author of this paper could have done.
arXiv Detail & Related papers (2023-07-20T17:33:25Z)
- Threat of Adversarial Attacks on Deep Learning in Computer Vision: Survey II [86.51135909513047]
Deep Learning is vulnerable to adversarial attacks that can manipulate its predictions.
This article reviews the contributions made by the computer vision community in adversarial attacks on deep learning.
It provides definitions of technical terminologies for non-experts in this domain.
arXiv Detail & Related papers (2021-08-01T08:54:47Z)
- Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News [57.9843300852526]
We introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions.
To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of 4 different types of generated articles.
In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies.
arXiv Detail & Related papers (2020-09-16T14:13:15Z)