Balancing Transparency and Risk: The Security and Privacy Risks of
Open-Source Machine Learning Models
- URL: http://arxiv.org/abs/2308.09490v1
- Date: Fri, 18 Aug 2023 11:59:15 GMT
- Title: Balancing Transparency and Risk: The Security and Privacy Risks of
Open-Source Machine Learning Models
- Authors: Dominik Hintersdorf, Lukas Struppek, Kristian Kersting
- Abstract summary: We present a comprehensive overview of common privacy and security threats associated with the use of open-source models.
By raising awareness of these dangers, we strive to promote the responsible and secure use of AI systems.
- Score: 31.658006126446175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The field of artificial intelligence (AI) has experienced remarkable progress
in recent years, driven by the widespread adoption of open-source machine
learning models in both research and industry. Considering the
resource-intensive nature of training on vast datasets, many applications opt
for models that have already been trained. Hence, a small number of key players
undertake the responsibility of training and publicly releasing large
pre-trained models, providing a crucial foundation for a wide range of
applications. However, the adoption of these open-source models carries
inherent privacy and security risks that are often overlooked. To provide a
concrete example, an inconspicuous model may conceal hidden functionalities
that, when triggered by specific input patterns, can manipulate the behavior of
the system, such as instructing self-driving cars to ignore the presence of
other vehicles. The implications of successful privacy and security attacks
encompass a broad spectrum, ranging from relatively minor damage like service
interruptions to highly alarming scenarios, including physical harm or the
exposure of sensitive user data. In this work, we present a comprehensive
overview of common privacy and security threats associated with the use of
open-source models. By raising awareness of these dangers, we strive to promote
the responsible and secure use of AI systems.
Related papers
- Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions [12.451936012379319]
Large Language Models (LLMs) represent a significant advancement in artificial intelligence, finding applications across various domains.
Their reliance on massive internet-sourced datasets for training brings notable privacy issues.
Certain application-specific scenarios may require fine-tuning these models on private data.
arXiv Detail & Related papers (2024-08-10T05:41:19Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Threats, Attacks, and Defenses in Machine Unlearning: A Survey [14.03428437751312]
Machine Unlearning (MU) has recently gained considerable attention due to its potential to achieve Safe AI.
This survey aims to consolidate the extensive number of studies on threats, attacks, and defenses in machine unlearning.
arXiv Detail & Related papers (2024-03-20T15:40:18Z)
- Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of
Foundation Models [103.71308117592963]
We present an algorithm, MLAC, for training self-destructing models that leverages techniques from meta-learning and adversarial learning.
In a small-scale experiment, we show that MLAC can largely prevent a BERT-style model from being re-purposed to perform gender identification.
arXiv Detail & Related papers (2022-11-27T21:43:45Z)
- Learnware: Small Models Do Big [69.88234743773113]
The prevailing big model paradigm, which has achieved impressive results in natural language processing and computer vision applications, has not yet addressed those issues, while also becoming a serious source of carbon emissions.
This article offers an overview of the learnware paradigm, which aims to spare users from building machine learning models from scratch, in the hope of reusing small models for tasks even beyond their original purposes.
arXiv Detail & Related papers (2022-10-07T15:55:52Z)
- Knowledge Augmented Machine Learning with Applications in Autonomous
Driving: A Survey [37.84106999449108]
This work provides an overview of existing techniques and methods that combine data-driven models with existing knowledge.
The identified approaches are structured according to the categories of knowledge integration, extraction, and conformity.
In particular, we address the application of the presented methods in the field of autonomous driving.
arXiv Detail & Related papers (2022-05-10T07:25:32Z)
- From Machine Learning to Robotics: Challenges and Opportunities for
Embodied Intelligence [113.06484656032978]
This article argues that embodied intelligence is a key driver for the advancement of machine learning technology.
We highlight challenges and opportunities specific to embodied intelligence.
We propose research directions which may significantly advance the state-of-the-art in robot learning.
arXiv Detail & Related papers (2021-10-28T16:04:01Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks,
and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop a unified taxonomy of them.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
- Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)
- Green Lighting ML: Confidentiality, Integrity, and Availability of
Machine Learning Systems in Deployment [4.2317391919680425]
In production machine learning, there is generally a hand-off from those who build a model to those who deploy a model.
In this hand-off, the engineers responsible for model deployment are often not privy to the details of the model.
To help alleviate this issue, automated systems for validating the privacy and security of models need to be developed.
arXiv Detail & Related papers (2020-07-09T10:38:59Z)