Encrypted machine learning of molecular quantum properties
- URL: http://arxiv.org/abs/2212.04322v1
- Date: Mon, 5 Dec 2022 11:04:08 GMT
- Title: Encrypted machine learning of molecular quantum properties
- Authors: Jan Weinreich, Guido Falk von Rudorff, O. Anatole von Lilienfeld
- Abstract summary: Encrypting the prediction process can solve this problem by double-blind model evaluation.
We implement secure and computationally feasible encrypted machine learning models.
We find that encrypted predictions using kernel ridge regression models are a million times more expensive than without encryption.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large machine learning models with improved predictions have become widely
available in the chemical sciences. Unfortunately, these models do not protect
the privacy necessary within commercial settings, prohibiting the use of
potentially extremely valuable data by others. Encrypting the prediction
process can solve this problem by enabling double-blind model evaluation and by
prohibiting the extraction of training or query data. However, contemporary ML
models based on fully homomorphic encryption or federated learning are either
too expensive for practical use or must trade higher speed for weaker security.
We have implemented secure and computationally feasible encrypted machine
learning models using oblivious transfer, enabling secure predictions of
molecular quantum properties across chemical compound space. However, we find that
encrypted predictions using kernel ridge regression models are a million times
more expensive than without encryption. This demonstrates a dire need for a
compact machine learning model architecture, including molecular representation
and kernel matrix size, that minimizes model evaluation costs.
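The abstract's closing point, that encrypted evaluation cost is governed by the molecular representation and the kernel matrix size, follows directly from how a kernel ridge regression prediction is computed. Below is a minimal plaintext sketch with toy data and hypothetical hyperparameters (the representation, kernel width, and regularization are illustrative, not the paper's settings). Each query requires N kernel evaluations over d-dimensional representations; in the encrypted setting, every one of these arithmetic operations has to be carried out under the oblivious-transfer protocol.

```python
import numpy as np

# Minimal plaintext sketch of kernel ridge regression (KRR) prediction for a
# molecular property. All sizes and hyperparameters are illustrative
# placeholders, not the descriptors or settings used in the paper.

def gaussian_kernel_row(x, X_train, sigma):
    """k(x, x_i) = exp(-||x - x_i||^2 / (2 sigma^2)) against all training points."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return np.exp(-dists**2 / (2.0 * sigma**2))

def krr_train(X_train, y_train, sigma, lam):
    """One-time offline solve of (K + lam*I) alpha = y on the model owner's side."""
    sq = np.sum(X_train**2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X_train @ X_train.T)
               / (2.0 * sigma**2))
    return np.linalg.solve(K + lam * np.eye(len(y_train)), y_train)

def krr_predict(x_query, X_train, alpha, sigma):
    """Per-query cost: N kernel evaluations of d-dimensional representations."""
    return gaussian_kernel_row(x_query, X_train, sigma) @ alpha

# Toy stand-ins for molecular representations and a quantum property.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 128))   # N = 500 molecules, d = 128 features
y_train = rng.normal(size=500)          # e.g. atomization energies (arbitrary units)
alpha = krr_train(X_train, y_train, sigma=10.0, lam=1e-8)
print(krr_predict(rng.normal(size=128), X_train, alpha, sigma=10.0))
```

Shrinking either N (the kernel matrix size) or d (the representation length) reduces the per-prediction work linearly, which matters far more once every multiplication is executed as an encrypted, interactive operation; this is the sense in which a compact model architecture directly lowers encrypted inference cost.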
Related papers
- Chemical knowledge-informed framework for privacy-aware retrosynthesis learning [60.93245342663455]
Current machine learning-based retrosynthesis gathers reaction data from multiple sources into one single edge to train prediction models.
This paradigm poses considerable privacy risks as it necessitates broad data availability across organizational boundaries.
In the present study, we introduce the chemical knowledge-informed framework (CKIF), a privacy-preserving approach for learning retrosynthesis models.
arXiv Detail & Related papers (2025-02-26T13:13:24Z)
- Training quantum machine learning models on cloud without uploading the data [0.0]
We propose a method that runs the parameterized quantum circuits before encoding the input data.
This enables a dataset owner to train machine learning models on quantum cloud platforms.
It is also capable of encoding a vast amount of data effectively at a later time using classical computations.
arXiv Detail & Related papers (2024-09-06T20:14:52Z)
- Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs)
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov Chain to draw samples from the model.
We adapt the T5 model for iteratively-refined parallel decoding, achieving 2-3x speedup in machine translation with minimal sacrifice in quality.
arXiv Detail & Related papers (2024-07-22T18:00:00Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Learning in the Dark: Privacy-Preserving Machine Learning using Function Approximation [1.8907108368038215]
Learning in the Dark is a privacy-preserving machine learning model that can classify encrypted images with high accuracy.
It achieves high-accuracy predictions by performing computations directly on encrypted data.
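A minimal plaintext sketch of the function-approximation idea summarized above: homomorphic encryption schemes natively support only additions and multiplications, so non-polynomial activations are typically replaced by low-degree polynomial fits before inference on ciphertexts. The sigmoid target, fitting range, and degree below are illustrative assumptions, not the construction used in that paper.

```python
import numpy as np

# Homomorphic encryption evaluates polynomials (adds and multiplies), so a
# non-polynomial activation such as the sigmoid is approximated by a low-degree
# polynomial over the expected input range. Range and degree are illustrative.

xs = np.linspace(-6.0, 6.0, 1000)
sigmoid = 1.0 / (1.0 + np.exp(-xs))

coeffs = np.polyfit(xs, sigmoid, deg=3)      # least-squares cubic fit
approx = np.polyval(coeffs, xs)

print("max abs error of the cubic approximation:", np.abs(approx - sigmoid).max())
# In an encrypted pipeline only the polynomial would ever be evaluated on
# ciphertexts; the exact sigmoid is never computed.
```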
arXiv Detail & Related papers (2023-09-15T06:45:58Z)
- QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model [10.543277412560233]
Security has always been a critical issue in machine learning (ML) applications.
Model-stealing attacks are among the most fundamental and important of these issues.
We propose a novel framework, namely QuMoS, to preserve model security.
arXiv Detail & Related papers (2023-04-23T01:17:43Z)
- nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales [65.01417261415833]
We present an approach to predict the pre-training loss based on our observations that Maximal Update Parametrization (muP) enables accurate fitting of scaling laws.
With around 14% of the one-time pre-training cost, we can accurately forecast the loss for models up to 52B.
Our goal with nanoLM is to empower researchers with limited resources to reach meaningful conclusions on large models.
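As a rough illustration of loss forecasting from small-scale runs, the sketch below fits a simple power-law scaling form to hypothetical (model size, loss) measurements and extrapolates it. The functional form, the data points, and the 52B evaluation point are assumptions for illustration, not nanoLM's muP-based procedure or its results.

```python
import numpy as np
from scipy.optimize import curve_fit

# Toy power-law scaling fit: L(N) = a * N^(-b) + c, with N the parameter count.
def scaling_law(n_params, a, b, c):
    return a * n_params ** (-b) + c

# Hypothetical losses measured on a ladder of small models (illustrative only).
sizes = np.array([1e7, 3e7, 1e8, 3e8, 1e9])
losses = np.array([4.1, 3.7, 3.3, 3.0, 2.8])

popt, _ = curve_fit(scaling_law, sizes, losses, p0=[10.0, 0.1, 1.5], maxfev=10000)
print("forecast loss at 52B parameters:", scaling_law(52e9, *popt))
```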
arXiv Detail & Related papers (2023-04-14T00:45:01Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Partially Oblivious Neural Network Inference [4.843820624525483]
We show that for neural network models, like CNNs, some information leakage can be acceptable.
We experimentally demonstrate that in a CIFAR-10 network we can leak up to 80% of the model's weights with practically no security impact.
arXiv Detail & Related papers (2022-10-27T05:39:36Z)
- Privacy-Preserving Chaotic Extreme Learning Machine with Fully Homomorphic Encryption [5.010425616264462]
We propose a Chaotic Extreme Learning Machine and its encrypted form using Fully Homomorphic Encryption.
Our proposed method has performed either better or similar to the Traditional Extreme Learning Machine on most of the datasets.
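For context on the entry above, an extreme learning machine keeps its hidden-layer weights fixed (generated by a chaotic map in that paper) and trains only a linear readout with a single closed-form solve. The plaintext sketch below shows that structure with plain random weights; the chaotic weight generation and the fully homomorphic encryption layer are not reproduced here.

```python
import numpy as np

# Plaintext sketch of an extreme learning machine (ELM): fixed random hidden
# weights plus a closed-form ridge solve for the output layer. The cited paper
# generates the hidden weights with a chaotic map and evaluates the model under
# fully homomorphic encryption; neither is reproduced here.

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))           # toy inputs
y = np.sin(X.sum(axis=1))                # toy regression target

W = rng.normal(size=(10, 64))            # fixed (untrained) hidden weights
b = rng.normal(size=64)
H = np.tanh(X @ W + b)                   # hidden activations

lam = 1e-3                               # ridge readout: (H^T H + lam I) beta = H^T y
beta = np.linalg.solve(H.T @ H + lam * np.eye(64), H.T @ y)

x_new = rng.normal(size=(1, 10))
print(np.tanh(x_new @ W + b) @ beta)     # prediction for a new toy input
```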
arXiv Detail & Related papers (2022-08-04T11:29:52Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- Defence against adversarial attacks using classical and quantum-enhanced Boltzmann machines [64.62510681492994]
Generative models attempt to learn the distribution underlying a dataset, making them inherently more robust to small perturbations.
We find improvements ranging from 5% to 72% against attacks with Boltzmann machines on the MNIST dataset.
arXiv Detail & Related papers (2020-12-21T19:00:03Z)