Enhancement attacks in biomedical machine learning
- URL: http://arxiv.org/abs/2301.01885v2
- Date: Wed, 16 Aug 2023 22:31:41 GMT
- Title: Enhancement attacks in biomedical machine learning
- Authors: Matthew Rosenblatt, Javid Dadashkarimi, Dustin Scheinost
- Abstract summary: "Enhancement attacks," which falsely improve model performance, may be a greater threat to biomedical machine learning than attacks that degrade it.
We developed two techniques to drastically enhance prediction performance of classifiers with minimal changes to features.
Our results demonstrate the feasibility of minor data manipulations to achieve any desired prediction performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The prevalence of machine learning in biomedical research is rapidly growing,
yet the trustworthiness of such research is often overlooked. While some
previous works have investigated the ability of adversarial attacks to degrade
model performance in medical imaging, the ability to falsely improve
performance via recently-developed "enhancement attacks" may be a greater
threat to biomedical machine learning. In the spirit of developing attacks to
better understand trustworthiness, we developed two techniques to drastically
enhance prediction performance of classifiers with minimal changes to features:
1) general enhancement of prediction performance, and 2) enhancement of a
particular method over another. Our enhancement framework falsely improved
classifiers' accuracy from 50% to almost 100% while maintaining high feature
similarities between original and enhanced data (Pearson's r values > 0.99).
Similarly, the method-specific enhancement framework was effective in falsely
improving the performance of one method over another. For example, a simple
neural network outperformed logistic regression by 17% on our enhanced dataset,
although no performance differences were present in the original dataset.
Crucially, the original and enhanced data were still similar (r=0.99). Our
results demonstrate the feasibility of minor data manipulations to achieve any
desired prediction performance, which presents an interesting ethical challenge
for the future of biomedical machine learning. These findings emphasize the
need for more robust data provenance tracking and other precautionary measures
to ensure the integrity of biomedical machine learning research.
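The core claim of the abstract, that per-feature changes too small to notice can add up to a large change in accuracy, can be illustrated with a toy sketch. This is not the authors' technique, which the abstract does not describe. It assumes synthetic Gaussian features, a hypothetical sign pattern `w`, a hypothetical shift size `eps`, and a simple nearest-centroid classifier, all chosen only for illustration. The labels carry no signal in the original data. The altered copy shifts every feature by a tenth of a standard deviation in a label-dependent direction.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def centroid_accuracy(X, y, n_train):
    """Fit a nearest-centroid classifier on the first n_train rows and
    return its accuracy on the remaining rows."""
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in range(n_train) if y[i] == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    correct = 0
    for i in range(n_train, len(X)):
        dist = {
            label: sum((a - b) ** 2 for a, b in zip(X[i], centroids[label]))
            for label in (0, 1)
        }
        pred = 0 if dist[0] <= dist[1] else 1
        correct += pred == y[i]
    return correct / (len(X) - n_train)


def run_demo(n=400, p=1000, eps=0.1, seed=0):
    """Toy illustration only; n, p, eps, and seed are assumed values."""
    rng = random.Random(seed)
    # Original data: pure noise, so the balanced labels are unpredictable.
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [i % 2 for i in range(n)]
    # Hypothetical fixed sign pattern shared by all samples.
    w = [rng.choice((-1.0, 1.0)) for _ in range(p)]
    # Altered data: nudge each feature by +/- eps depending on the label.
    X_alt = [
        [x + eps * (1.0 if y[i] == 1 else -1.0) * wj for x, wj in zip(row, w)]
        for i, row in enumerate(X)
    ]
    acc_orig = centroid_accuracy(X, y, n_train=n // 2)
    acc_alt = centroid_accuracy(X_alt, y, n_train=n // 2)
    # Per-sample similarity between original and altered feature vectors.
    min_r = min(pearson(a, b) for a, b in zip(X, X_alt))
    return acc_orig, acc_alt, min_r


if __name__ == "__main__":
    acc_orig, acc_alt, min_r = run_demo()
    print(f"accuracy on original data: {acc_orig:.3f}")
    print(f"accuracy on altered data:  {acc_alt:.3f}")
    print(f"minimum per-sample Pearson r: {min_r:.4f}")
```

Under these assumed settings, accuracy on the original data should sit near chance and accuracy on the altered copy should be high, while every altered sample stays highly correlated with its original. A similarity check alone would therefore not flag the manipulation, which is why the abstract points to data provenance tracking instead.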
Related papers
- Class-specific Data Augmentation for Plant Stress Classification [8.433217399526521]
We propose an approach for automated class-specific data augmentation using a genetic algorithm.
We demonstrate the utility of our approach on soybean [Glycine max (L.) Merr] stress classification where symptoms are observed on leaves.
Our approach performs strongly, achieving a mean-per-class accuracy of 97.61% and an overall accuracy of 98% on the soybean leaf stress dataset.
arXiv Detail & Related papers (2024-06-18T22:01:25Z)
- Enhancing Activity Recognition After Stroke: Generative Adversarial Networks for Kinematic Data Augmentation [0.0]
Generalizability of machine learning models for wearable monitoring in stroke rehabilitation is often constrained by the limited scale and heterogeneity of available data.
Data augmentation addresses this challenge by adding computationally derived data to real data to enrich the variability represented in the training set.
This study employs Conditional Generative Adversarial Networks (cGANs) to create synthetic kinematic data from a publicly available dataset.
By training deep learning models on both synthetic and experimental data, we enhanced task classification accuracy: models incorporating synthetic data attained an overall accuracy of 80.0%, significantly higher than the 66.1% seen in models trained solely with real data.
arXiv Detail & Related papers (2024-06-12T15:51:00Z)
- Physical formula enhanced multi-task learning for pharmacokinetics prediction [54.13787789006417]
A major challenge for AI-driven drug discovery is the scarcity of high-quality data.
We develop a formula enhanced multi-task learning (PEMAL) method that predicts four key parameters of pharmacokinetics simultaneously.
Our experiments reveal that PEMAL significantly lowers the data demand, compared to typical Graph Neural Networks.
arXiv Detail & Related papers (2024-04-16T07:42:55Z)
- Explainability-Driven Leaf Disease Classification Using Adversarial Training and Knowledge Distillation [2.2823100315094624]
This work focuses on plant leaf disease classification and explores three crucial aspects: adversarial training, model explainability, and model compression.
Robustness can come at the cost of classification accuracy, with performance reductions of 3%-20% on regular tests and gains of 50%-70% on adversarial attack tests.
arXiv Detail & Related papers (2023-12-30T21:48:20Z)
- Dataset Optimization for Chronic Disease Prediction with Bio-Inspired Feature Selection [0.32634122554913997]
The study contributes to the advancement of predictive analytics in the realm of chronic diseases.
The potential impact of this work extends to early intervention, precision medicine, and improved patient outcomes.
arXiv Detail & Related papers (2023-12-17T18:18:34Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Embracing assay heterogeneity with neural processes for markedly improved bioactivity predictions [0.276240219662896]
Predicting the bioactivity of a ligand is one of the hardest and most important challenges in computer-aided drug discovery.
Despite years of data collection and curation efforts, bioactivity data remains sparse and heterogeneous.
We present a hierarchical meta-learning framework that exploits the information synergy across disparate assays.
arXiv Detail & Related papers (2023-08-17T16:26:58Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- Robust Trajectory Prediction against Adversarial Attacks [84.10405251683713]
Trajectory prediction using deep neural networks (DNNs) is an essential component of autonomous driving systems.
These methods are vulnerable to adversarial attacks, leading to serious consequences such as collisions.
In this work, we identify two key ingredients to defend trajectory prediction models against adversarial attacks.
arXiv Detail & Related papers (2022-07-29T22:35:05Z)
- Performance or Trust? Why Not Both. Deep AUC Maximization with Self-Supervised Learning for COVID-19 Chest X-ray Classifications [72.52228843498193]
In training deep learning models, a compromise often must be made between performance and trust.
In this work, we integrate a new surrogate loss with self-supervised learning for computer-aided screening of COVID-19 patients.
arXiv Detail & Related papers (2021-12-14T21:16:52Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.