PASTA-4-PHT: A Pipeline for Automated Security and Technical Audits for the Personal Health Train
- URL: http://arxiv.org/abs/2412.01275v1
- Date: Mon, 02 Dec 2024 08:43:40 GMT
- Title: PASTA-4-PHT: A Pipeline for Automated Security and Technical Audits for the Personal Health Train
- Authors: Sascha Welten, Karl Kindermann, Ahmet Polat, Martin Görz, Maximilian Jugl, Laurenz Neumann, Alexander Neumann, Johannes Lohmöller, Jan Pennekamp, Stefan Decker
- Abstract summary: This work discusses a PHT-aligned security and audit pipeline inspired by DevSecOps principles.
We introduce vulnerabilities into a PHT and apply our pipeline to five real-world PHTs, which have been utilised in real-world studies.
Ultimately, our work contributes to an increased security and overall transparency of data processing activities within the PHT framework.
- Score: 34.203290179252555
- License:
- Abstract: With the introduction of data protection regulations, the need for innovative privacy-preserving approaches to process and analyse sensitive data has become apparent. One approach is the Personal Health Train (PHT) that brings analysis code to the data and conducts the data processing at the data premises. However, despite its demonstrated success in various studies, the execution of external code in sensitive environments, such as hospitals, introduces new research challenges because the interactions of the code with sensitive data are often incomprehensible and lack transparency. These interactions raise concerns about potential effects on the data and increase the risk of data breaches. To address this issue, this work discusses a PHT-aligned security and audit pipeline inspired by DevSecOps principles. The automated pipeline incorporates multiple phases that detect vulnerabilities. To thoroughly study its versatility, we evaluate this pipeline in two ways. First, we deliberately introduce vulnerabilities into a PHT. Second, we apply our pipeline to five real-world PHTs, which have been utilised in real-world studies, to audit them for potential vulnerabilities. Our evaluation demonstrates that our designed pipeline successfully identifies potential vulnerabilities and can be applied to real-world studies. In compliance with the requirements of the GDPR for data management, documentation, and protection, our automated approach supports researchers in their data-intensive work and reduces manual overhead. It can be used as a decision-making tool to assess and document potential vulnerabilities in code for data processing. Ultimately, our work contributes to increased security and overall transparency of data processing activities within the PHT framework.
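The abstract describes the pipeline only at a high level. As a rough illustration of what a DevSecOps-style, multi-phase audit of train code could look like, the following Python sketch chains a static-analysis phase and a dependency-scan phase into a single JSON audit report. The phase set, the report layout, and the use of the `bandit` and `pip-audit` scanners are assumptions for illustration, not the paper's actual implementation.

```python
"""Minimal sketch of a multi-phase security-audit pipeline for PHT analysis code.

Assumptions (not from the paper): the phase names, the choice of `bandit`
(static analysis) and `pip-audit` (dependency scan) as concrete scanners,
and the JSON report layout are illustrative stand-ins.
"""
import json
import subprocess
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class PhaseResult:
    phase: str
    passed: bool
    findings: list = field(default_factory=list)

def run_static_analysis(code_dir: Path) -> PhaseResult:
    # Scan the train's analysis code for insecure constructs.
    proc = subprocess.run(
        ["bandit", "-r", str(code_dir), "-f", "json"],
        capture_output=True, text=True,
    )
    issues = json.loads(proc.stdout).get("results", []) if proc.stdout else []
    return PhaseResult("static-analysis", passed=not issues, findings=issues)

def run_dependency_scan(requirements: Path) -> PhaseResult:
    # Check declared dependencies against known-vulnerability databases.
    proc = subprocess.run(
        ["pip-audit", "-r", str(requirements), "-f", "json"],
        capture_output=True, text=True,
    )
    deps = json.loads(proc.stdout).get("dependencies", []) if proc.stdout else []
    flagged = [d for d in deps if d.get("vulns")]
    return PhaseResult("dependency-scan", passed=not flagged, findings=flagged)

def audit(code_dir: Path) -> dict:
    """Run all phases and assemble an audit report for documentation."""
    results = [
        run_static_analysis(code_dir),
        run_dependency_scan(code_dir / "requirements.txt"),
    ]
    return {
        "audited_path": str(code_dir),
        "passed": all(r.passed for r in results),
        "phases": [r.__dict__ for r in results],
    }

if __name__ == "__main__":
    print(json.dumps(audit(Path("./train-code")), indent=2))
```

A real pipeline would add further phases (e.g. container and secret scanning) and sign the resulting report so stations can verify it before executing the train.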
Related papers
- Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset [4.522849055040843]
This study audited the Helpful and Harmless dataset by Anthropic.
Our findings highlight the need for more nuanced, context-sensitive approaches to safety mitigation in large language models.
arXiv Detail & Related papers (2024-11-12T23:43:20Z)
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
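The summary names the core idea, an LLM driving a finite state machine (PSM), without detail. Below is a minimal, hypothetical sketch of such a loop: a transition table constrains which next states the LLM may pick, and a stub stands in for the actual LLM call. The states and transitions are invented for illustration and are not AutoPT's design.

```python
"""Illustrative state-machine-driven agent loop; states, transitions,
and the query_llm stub are hypothetical, not AutoPT's actual design."""
from enum import Enum, auto

class State(Enum):
    RECON = auto()
    EXPLOIT = auto()
    REPORT = auto()
    DONE = auto()

# Legal transitions constrain what the LLM may propose next.
TRANSITIONS = {
    State.RECON: {State.EXPLOIT},
    State.EXPLOIT: {State.RECON, State.REPORT},
    State.REPORT: {State.DONE},
}

def query_llm(state: State, history: list[str]) -> State:
    # Stand-in for an LLM call that picks the next state; a real agent
    # would prompt a model and validate its answer against TRANSITIONS.
    if state is State.RECON:
        return State.EXPLOIT
    if state is State.EXPLOIT:
        return State.REPORT if len(history) >= 2 else State.RECON
    return State.DONE

def run_agent(max_steps: int = 10) -> list[str]:
    state, history = State.RECON, []
    while state is not State.DONE and len(history) < max_steps:
        nxt = query_llm(state, history)
        assert nxt in TRANSITIONS[state], "illegal transition proposed"
        history.append(f"{state.name} -> {nxt.name}")
        state = nxt
    return history

print(run_agent())
```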
arXiv Detail & Related papers (2024-11-02T13:24:30Z)
- Data Quality Issues in Vulnerability Detection Datasets [1.6114012813668932]
Vulnerability detection is a crucial yet challenging task to identify potential weaknesses in software for cyber security.
Deep learning (DL) has made great progress in automating the detection process.
Many datasets have been created to train DL models for this purpose.
However, these datasets suffer from several issues that lead to low detection accuracy in DL models.
arXiv Detail & Related papers (2024-10-08T13:31:29Z)
- EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models, serving as the "brain" of EAI agents for high-level task planning, have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- Generative AI for Secure and Privacy-Preserving Mobile Crowdsensing [74.58071278710896]
Generative AI has attracted much attention from both academia and industry.
Secure and privacy-preserving mobile crowdsensing (SPPMCS) has been widely applied in data collection and acquisition.
arXiv Detail & Related papers (2024-05-17T04:00:58Z)
- Secure and Verifiable Data Collaboration with Low-Cost Zero-Knowledge Proofs [30.260427020479536]
In this paper, we propose RiseFL, a novel and highly efficient solution for secure and verifiable data collaboration.
First, we devise a probabilistic integrity check method that significantly reduces the cost of ZKP generation and verification.
We also design a hybrid commitment scheme to satisfy Byzantine robustness with improved performance.
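RiseFL's actual check is built on zero-knowledge proofs; the toy sketch below only illustrates the sampling intuition behind a probabilistic integrity check: verify a random subset of commitments instead of all of them, trading a bounded miss probability for a large cost reduction. The hash commitments and parameters are illustrative, not the paper's construction.

```python
"""Toy probabilistic integrity check (not RiseFL's ZKP construction)."""
import hashlib
import random

def commit(chunk: bytes) -> str:
    # Hash commitment stand-in for the real cryptographic commitments.
    return hashlib.sha256(chunk).hexdigest()

def probabilistic_check(chunks, commitments, k, rng=random):
    """Verify k randomly sampled chunks against their commitments.
    A single corrupted chunk is caught with probability k/n; corrupting
    more chunks drives the detection probability toward 1."""
    indices = rng.sample(range(len(chunks)), k)
    return all(commit(chunks[i]) == commitments[i] for i in indices)

# Honest commitments over 1000 chunks, then one chunk is tampered with.
data = [bytes([i % 256]) * 32 for i in range(1000)]
comms = [commit(c) for c in data]
data[7] = b"tampered".ljust(32, b"\x00")

# Checking only k=200 of 1000 chunks costs 20% of full verification and
# still flags this single corruption with probability 200/1000 = 20%.
print(probabilistic_check(data, comms, k=200))
```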
arXiv Detail & Related papers (2023-11-26T14:19:46Z)
- Towards Generalizable Data Protection With Transferable Unlearnable Examples [50.628011208660645]
We present a novel, generalizable data protection method by generating transferable unlearnable examples.
To the best of our knowledge, this is the first solution that examines data privacy from the perspective of data distribution.
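The transferable, distribution-level construction is the paper's contribution; the sketch below shows only the classic per-sample error-minimizing-noise recipe that unlearnable examples build on, with a toy model, to make the "minimise the training loss with respect to the noise" step concrete. All shapes and hyperparameters are illustrative.

```python
"""Classic error-minimizing noise (not the paper's transferable variant):
delta is optimised to *minimise* training loss, so poisoned samples look
"already learned" and carry little training signal."""
import torch
import torch.nn.functional as F

def make_unlearnable(model, x, y, eps=8 / 255, steps=20, lr=0.1):
    """Per-sample noise, clipped to an L-inf ball of radius eps."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()  # descend: minimise the loss
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (x + delta).detach()

# Toy usage with a tiny frozen linear "model" on fake image batches.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
model.requires_grad_(False)
x, y = torch.rand(16, 3, 32, 32), torch.randint(0, 10, (16,))
x_unlearnable = make_unlearnable(model, x, y)
print((x_unlearnable - x).abs().max())  # bounded by eps
```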
arXiv Detail & Related papers (2023-05-18T04:17:01Z)
- Bringing the Algorithms to the Data -- Secure Distributed Medical Analytics using the Personal Health Train (PHT-meDIC) [1.451998131020241]
The Personal Health Train (PHT) implements an 'algorithm to the data' paradigm.
We present PHT-meDIC, a productively deployed open-source implementation of the PHT concept.
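As a concrete picture of the 'algorithm to the data' step, the sketch below uses the Docker SDK to run a hypothetical train image against locally mounted, read-only data with networking disabled. The image name, mount paths, and results layout are invented; PHT-meDIC's production setup adds encryption, signing, and registry orchestration on top.

```python
"""Minimal sketch of executing a train at a station: only results,
never raw records, leave the premises. All names and paths are
hypothetical stand-ins, not PHT-meDIC's actual configuration."""
import docker  # pip install docker

client = docker.from_env()

# Pull the (hypothetical) train image holding the analysis code.
client.images.pull("registry.example.org/trains/cohort-analysis:1.0")

# Run it with the station's data mounted read-only; results go to a
# dedicated volume that is later shipped back to the researcher.
logs = client.containers.run(
    "registry.example.org/trains/cohort-analysis:1.0",
    volumes={
        "/opt/station/data": {"bind": "/data", "mode": "ro"},
        "/opt/station/results": {"bind": "/results", "mode": "rw"},
    },
    network_disabled=True,  # the analysis code gets no network access
    remove=True,
)
print(logs.decode())
```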
arXiv Detail & Related papers (2022-12-07T06:29:15Z)
- VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
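Those numbers come from trained graph and sequence models; the toy sketch below keeps only the ensemble-and-rank logic, with random scorers standing in for the paper's GNN and transformer.

```python
"""Toy sketch of VELVET's core idea: fuse a graph-based and a
sequence-based scorer and rank statements by the combined score.
Both scorers are random stand-ins for the paper's trained models."""
import torch

def graph_model(num_statements: int) -> torch.Tensor:
    # Stand-in for a GNN scoring each statement from the program graph.
    return torch.rand(num_statements)

def sequence_model(num_statements: int) -> torch.Tensor:
    # Stand-in for a transformer scoring each statement from the tokens.
    return torch.rand(num_statements)

def locate_vulnerable(num_statements: int, top_k: int = 1) -> list[int]:
    # Average the two views, then return the top-k statement indices.
    scores = 0.5 * graph_model(num_statements) + 0.5 * sequence_model(num_statements)
    return scores.topk(top_k).indices.tolist()

print(locate_vulnerable(num_statements=40, top_k=3))
```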
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
- Privacy Preservation in Federated Learning: An insightful survey from the GDPR Perspective [10.901568085406753]
This article surveys state-of-the-art privacy-preservation techniques that can be employed in federated learning (FL).
Recent research has demonstrated that keeping data and computation local in FL is not enough to guarantee privacy.
This is because the ML model parameters exchanged between parties in an FL system can be exploited in some privacy attacks.
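To make that attack surface concrete: in the minimal federated-averaging round below, raw data never leaves the clients, yet every client's updated parameters cross the wire, and those updates are what attacks such as gradient inversion target. The linear model and data are toy placeholders, not from the survey.

```python
"""Minimal FedAvg round in NumPy: the server sees client updates,
never raw data, and those updates are the privacy attack surface."""
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1):
    # One gradient step of linear regression on the client's private data.
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

global_w = np.zeros(5)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]

for _ in range(5):
    # Each client trains locally and ships its updated weights upstream.
    updates = [local_update(global_w, X, y) for X, y in clients]
    # These exchanged parameters are what privacy attacks exploit.
    global_w = np.mean(updates, axis=0)

print(global_w.round(3))
```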
arXiv Detail & Related papers (2020-11-10T21:41:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.