A Forecasting-Based DLP Approach for Data Security
- URL: http://arxiv.org/abs/2312.13704v1
- Date: Thu, 21 Dec 2023 10:14:27 GMT
- Title: A Forecasting-Based DLP Approach for Data Security
- Authors: Kishu Gupta, Ashwani Kush
- Abstract summary: This paper uses statistical analysis of data to forecast a user's future data-access possibilities.
The proposed approach uses the well-known simple piecewise linear function to learn/train the model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sensitive data leakage is a major and growing problem faced by enterprises in this technical era. Data leakage poses severe threats to the safety of organizational data and badly damages the reputation of organizations. Data leakage is the flow of sensitive data/information from a data holder to an unauthorized destination. Data leak prevention (DLP) is a set of techniques that try to alleviate threats to data security. DLP unveils the guilty user responsible for a data leak, ensures that users without appropriate permission cannot access sensitive data, and protects sensitive data if it is shared accidentally. In this paper, a data leakage prevention (DLP) model is used to restrict or grant a user's data-access permission based on a forecast of their access to data. This study provides a DLP solution that uses statistical analysis of past data access to forecast a user's future data-access possibilities. The proposed approach uses the well-known simple piecewise linear function to learn/train the model. The results show that the proposed DLP approach can correctly classify users with a high level of precision, even in cases of extreme data access.
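The abstract describes the method only at a high level. As a hedged illustration (not the authors' implementation), the sketch below fits a simple two-segment piecewise linear function to a user's past access counts, extrapolates the most recent segment one step ahead, and grants or denies a request by comparing it against the forecast. Every name here (fit_piecewise_linear, access_decision, the tolerance parameter) is an illustrative assumption.

```python
# Illustrative sketch only: forecast a user's future data-access volume
# with a two-segment piecewise linear fit, then grant/deny a request.
import numpy as np

def fit_piecewise_linear(t, y):
    """Fit two linear segments, choosing the breakpoint with minimum SSE."""
    best = None
    for b in range(2, len(t) - 2):           # candidate breakpoints
        left = np.polyfit(t[:b], y[:b], 1)    # first segment (slope, intercept)
        right = np.polyfit(t[b:], y[b:], 1)   # second segment
        sse = (np.sum((np.polyval(left, t[:b]) - y[:b]) ** 2)
               + np.sum((np.polyval(right, t[b:]) - y[b:]) ** 2))
        if best is None or sse < best[0]:
            best = (sse, b, left, right)
    return best  # (sse, breakpoint, left_coeffs, right_coeffs)

def forecast_next(t, y):
    """Extrapolate the most recent linear segment one time step ahead."""
    _, _, _, right = fit_piecewise_linear(t, y)
    return float(np.polyval(right, t[-1] + 1))

def access_decision(history, requested, tolerance=1.25):
    """Grant if the requested volume stays within tolerance of the forecast."""
    t = np.arange(len(history), dtype=float)
    predicted = forecast_next(t, np.asarray(history, dtype=float))
    return "grant" if requested <= tolerance * max(predicted, 0.0) else "deny"

# Example: a user's daily access counts grow, then level off.
past_access = [3, 4, 6, 8, 10, 10, 11, 11, 12, 12]
print(access_decision(past_access, requested=13))  # near the forecast -> grant
print(access_decision(past_access, requested=60))  # extreme access    -> deny
```

A grid-searched breakpoint is the simplest way to fit a two-segment piecewise linear model; the paper's actual training procedure and decision threshold may differ.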
Related papers
- Poisoning Attacks to Local Differential Privacy Protocols for Trajectory Data [14.934626547047763]
Trajectory data, which tracks movements through geographic locations, is crucial for improving real-world applications.
Local differential privacy (LDP) offers a solution by allowing individuals to locally perturb their trajectory data before sharing it (a minimal perturbation sketch appears after this list).
Despite its privacy benefits, LDP protocols are vulnerable to data poisoning attacks, where attackers inject fake data to manipulate aggregated results.
arXiv Detail & Related papers (2025-03-06T02:31:45Z) - Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models [79.65071553905021]
We propose Data Advisor, a method for generating data that takes into account the characteristics of the desired dataset.
Data Advisor monitors the status of the generated data, identifies weaknesses in the current dataset, and advises the next iteration of data generation.
arXiv Detail & Related papers (2024-10-07T17:59:58Z) - Ungeneralizable Examples [70.76487163068109]
Current approaches to creating unlearnable data involve incorporating small, specially designed noise.
We extend the concept of unlearnable data to conditional data learnability and introduce UnGeneralizable Examples (UGEs).
UGEs exhibit learnability for authorized users while maintaining unlearnability for potential hackers.
arXiv Detail & Related papers (2024-04-22T09:29:14Z) - DataCook: Crafting Anti-Adversarial Examples for Healthcare Data Copyright Protection [47.91906879320081]
DataCook operates by "cooking" the raw data before distribution, enabling the development of models that perform normally on this processed data.
During the deployment phase, the original test data must also be "cooked" through DataCook to ensure normal model performance.
The mechanism behind DataCook is the crafting of anti-adversarial examples (AntiAdv), which are designed to enhance model confidence.
arXiv Detail & Related papers (2024-03-26T14:44:51Z) - A Learning oriented DLP System based on Classification Model [0.0]
Data leakage is the most critical issue being faced by organizations.
To mitigate data leakage issues, organizations deploy data leakage prevention systems (DLPSs) at various levels.
arXiv Detail & Related papers (2023-12-21T10:23:16Z) - Towards Generalizable Data Protection With Transferable Unlearnable
Examples [50.628011208660645]
We present a novel, generalizable data protection method by generating transferable unlearnable examples.
To the best of our knowledge, this is the first solution that examines data privacy from the perspective of data distribution.
arXiv Detail & Related papers (2023-05-18T04:17:01Z) - Stop Uploading Test Data in Plain Text: Practical Strategies for
Mitigating Data Contamination by Evaluation Benchmarks [70.39633252935445]
Data contamination has become prevalent and challenging with the rise of models pretrained on large automatically-crawled corpora.
For closed models, the training data becomes a trade secret, and even for open models, it is not trivial to detect contamination.
We propose three strategies that can make a difference: (1) test data made public should be encrypted with a public key and licensed to disallow derivative distribution; (2) demand training-exclusion controls from closed API holders, and protect your test data by refusing to evaluate without them; and (3) avoid data which appears with its solution on the internet, and release the web-page context of internet-derived data.
arXiv Detail & Related papers (2023-05-17T12:23:38Z) - Membership Inference Attacks against Synthetic Data through Overfitting
Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
arXiv Detail & Related papers (2023-02-24T11:27:39Z) - Secure Multiparty Computation for Synthetic Data Generation from
Distributed Data [7.370727048591523]
Legal and ethical restrictions on accessing relevant data inhibit data science research in critical domains such as health, finance, and education.
Existing approaches assume that the data holders supply their raw data to a trusted curator, who uses it as fuel for synthetic data generation.
We propose the first solution in which data holders only share encrypted data for differentially private synthetic data generation.
arXiv Detail & Related papers (2022-10-13T20:09:17Z) - Data Poisoning Attacks and Defenses to Crowdsourcing Systems [26.147716118854614]
We show that crowdsourcing is vulnerable to data poisoning attacks, in which malicious clients provide carefully crafted data to corrupt the aggregated data.
We propose two defenses to reduce the impact of malicious clients.
arXiv Detail & Related papers (2021-02-18T06:03:48Z) - Deep Directed Information-Based Learning for Privacy-Preserving Smart
Meter Data Release [30.409342804445306]
We study the problem in the context of time series data and smart meters (SMs) power consumption measurements.
We introduce the Directed Information (DI) as a more meaningful measure of privacy in the considered setting.
Our empirical studies on real-world datasets of SM measurements show, in the worst-case scenario, the existing trade-offs between privacy and utility.
arXiv Detail & Related papers (2020-11-20T13:41:11Z) - Privacy Preservation in Federated Learning: An insightful survey from
the GDPR Perspective [10.901568085406753]
This article surveys state-of-the-art privacy techniques that can be employed in federated learning.
Recent research has demonstrated that keeping data and computation local in FL is not enough to guarantee privacy.
This is because ML model parameters are exchanged between parties in an FL system and can be exploited in some privacy attacks.
arXiv Detail & Related papers (2020-11-10T21:41:25Z)
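The first related paper above relies on local perturbation of trajectory data. As a minimal, generic illustration (assumed here, not taken from that paper), the sketch below perturbs a single location point with planar Laplace noise in the style of geo-indistinguishability mechanisms; perturb_point and the epsilon value are illustrative assumptions, and coordinates are treated as abstract units rather than meters.

```python
# Generic illustrative sketch (not the cited paper's protocol): perturb one
# trajectory point locally with planar Laplace noise before sharing it.
import math
import random

def perturb_point(lat, lon, epsilon):
    """Planar Laplace noise: uniform angle, radius ~ Gamma(2, 1/epsilon)."""
    theta = random.uniform(0.0, 2.0 * math.pi)        # random direction
    # The sum of two Exp(epsilon) draws is a standard way to sample a
    # Gamma(shape=2, scale=1/epsilon) radius.
    r = random.expovariate(epsilon) + random.expovariate(epsilon)
    return lat + r * math.cos(theta), lon + r * math.sin(theta)

# Each user perturbs locally; only noisy points leave the device.
trajectory = [(40.7128, -74.0060), (40.7130, -74.0055)]
noisy = [perturb_point(lat, lon, epsilon=0.5) for lat, lon in trajectory]
print(noisy)
```

Only the perturbed points leave the user's device, which is the "local" part of LDP.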