You Still See Me: How Data Protection Supports the Architecture of ML
Surveillance
- URL: http://arxiv.org/abs/2402.06609v2
- Date: Sun, 18 Feb 2024 14:59:25 GMT
- Title: You Still See Me: How Data Protection Supports the Architecture of ML
Surveillance
- Authors: Rui-Jie Yew, Lucy Qin, Suresh Venkatasubramanian
- Abstract summary: Human data forms the backbone of machine learning.
Data protection laws have a strong bearing on how ML systems are governed.
We argue that data protection techniques are often instrumentalized in ways that support infrastructures of surveillance.
- Score: 6.731053352452566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human data forms the backbone of machine learning. Data protection laws thus
have a strong bearing on how ML systems are governed. Given that most
requirements in data protection laws accompany the processing of personal data,
organizations have an incentive to keep their data out of legal scope. This
makes the development and application of certain privacy-preserving
techniques--data protection techniques--an important strategy for ML
compliance. In this paper, we examine the impact of a rhetoric that deems data
wrapped in these techniques as data that is "good-to-go". We show how their
application in the development of ML systems--from private set intersection as
part of dataset curation to homomorphic encryption and federated learning as
part of model computation--can further support individual monitoring and data
consolidation. With data accumulation at the core of how the ML pipeline is
configured, we argue that data protection techniques are often instrumentalized
in ways that support infrastructures of surveillance, rather than in ways that
protect individuals associated with data. Finally, we propose technology and
policy strategies to evaluate data protection techniques in light of the
protections they actually confer. We conclude by highlighting the role that
technologists might play in devising policies that combat surveillance ML
technologies.
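To make the abstract's dataset-curation example concrete, here is a minimal sketch, not the paper's construction: two organizations link records by intersecting salted hashes of identifiers. Production private set intersection uses cryptographic protocols rather than bare hashing, but the output is the same, and it is the output, a join over individuals, that the paper argues supports monitoring and consolidation. All names and values below are hypothetical.

```python
# Minimal sketch of set-intersection-based record linkage, illustrating the
# abstract's point: even when raw identifiers are never exchanged, the
# *output* of the protocol links individuals across datasets. Real PSI uses
# cryptographic protocols (e.g., OPRF-based); salted hashing here is a
# simplified stand-in, not the paper's construction.
import hashlib

def blind(identifiers, shared_salt: bytes):
    """Each party hashes its identifiers with a salt agreed out-of-band."""
    return {hashlib.sha256(shared_salt + i.encode()).hexdigest(): i
            for i in identifiers}

# Hypothetical example: an ad platform and a retailer curating a joint
# training set without "sharing" their customer lists.
salt = b"agreed-nonce"
platform = blind({"alice@example.com", "bob@example.com"}, salt)
retailer = blind({"bob@example.com", "carol@example.com"}, salt)

# Only hashes cross the boundary, yet the intersection still singles out
# the same individual in both databases -- data consolidation in effect.
matched = platform.keys() & retailer.keys()
print([platform[h] for h in matched])  # ['bob@example.com']
```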
Related papers
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z) - State-of-the-Art Approaches to Enhancing Privacy Preservation of Machine Learning Datasets: A Survey [0.0]
This paper examines the evolving landscape of machine learning (ML) and its profound impact across various sectors.
It focuses on the emerging field of Privacy-preserving Machine Learning (PPML).
As ML applications become increasingly integral to industries like telecommunications, financial technology, and surveillance, they raise significant privacy concerns.
arXiv Detail & Related papers (2024-02-25T17:31:06Z) - Towards Generalizable Data Protection With Transferable Unlearnable
Examples [50.628011208660645]
We present a novel, generalizable data protection method by generating transferable unlearnable examples.
To the best of our knowledge, this is the first solution that examines data privacy from the perspective of data distribution.
arXiv Detail & Related papers (2023-05-18T04:17:01Z) - An Example of Privacy and Data Protection Best Practices for Biometrics
Data Processing in Border Control: Lesson Learned from SMILE [0.9442139459221784]
The misuse of data, compromise of individuals' privacy, and/or unauthorized processing of data may be irreversible.
This is partly due to the lack of methods and guidance for the integration of data protection and privacy by design in the system development process.
We present an example of privacy and data protection best practices to provide more guidance for data controllers and developers.
arXiv Detail & Related papers (2022-01-10T15:34:43Z) - Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows data owners to maintain data governance and perform model training locally without having to share their data.
FL and related techniques are often described as privacy-preserving.
We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
- Learning to Limit Data Collection via Scaling Laws: Data Minimization Compliance in Practice [62.44110411199835]
We build on literature in machine learning and law to propose a framework for limiting data collection based on an interpretation of data minimization that ties collection to system performance.
We formalize a data minimization criterion based on performance-curve derivatives and provide an effective and interpretable piecewise power law technique.
arXiv Detail & Related papers (2021-07-16T19:59:01Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop a unified taxonomy of these attacks.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
- Privacy Preservation in Federated Learning: An insightful survey from the GDPR Perspective [10.901568085406753]
This article surveys state-of-the-art privacy techniques that can be employed in federated learning.
Recent research has demonstrated that keeping data and computation on-device in FL is not, by itself, enough to guarantee privacy.
This is because the ML model parameters exchanged between parties in an FL system can be exploited in some privacy attacks.
arXiv Detail & Related papers (2020-11-10T21:41:25Z) - Trustworthy AI Inference Systems: An Industry Research View [58.000323504158054]
We provide an industry research view for approaching the design, deployment, and operation of trustworthy AI inference systems.
We highlight opportunities and challenges in AI systems using trusted execution environments.
We outline areas of further development that require the global collective attention of industry, academia, and government researchers.
arXiv Detail & Related papers (2020-08-10T23:05:55Z) - ML Privacy Meter: Aiding Regulatory Compliance by Quantifying the
Privacy Risks of Machine Learning [10.190911271176201]
Machine learning models pose an additional privacy risk to data by indirectly revealing information about it through model predictions and parameters.
There is an immediate need for a tool that can quantify this privacy risk.
We present ML Privacy Meter, a tool that can quantify the privacy risk to data from models using state-of-the-art membership inference attack techniques.
arXiv Detail & Related papers (2020-07-18T06:21:35Z)