Setting the Course, but Forgetting to Steer: Analyzing Compliance with GDPR's Right of Access to Data by Instagram, TikTok, and YouTube
- URL: http://arxiv.org/abs/2502.11208v1
- Date: Sun, 16 Feb 2025 17:15:11 GMT
- Title: Setting the Course, but Forgetting to Steer: Analyzing Compliance with GDPR's Right of Access to Data by Instagram, TikTok, and YouTube
- Authors: Sai Keerthana Karnam, Abhisek Dash, Sepehr Mousavi, Stefan Bechtold, Krishna P. Gummadi, Animesh Mukherjee, Ingmar Weber, Savvas Zannettou
- Abstract summary: We assess the comprehensibility and reliability of data download packages (DDPs) provided by three major social media platforms.
By recruiting 400 participants across four countries, we assess the comprehensibility of DDPs across various requirements.
Also, by leveraging automated bots and user-donated DDPs, we evaluate the reliability of DDPs across the three platforms.
- Score: 13.933510872380742
- License:
- Abstract: The comprehensibility and reliability of data download packages (DDPs) provided under the General Data Protection Regulation's (GDPR) right of access are vital for both individuals and researchers. These DDPs enable users to understand and control their personal data, yet issues like complexity and incomplete information often limit their utility. Also, despite their growing use in research to study emerging online phenomena, little attention has been given to systematically assessing the reliability and comprehensibility of DDPs. To bridge this research gap, in this work, we perform a comparative analysis to assess the comprehensibility and reliability of DDPs provided by three major social media platforms, namely, TikTok, Instagram, and YouTube. By recruiting 400 participants across four countries, we assess the comprehensibility of DDPs across various requirements, including conciseness, transparency, intelligibility, and clear and plain language. Also, by leveraging automated bots and user-donated DDPs, we evaluate the reliability of DDPs across the three platforms. Among other things, we find notable differences across the three platforms in the data categories included in DDPs, inconsistencies in adherence to the GDPR requirements, and gaps in the reliability of the DDPs across platforms. Finally, using large language models, we demonstrate the feasibility of easily providing more comprehensible DDPs.
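The reliability evaluation described in the abstract can be illustrated with a minimal sketch (the DDP structure, field names, and `ddp_recall` function are hypothetical illustrations, not the paper's actual pipeline): a bot logs its own actions on a platform, and we then check what fraction of those actions show up in the downloaded DDP.

```python
import json

def ddp_recall(bot_actions, ddp_json):
    """Fraction of bot-performed actions that appear in the DDP.

    `bot_actions` is a set of (action_type, item_id) tuples logged by
    the bot; `ddp_json` is a JSON string with a hypothetical structure
    {"activity": [{"type": ..., "item": ...}, ...]}.
    """
    ddp = json.loads(ddp_json)
    recorded = {(e["type"], e["item"]) for e in ddp.get("activity", [])}
    if not bot_actions:
        return 1.0
    return len(bot_actions & recorded) / len(bot_actions)

# Example: the bot liked two videos and followed one account,
# but the DDP only records the likes.
actions = {("like", "v1"), ("like", "v2"), ("follow", "u9")}
ddp = json.dumps({"activity": [
    {"type": "like", "item": "v1"},
    {"type": "like", "item": "v2"},
]})
print(ddp_recall(actions, ddp))  # 2 of 3 actions recovered
```

A reliability gap of this kind (here, the missing follow) is the sort of inconsistency the study reports across platforms.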
Related papers
- Adaptive PII Mitigation Framework for Large Language Models [2.694044579874688]
This paper introduces an adaptive system for mitigating the risk of Personally Identifiable Information (PII) and Sensitive Personal Information (SPI) exposure.
The system uses advanced NLP techniques, context-aware analysis, and policy-driven masking to ensure regulatory compliance.
Benchmarks highlight the system's effectiveness, with an F1 score of 0.95 for Passport Numbers.
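The policy-driven masking idea can be sketched in a few lines (the `POLICY` table and regex patterns are invented for illustration; real passport formats vary by country, and the paper's system uses context-aware NLP rather than plain regexes):

```python
import re

# Hypothetical policy: a regex pattern and mask token per PII category.
POLICY = {
    "passport": (re.compile(r"\b[A-Z]{1,2}\d{7}\b"), "[PASSPORT]"),
    "email": (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
}

def mask_pii(text):
    """Replace matches of each policy pattern with its mask token."""
    for pattern, mask in POLICY.values():
        text = pattern.sub(mask, text)
    return text

print(mask_pii("Passport A1234567 issued to jane.doe@example.org"))
# → Passport [PASSPORT] issued to [EMAIL]
```

Benchmarking such a masker amounts to comparing its substitutions against labeled PII spans, which is how a per-category F1 score like the one reported for passport numbers would be computed.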
arXiv Detail & Related papers (2025-01-21T19:22:45Z)
- Are Data Experts Buying into Differentially Private Synthetic Data? Gathering Community Perspectives [14.736115103446101]
In the United States, differential privacy (DP) is the dominant technical operationalization of privacy-preserving data analysis.
This study qualitatively examines one class of DP mechanisms: private data synthesizers.
arXiv Detail & Related papers (2024-12-17T15:50:14Z)
- PASTA-4-PHT: A Pipeline for Automated Security and Technical Audits for the Personal Health Train [34.203290179252555]
This work discusses a PHT-aligned security and audit pipeline inspired by DevSecOps principles.
We introduce vulnerabilities into a PHT and apply our pipeline to five real-world PHTs, which have been utilised in real-world studies.
Ultimately, our work contributes to an increased security and overall transparency of data processing activities within the PHT framework.
arXiv Detail & Related papers (2024-12-02T08:43:40Z)
- An Empirical Study on Compliance with Ranking Transparency in the Software Documentation of EU Online Platforms [7.461555266672227]
This study empirically evaluates the compliance of six major platforms (Amazon, Bing, Booking, Google, Tripadvisor, and Yahoo) with ranking transparency requirements.
We introduce and test automated compliance assessment tools based on ChatGPT and information retrieval technology.
Our findings could help enhance regulatory compliance and align with the United Nations Sustainable Development Goal 10.3.
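A crude sketch of the information-retrieval side of such an automated compliance check (the requirement keywords, threshold-free scoring, and `coverage_score` function are invented for illustration; the paper's tools combine ChatGPT with retrieval):

```python
# Invented requirement keywords for illustration; a real checker would
# derive these from the actual ranking-transparency guideline text.
REQUIRED_CONCEPTS = ["main parameters", "relative importance", "ranking"]

def coverage_score(doc_text):
    """Fraction of required concepts mentioned in the documentation."""
    text = doc_text.lower()
    hits = sum(1 for concept in REQUIRED_CONCEPTS if concept in text)
    return hits / len(REQUIRED_CONCEPTS)

doc = "Our ranking considers the main parameters of each listing."
print(coverage_score(doc))  # mentions 2 of the 3 concepts
```

Low coverage flags documentation pages for closer (LLM-assisted or manual) review rather than declaring non-compliance outright.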
arXiv Detail & Related papers (2023-12-22T16:08:32Z)
- FairDP: Certified Fairness with Differential Privacy [55.51579601325759]
This paper introduces FairDP, a novel training mechanism designed to provide group fairness certification for the trained model's decisions.
The key idea of FairDP is to train models for distinct individual groups independently, add noise to each group's gradient for data privacy protection, and integrate knowledge from group models to formulate a model that balances privacy, utility, and fairness in downstream tasks.
arXiv Detail & Related papers (2023-05-25T21:07:20Z)
- DataPerf: Benchmarks for Data-Centric AI Development [81.03754002516862]
DataPerf is a community-led benchmark suite for evaluating ML datasets and data-centric algorithms.
We provide an open, online platform with multiple rounds of challenges to support this iterative development.
The benchmarks, online evaluation platform, and baseline implementations are open source.
arXiv Detail & Related papers (2022-07-20T17:47:54Z)
- How Do Socio-Demographic Patterns Define Digital Privacy Divide? [0.5571177307684636]
Digital privacy has become an essential component of information and communications technology (ICT) systems.
There is still a gap in the digital privacy protection levels available for users.
This paper studies the digital privacy divide (DPD) problem in ICT systems.
arXiv Detail & Related papers (2022-01-20T00:59:53Z)
- What Stops Learning-based 3D Registration from Working in the Real World? [53.68326201131434]
This work identifies the sources of 3D point cloud registration failures, analyzes the reasons behind them, and proposes solutions.
Ultimately, this translates to a best-practice 3D registration network (BPNet), constituting the first learning-based method able to handle previously-unseen objects in real-world data.
Our model generalizes to real data without any fine-tuning, reaching an accuracy of up to 67% on point clouds of unseen objects obtained with a commercial sensor.
arXiv Detail & Related papers (2021-11-19T19:24:27Z)
- Explainable Patterns: Going from Findings to Insights to Support Data Analytics Democratization [60.18814584837969]
We present Explainable Patterns (ExPatt), a new framework to support lay users in exploring and creating data storytellings.
ExPatt automatically generates plausible explanations for observed or selected findings using an external (textual) source of information.
arXiv Detail & Related papers (2021-01-19T16:13:44Z)
- Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, the inability to explain decisions, and bias in training data are some of the most prominent limitations of current AI systems.
We propose the tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z)
- GDPR: When the Right to Access Personal Data Becomes a Threat [63.732639864601914]
We examine more than 300 data controllers, submitting to each a request to access personal data.
We find that 50.4% of the data controllers that handled the request have flaws in their procedure for identifying users.
This leads to the undesired and surprising result that, in its present deployment, the right of access has actually decreased the privacy of users of web services.
arXiv Detail & Related papers (2020-05-04T22:01:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.