Breaking the illusion: Automated Reasoning of GDPR Consent Violations
- URL: http://arxiv.org/abs/2512.22789v1
- Date: Sun, 28 Dec 2025 05:22:00 GMT
- Title: Breaking the illusion: Automated Reasoning of GDPR Consent Violations
- Authors: Ying Li, Wenjun Qiu, Faysal Hossain Shezan, Kunlin Cai, Michelangelo van Dam, Lisa Austin, David Lie, Yuan Tian
- Abstract summary: We present Cosmic, a novel automated framework for detecting consent-related privacy violations in web forms. Cosmic detects 3,384 violations on 94.1% of consent forms, covering key principles such as freely given consent, purpose disclosure, and withdrawal options.
- Score: 9.488261532697814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent privacy regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) have established legal requirements for obtaining user consent regarding the collection, use, and sharing of personal data. These regulations emphasize that consent must be informed, freely given, specific, and unambiguous. However, there are still many violations, which highlight a gap between legal expectations and actual implementation. Consent mechanisms embedded in functional web forms across websites play a critical role in ensuring compliance with data protection regulations such as the GDPR and CCPA, as well as in upholding user autonomy and trust. However, current research has primarily focused on cookie banners and mobile app dialogs. These forms are diverse in structure, vary in legal basis, and are often difficult to locate or evaluate, creating a significant challenge for automated consent compliance auditing. In this work, we present Cosmic, a novel automated framework for detecting consent-related privacy violations in web forms. We evaluate our tool for auditing consent compliance in web forms across 5,823 websites and 3,598 forms. Cosmic detects 3,384 violations on 94.1% of consent forms, covering key GDPR principles such as freely given consent, purpose disclosure, and withdrawal options. It achieves 98.6% and 99.1% TPR for consent and violation detection, respectively, demonstrating high accuracy and real-world applicability.
Related papers
- ReaKase-8B: Legal Case Retrieval via Knowledge and Reasoning Representations with LLMs [37.688405624086315]
A novel ReaKase-8B framework is proposed to leverage extracted legal facts, legal issues, legal relation triplets and legal reasoning for effective legal case retrieval. Experiments on two benchmark datasets from COLIEE 2022 and COLIEE 2023 demonstrate that our knowledge and reasoning augmented embeddings substantially improve retrieval performance.
arXiv Detail & Related papers (2025-10-30T06:35:36Z) - EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law [39.146761527401424]
EU-Agent-Bench is a verifiable benchmark that evaluates an agent's alignment with EU legal norms. Our benchmark spans scenarios across several categories, including data protection, bias/discrimination, and scientific integrity. We release a public preview set for the research community, while holding out a private test set to prevent data contamination.
arXiv Detail & Related papers (2025-10-24T14:48:10Z) - "Nobody should control the end user": Exploring Privacy Perspectives of Indian Internet Users in Light of DPDPA [2.6885436577568904]
We explore Indian Internet users' awareness and perceptions of cookie banners, online privacy, and privacy regulations, especially in light of the newly passed DPDPA. Our findings reveal that privacy-conscious users often lack consistent awareness of privacy mechanisms. Our study highlights the need for clearer communication regarding the DPDPA, user-centric consent mechanisms, and policy refinements to enhance data privacy practices in India.
arXiv Detail & Related papers (2025-08-25T12:22:25Z) - Context Reasoner: Incentivizing Reasoning Capability for Contextualized Privacy and Safety Compliance via Reinforcement Learning [53.92712851223158]
We formulate safety and privacy issues into contextualized compliance problems following the Contextual Integrity (CI) theory. Under the CI framework, we align our model with three critical regulatory standards: GDPR, EU AI Act, and HIPAA. We employ reinforcement learning (RL) with a rule-based reward to incentivize contextual reasoning capabilities while enhancing compliance with safety and privacy norms.
arXiv Detail & Related papers (2025-05-20T16:40:09Z) - Incorporating Legal Structure in Retrieval-Augmented Generation: A Case Study on Copyright Fair Use [44.99833362998488]
This paper presents a domain-specific implementation of Retrieval-Augmented Generation tailored to the Fair Use Doctrine in U.S. copyright law. Motivated by the increasing prevalence of DMCA takedowns and the lack of accessible legal support for content creators, we propose a structured approach that combines semantic search with legal knowledge graphs and court citation networks to improve retrieval quality and reasoning reliability.
arXiv Detail & Related papers (2025-05-04T15:53:49Z) - A Cross-Country Analysis of GDPR Cookie Banners and Flexible Methods for Scraping Them [6.533686617147407]
We examine the top 10,000 websites across 31 countries under the ePrivacy Directive and consent-observatory.eu. We show that 67% of websites use consent interfaces, but only 15% are minimally compliant, mostly because they lack a reject option. There is little evidence that regulators' guidance and fines have impacted compliance rates, but 18% of compliance variance is explained by CMPs.
arXiv Detail & Related papers (2025-03-25T13:44:26Z) - AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction [56.797874973414636]
AnnoCaseLaw is a first-of-its-kind dataset of 471 meticulously annotated U.S. Appeals Court negligence cases. Our dataset lays the groundwork for more human-aligned, explainable Legal Judgment Prediction models. Results demonstrate that LJP remains a formidable task, with application of legal precedent proving particularly difficult.
arXiv Detail & Related papers (2025-02-28T19:14:48Z) - Measuring Compliance of Consent Revocation on the Web [6.397084532913525]
No prior work has studied consent revocation on the Web. 19.87% of websites make it difficult for users to revoke consent throughout different interfaces. 20.5% of websites require more effort than acceptance, and 2.48% do not provide consent revocation at all. 57.5% of websites do not delete the cookies after consent revocation, enabling continuous illegal processing of users' data.
arXiv Detail & Related papers (2024-11-23T02:23:01Z) - SoK: The Gap Between Data Rights Ideals and Reality [42.769107967436945]
Do rights-based privacy laws effectively empower individuals over their data? This paper scrutinizes these approaches by reviewing empirical studies, news articles, and blog posts.
arXiv Detail & Related papers (2023-12-03T21:52:51Z) - NLP-based Automated Compliance Checking of Data Processing Agreements against GDPR [9.022562906627991]
We propose an automated solution to check compliance of a given DPA against the "shall" requirements.
Our approach correctly finds 618 out of 750 genuine violations while raising 76 false violations, and further correctly identifies 524 satisfied requirements.
arXiv Detail & Related papers (2022-09-20T13:50:58Z) - Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest [70.02478301291264]
Social media platforms curate access to information and opportunities, and so play a critical role in shaping public discourse.
Prior studies have used black-box methods to show that these algorithms can lead to biased or discriminatory outcomes.
We propose a new method for platform-supported auditing that can meet the goals of the proposed legislation.
arXiv Detail & Related papers (2022-07-18T17:32:35Z) - AI-enabled Automation for Completeness Checking of Privacy Policies [7.707284039078785]
In Europe, privacy policies are subject to compliance with the General Data Protection Regulation.
In this paper, we propose AI-based automation for completeness checking privacy policies.
arXiv Detail & Related papers (2021-06-10T12:10:51Z)