A Comparative Study of Sequence Classification Models for Privacy Policy
Coverage Analysis
- URL: http://arxiv.org/abs/2003.04972v1
- Date: Wed, 12 Feb 2020 21:46:22 GMT
- Authors: Zachary Lindner
- Abstract summary: Privacy policies are legal documents that describe how a website will collect, use, and distribute a user's data.
Our solution is to provide users with a coverage analysis of a given website's privacy policy using a wide range of classical machine learning and deep learning techniques.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Privacy policies are legal documents that describe how a website will
collect, use, and distribute a user's data. Unfortunately, such documents are
often overly complicated and filled with legal jargon, making it difficult for
users to fully grasp what exactly is being collected and why. Our solution to
this problem is to provide users with a coverage analysis of a given website's
privacy policy using a wide range of classical machine learning and deep
learning techniques. Given a website's privacy policy, the classifier
identifies the associated data practice for each logical segment. These data
practices/labels are taken directly from the OPP-115 corpus. For example, the
data practice "Data Retention" refers to how long a website stores a user's
information. The coverage analysis allows users to determine how many of the
ten possible data practices are covered, along with identifying the sections
that correspond to the data practices of particular interest.
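
The coverage analysis described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `coverage_analysis` and `toy_classifier` are hypothetical names, the keyword rules are a stand-in for a trained sequence classifier, and the ten label names are an assumption based on the OPP-115 category list (the abstract itself names only "Data Retention").

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Assumed label set: the ten OPP-115 data practice categories.
# Only "Data Retention" is named in the abstract; the rest are an assumption.
DATA_PRACTICES: List[str] = [
    "First Party Collection/Use",
    "Third Party Sharing/Collection",
    "User Choice/Control",
    "User Access, Edit and Deletion",
    "Data Retention",
    "Data Security",
    "Policy Change",
    "Do Not Track",
    "International and Specific Audiences",
    "Other",
]


def coverage_analysis(segments: List[str],
                      classify: Callable[[str], str]) -> Dict[str, object]:
    """Label each policy segment and report which data practices are covered."""
    by_practice: Dict[str, List[int]] = defaultdict(list)
    for index, segment in enumerate(segments):
        label = classify(segment)
        if label not in DATA_PRACTICES:
            raise ValueError(f"unknown data practice: {label!r}")
        by_practice[label].append(index)

    covered = [p for p in DATA_PRACTICES if p in by_practice]
    missing = [p for p in DATA_PRACTICES if p not in by_practice]
    return {
        "covered": covered,
        "missing": missing,
        "coverage": len(covered) / len(DATA_PRACTICES),
        "segments": dict(by_practice),  # practice -> indices of matching segments
    }


def toy_classifier(segment: str) -> str:
    """Hypothetical keyword rules standing in for a trained classifier."""
    text = segment.lower()
    if "retain" in text or "store" in text:
        return "Data Retention"
    if "third part" in text or "share" in text:
        return "Third Party Sharing/Collection"
    if "collect" in text:
        return "First Party Collection/Use"
    return "Other"


policy = [
    "We collect your email address when you sign up.",
    "We share usage data with advertising partners.",
    "We retain account records for two years.",
]
report = coverage_analysis(policy, toy_classifier)
print(report["covered"])
print(report["segments"]["Data Retention"])
```

Any model that maps a segment to one of the ten labels can be passed as `classify`, so the coverage logic stays independent of the classifier being compared.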
Related papers
- Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z)
- Automated Detection and Analysis of Data Practices Using A Real-World Corpus [20.4572759138767]
We propose an automated approach to identify and visualize data practices within privacy policies at different levels of detail.
Our approach accurately matches data practice descriptions with policy excerpts, facilitating the presentation of simplified privacy information to users.
arXiv Detail & Related papers (2024-02-16T18:51:40Z)
- FedDMF: Privacy-Preserving User Attribute Prediction using Deep Matrix Factorization [1.9181612035055007]
We propose a novel algorithm for predicting user attributes without requiring user matching.
Our approach involves training deep matrix factorization models on different clients and sharing only attribute item vectors.
This allows us to predict user attributes without sharing the user vectors themselves.
arXiv Detail & Related papers (2023-12-24T06:49:00Z)
- Privacy-Aware Document Visual Question Answering [47.89754310347398]
Document Visual Question Answering (DocVQA) is a fast growing branch of document understanding.
Despite the fact that documents contain sensitive or copyrighted information, none of the current DocVQA methods offers strong privacy guarantees.
We highlight privacy issues in state-of-the-art multi-modal LLMs used for DocVQA, and explore possible solutions.
arXiv Detail & Related papers (2023-12-15T06:30:55Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- PolicyGPT: Automated Analysis of Privacy Policies with Large Language Models [41.969546784168905]
In practical use, users tend to click the Agree button directly rather than reading privacy policies carefully.
This practice exposes users to risks of privacy leakage and legal issues.
Recently, the advent of Large Language Models (LLM) such as ChatGPT and GPT-4 has opened new possibilities for text analysis.
arXiv Detail & Related papers (2023-09-19T01:22:42Z)
- Transparency in App Analytics: Analyzing the Collection of User Interaction Data [0.0]
We conducted an analysis of the top 20 analytic libraries for Android apps to identify common practices of interaction data collection.
We developed a standardized collection claim template for summarizing an app's data collection practices.
arXiv Detail & Related papers (2023-06-20T11:01:27Z)
- Protecting User Privacy in Online Settings via Supervised Learning [69.38374877559423]
We design an intelligent approach to online privacy protection that leverages supervised learning.
By detecting and blocking data collection that might infringe on a user's privacy, we can restore a degree of digital privacy to the user.
arXiv Detail & Related papers (2023-04-06T05:20:16Z)
- Intent Classification and Slot Filling for Privacy Policies [34.606121042708864]
PolicyIE is a corpus consisting of 5,250 intent and 11,788 slot annotations spanning 31 privacy policies of websites and mobile applications.
We present two alternative neural approaches as baselines: (1) formulating intent classification and slot filling as a joint sequence tagging and (2) modeling them as a sequence-to-sequence learning task.
arXiv Detail & Related papers (2021-01-01T00:44:41Z)
- PolicyQA: A Reading Comprehension Dataset for Privacy Policies [77.79102359580702]
We present PolicyQA, a dataset that contains 25,017 reading comprehension style examples curated from an existing corpus of 115 website privacy policies.
We evaluate two existing neural QA models and perform rigorous analysis to reveal the advantages and challenges offered by PolicyQA.
arXiv Detail & Related papers (2020-10-06T09:04:58Z)