Exploring User Privacy Awareness on GitHub: An Empirical Study
- URL: http://arxiv.org/abs/2409.04048v2
- Date: Tue, 10 Sep 2024 09:35:53 GMT
- Title: Exploring User Privacy Awareness on GitHub: An Empirical Study
- Authors: Costanza Alfieri, Juri Di Rocco, Paola Inverardi, Phuong T. Nguyen
- Abstract summary: GitHub provides developers with a practical way to distribute source code and collaborate on common projects.
To enhance account security and privacy, GitHub allows its users to manage access permissions, review audit logs, and enable two-factor authentication.
Despite these efforts, the platform still faces various issues related to the privacy of its users.
- Score: 5.822284390235265
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: GitHub provides developers with a practical way to distribute source code and collaboratively work on common projects. To enhance account security and privacy, GitHub allows its users to manage access permissions, review audit logs, and enable two-factor authentication. However, despite the endless effort, the platform still faces various issues related to the privacy of its users. This paper presents an empirical study delving into the GitHub ecosystem. Our focus is on investigating the utilization of privacy settings on the platform and identifying various types of sensitive information disclosed by users. Leveraging a dataset comprising 6,132 developers, we report and analyze their activities by means of comments on pull requests. Our findings indicate an active engagement by users with the available privacy settings on GitHub. Notably, we observe the disclosure of different forms of private information within pull request comments. This observation has prompted our exploration into sensitivity detection using a large language model and BERT, to pave the way for a personalized privacy assistant. Our work provides insights into the utilization of existing privacy protection tools, such as privacy settings, along with their inherent limitations. Essentially, we aim to advance research in this field by providing both the motivation for creating such privacy protection tools and a proposed methodology for personalizing them.
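To make the sensitivity-detection direction concrete, below is a minimal sketch of how a BERT-style classifier could be run over pull request comments. The checkpoint name, the two-label setup, and the example comments are illustrative assumptions for this sketch (a real classifier would first be fine-tuned on comments annotated as sensitive or non-sensitive); it is not the authors' released model.

```python
# Minimal, illustrative sketch of BERT-based sensitivity detection over pull
# request comments. The checkpoint and the label convention are assumptions,
# not the paper's artifacts; fine-tuning on annotated comments is assumed to
# have happened before inference.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # generic checkpoint used as a stand-in

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

comments = [
    "LGTM, merging once CI is green.",
    "Ping me on my personal phone if the nightly deploy breaks.",
]

with torch.no_grad():
    batch = tokenizer(comments, padding=True, truncation=True, return_tensors="pt")
    probs = torch.softmax(model(**batch).logits, dim=-1)

for comment, p in zip(comments, probs):
    # Index 1 is treated as the "sensitive" class purely by convention here.
    print(f"{p[1]:.2f} sensitive-probability :: {comment}")
```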
Related papers
- A Large-Scale Privacy Assessment of Android Third-Party SDKs [17.245330733308375]
Third-party Software Development Kits (SDKs) are widely adopted in Android app development.
This convenience raises substantial concerns about unauthorized access to users' privacy-sensitive information.
Our study offers a targeted analysis of user privacy protection among Android third-party SDKs.
arXiv Detail & Related papers (2024-09-16T15:44:43Z) - PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.
We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.
State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
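As a rough illustration of what counting such leaks can involve, the snippet below performs a simplified substring check against a seeded secret; this helper is hypothetical and not part of the PrivacyLens pipeline, which handles paraphrases and richer agent trajectories.

```python
# Hypothetical, simplified leakage check: does a model reply reproduce a seeded
# secret despite a privacy-enhancing instruction? Real evaluations must also
# catch paraphrased or partial disclosures.
def leaks_secret(model_reply: str, secret: str) -> bool:
    return secret.lower() in model_reply.lower()

reply = "Sure! Alice mentioned she is interviewing at another company next week."
print(leaks_secret(reply, secret="interviewing at another company"))  # True -> counted as a leak
```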
arXiv Detail & Related papers (2024-08-29T17:58:38Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
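For context, the 'almost indistinguishable' guarantee paraphrased above is usually formalized as (epsilon, delta)-differential privacy; in the user-level variant, the neighbouring datasets differ in one user's entire contribution. The formulation below is the standard textbook definition rather than a quotation from this paper.

```latex
% (epsilon, delta)-differential privacy at the user level: for any two datasets
% D and D' that differ in the data of a single user, and any set of outcomes S,
\[
  \Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,] + \delta ,
\]
% where M is the randomized training mechanism, epsilon bounds the privacy loss,
% and delta permits a small failure probability.
```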
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - A New Hope: Contextual Privacy Policies for Mobile Applications and An Approach Toward Automated Generation [19.578130824867596]
The aim of contextual privacy policies (CPPs) is to fragment privacy policies into concise snippets, displaying them only within the corresponding contexts of the application's graphical user interfaces (GUIs).
In this paper, we first formulate CPP in the mobile application scenario, and then present a novel multimodal framework, named SeePrivacy, specifically designed to automatically generate CPPs for mobile applications.
A human evaluation shows that 77% of the extracted privacy policy segments were perceived as well-aligned with the detected contexts.
arXiv Detail & Related papers (2024-02-22T13:32:33Z) - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models reveal private information in contexts that humans would not, 39% and 57% of the time.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z) - Mining User Privacy Concern Topics from App Reviews [10.776958968245589]
An increasing number of users are voicing their privacy concerns through app reviews on App stores.
The main challenge in effectively mining privacy concerns from user reviews is that reviews expressing such concerns are drowned out by a much larger number of reviews covering generic themes and noisy content.
In this work, we propose a novel automated approach to overcome that challenge.
arXiv Detail & Related papers (2022-12-19T08:07:27Z) - Privacy Explanations - A Means to End-User Trust [64.7066037969487]
We investigated how explainability might help to build end-user trust.
We created privacy explanations that aim to clarify to end users why and for what purposes specific data is required.
Our findings reveal that privacy explanations can be an important step towards increasing trust in software systems.
arXiv Detail & Related papers (2022-10-18T09:30:37Z) - SPAct: Self-supervised Privacy Preservation for Action Recognition [73.79886509500409]
Existing approaches for mitigating privacy leakage in action recognition require privacy labels along with the action labels from the video dataset.
Recent developments in self-supervised learning (SSL) have unleashed the untapped potential of unlabeled data.
We present a novel training framework which removes privacy information from input video in a self-supervised manner without requiring privacy labels.
arXiv Detail & Related papers (2022-03-29T02:56:40Z) - Statistical Feature-based Personal Information Detection in Mobile Network Traffic [13.568975395946433]
In this paper, statistical features are designed to capture the occurrence patterns of personal information in network traffic.
A detector is then trained with machine learning algorithms to discover potential personal information exhibiting similar patterns.
As far as we know, this is the first work that detects personal information based on statistical features.
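A minimal sketch of this idea is given below, assuming simple occurrence-pattern features (recurrence rate, typical byte offset, positional stability) and a generic scikit-learn classifier; the paper's actual features and learning algorithms may differ.

```python
# Illustrative sketch (not the paper's exact feature set): describe each candidate
# value observed in traffic by simple occurrence-pattern statistics and train a
# generic classifier to separate personal from non-personal values.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def occurrence_features(positions, flow_count):
    """positions: byte offsets at which the candidate value appeared;
    flow_count: number of flows inspected. Returns a small feature vector."""
    positions = np.asarray(positions, dtype=float)
    return np.array([
        len(positions) / max(flow_count, 1),          # how often the value recurs
        positions.mean() if len(positions) else 0.0,  # typical location in the payload
        positions.std() if len(positions) else 0.0,   # positional stability across flows
    ])

# Toy training data: feature vectors plus labels (1 = personal information).
X = np.stack([
    occurrence_features([120, 118, 121], flow_count=3),  # stable, recurring -> likely an identifier
    occurrence_features([40], flow_count=3),             # one-off value
])
y = np.array([1, 0])

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict([occurrence_features([119, 122], flow_count=2)]))
```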
arXiv Detail & Related papers (2021-12-23T04:01:16Z) - PGLP: Customizable and Rigorous Location Privacy through Policy Graph [68.3736286350014]
We propose a new location privacy notion called PGLP, which provides a rich interface to release private locations with customizable and rigorous privacy guarantees.
Specifically, we formalize a user's location privacy requirements using a location policy graph, which is expressive and customizable.
We also design a private location trace release framework that pipelines the detection of location exposure, policy graph repair, and private trajectory release with customizable and rigorous location privacy.
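Read in the spirit of policy-graph privacy frameworks (an interpretation, not the paper's formal definition), a location policy graph can be sketched as nodes for discretized locations and edges for pairs of locations that the released output should keep indistinguishable:

```python
# Hypothetical location policy graph: nodes are coarse locations, and an edge
# between two nodes expresses that they should remain indistinguishable in the
# released (noisy) location. The semantics here are an assumption for illustration.
location_policy_graph = {
    "home":        {"coffee_shop"},
    "coffee_shop": {"home", "office"},
    "office":      {"coffee_shop"},
    "hospital":    set(),  # no edges: the policy does not pair it with other nodes
}

def must_be_indistinguishable(a: str, b: str) -> bool:
    """Does the policy require locations a and b to be protected as a pair?"""
    return b in location_policy_graph.get(a, set())

print(must_be_indistinguishable("home", "coffee_shop"))  # True
print(must_be_indistinguishable("home", "hospital"))     # False
```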
arXiv Detail & Related papers (2020-05-04T04:25:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.