Related papers: Transparency in App Analytics: Analyzing the Collection of User Interaction Data

URL: http://arxiv.org/abs/2306.11447v1
Date: Tue, 20 Jun 2023 11:01:27 GMT
Title: Transparency in App Analytics: Analyzing the Collection of User Interaction Data
Authors: Feiyang Tang, Bjarte M. Østvold
Abstract summary: We conducted an analysis of the top 20 analytic libraries for Android apps to identify common practices of interaction data collection. We developed a standardized collection claim template for summarizing an app's data collection practices.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: The rise of mobile apps has brought greater convenience and many options for users. However, many apps use analytics services to collect a wide range of user interaction data, with privacy policies often failing to reveal the types of interaction data collected or the extent of the data collection practices. This lack of transparency potentially breaches data protection laws and also undermines user trust. We conducted an analysis of the top 20 analytic libraries for Android apps to identify common practices of interaction data collection and used this information to develop a standardized collection claim template for summarizing an app's data collection practices with respect to user interaction data. We selected the top 100 apps from popular categories on Google Play and used automatic static analysis to extract collection evidence from their data collection implementations. Our analysis found that a significant majority of these apps actively collected interaction data from UI types such as View (89%), Button (76%), and Textfield (63%), highlighting the pervasiveness of user interaction data collection. By comparing the collection evidence to the claims derived from privacy policy analysis, we manually fact-checked the completeness and accuracy of these claims for the top 10 apps. We found that, except for one app, they all failed to declare all types of interaction data they collect and did not specify some of the collection techniques used.

Related papers

SessionIntentBench: A Multi-task Inter-session Intention-shift Modeling Benchmark for E-commerce Customer Behavior Understanding [64.45047674586671]
We introduce the concept of an intention tree and propose a dataset curation pipeline. We construct a sibling multimodal benchmark, SessionIntentBench, that evaluates L(V)LMs' capability on understanding inter-session intention shift. With 1,952,177 intention entries, 1,132,145 session intention trajectories, and 13,003,664 available tasks mined using 10,905 sessions, we provide a scalable way to exploit the existing session data.
arXiv Detail & Related papers (2025-07-27T09:04:17Z)
Mind the Gap! Static and Interactive Evaluations of Large Audio Models [55.87220295533817]
Large Audio Models (LAMs) are designed to power voice-native experiences. This study introduces an interactive approach to evaluate LAMs and collect 7,500 LAM interactions from 484 participants.
arXiv Detail & Related papers (2025-02-21T20:29:02Z)
Do Android App Developers Accurately Report Collection of Privacy-Related Data? [5.863391019411233]
The European Union's General Data Protection Regulation requires vendors to faithfully disclose how their apps collect data. Many Android apps use third-party code for which the same information is not readily available. We first expose a multi-layered definition of privacy-related data to correctly report data collection in Android apps. We then create a dataset of privacy-sensitive data classes that may be used as input by an Android app.
arXiv Detail & Related papers (2024-09-06T10:05:45Z)
Data Exposure from LLM Apps: An In-depth Investigation of OpenAI's GPTs [17.433387980578637]
This paper aims to bring transparency in data practices of LLM apps. We study OpenAI's GPT app ecosystem. We find that Actions collect expansive data about users, including sensitive information prohibited by OpenAI, such as passwords.
arXiv Detail & Related papers (2024-08-23T17:42:06Z)
User Interaction Data in Apps: Comparing Policy Claims to Implementations [0.0]
We analyzed the top 100 apps across diverse categories using static analysis methods to evaluate the alignment between policy claims and implemented data collection techniques. Our findings highlight the lack of transparency in data collection and the associated risk of re-identification, raising concerns about user privacy and trust.
arXiv Detail & Related papers (2023-12-05T12:11:11Z)
Going beyond research datasets: Novel intent discovery in the industry setting [60.90117614762879]
This paper proposes methods to improve the intent discovery pipeline deployed in a large e-commerce platform. We show the benefit of pre-training language models on in-domain data: both self-supervised and with weak supervision. We also devise the best method to utilize the conversational structure (i.e., question and answer) of real-life datasets during fine-tuning for clustering tasks, which we call Conv.
arXiv Detail & Related papers (2023-05-09T14:21:29Z)
Federated Privacy-preserving Collaborative Filtering for On-Device Next App Prediction [52.16923290335873]
We propose a novel SeqMF model to solve the problem of predicting the next app launch during mobile device usage. We modify the structure of the classical matrix factorization model and update the training procedure to sequential learning. One more ingredient of the proposed approach is a new privacy mechanism that guarantees the protection of the sent data from the users to the remote server.
arXiv Detail & Related papers (2023-02-05T10:29:57Z)
Resolving Uncertain Case Identifiers in Interaction Logs: A User Study [0.4014524824655105]
We propose a neural network-based technique to determine a case notion for click data. We validate its efficacy through a user study based on the segmented event log resulting from interaction data of a mobility sharing company.
arXiv Detail & Related papers (2022-11-21T16:13:04Z)
Finding Facial Forgery Artifacts with Parts-Based Detectors [73.08584805913813]
We design a series of forgery detection systems that each focus on one individual part of the face. We use these detectors to perform detailed empirical analysis on the FaceForensics++, Celeb-DF, and Facebook Deepfake Detection Challenge datasets.
arXiv Detail & Related papers (2021-09-21T16:18:45Z)
Simple multi-dataset detection [83.9604523643406]
We present a simple method for training a unified detector on multiple large-scale datasets. We show how to automatically integrate dataset-specific outputs into a common semantic taxonomy. Our approach does not require manual taxonomy reconciliation.
arXiv Detail & Related papers (2021-02-25T18:55:58Z)
Dynamic Graph Collaborative Filtering [64.87765663208927]
Dynamic recommendation is essential for recommender systems to provide real-time predictions based on sequential data. Here we propose Dynamic Graph Collaborative Filtering (DGCF), a novel framework leveraging dynamic graphs to capture collaborative and sequential relations. Our approach achieves higher performance when the dataset contains less action repetition, indicating the effectiveness of integrating dynamic collaborative information.
arXiv Detail & Related papers (2021-01-08T04:16:24Z)
Disentangled Graph Collaborative Filtering [100.26835145396782]
Disentangled Graph Collaborative Filtering (DGCF) is a new model for learning informative representations of users and items from interaction data. By modeling a distribution over intents for each user-item interaction, we iteratively refine the intent-aware interaction graphs and representations. DGCF achieves significant improvements over several state-of-the-art models like NGCF, DisenGCN, and MacridVAE.
arXiv Detail & Related papers (2020-07-03T15:37:25Z)
A Comparative Study of Sequence Classification Models for Privacy Policy Coverage Analysis [0.0]
Privacy policies are legal documents that describe how a website will collect, use, and distribute a user's data. Our solution is to provide users with a coverage analysis of a given website's privacy policy using a wide range of classical machine learning and deep learning techniques.
arXiv Detail & Related papers (2020-02-12T21:46:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.