PersonaBench: Evaluating AI Models on Understanding Personal Information through Accessing (Synthetic) Private User Data
- URL: http://arxiv.org/abs/2502.20616v1
- Date: Fri, 28 Feb 2025 00:43:35 GMT
- Title: PersonaBench: Evaluating AI Models on Understanding Personal Information through Accessing (Synthetic) Private User Data
- Authors: Juntao Tan, Liangwei Yang, Zuxin Liu, Zhiwei Liu, Rithesh Murthy, Tulika Manoj Awalgaonkar, Jianguo Zhang, Weiran Yao, Ming Zhu, Shirley Kokane, Silvio Savarese, Huan Wang, Caiming Xiong, Shelby Heinecke
- Abstract summary: Personalization is critical in AI assistants, particularly in the context of private AI models that work with individual users. Due to the sensitive nature of such data, there are no publicly available datasets that allow us to assess an AI model's ability to understand users. We introduce a synthetic data generation pipeline that creates diverse, realistic user profiles and private documents simulating human activities.
- Score: 76.21047984886273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Personalization is critical in AI assistants, particularly in the context of private AI models that work with individual users. A key scenario in this domain involves enabling AI models to access and interpret a user's private data (e.g., conversation history, user-AI interactions, app usage) to understand personal details such as biographical information, preferences, and social connections. However, due to the sensitive nature of such data, there are no publicly available datasets that allow us to assess an AI model's ability to understand users through direct access to personal information. To address this gap, we introduce a synthetic data generation pipeline that creates diverse, realistic user profiles and private documents simulating human activities. Leveraging this synthetic data, we present PersonaBench, a benchmark designed to evaluate AI models' performance in understanding personal information derived from simulated private user data. We evaluate Retrieval-Augmented Generation (RAG) pipelines using questions directly related to a user's personal information, supported by the relevant private documents provided to the models. Our results reveal that current retrieval-augmented AI models struggle to answer private questions by extracting personal information from user documents, highlighting the need for improved methodologies to enhance personalization capabilities in AI.
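As a rough illustration of the evaluation setup described in the abstract, the following minimal sketch wires a toy retriever and an answer check into a PersonaBench-style accuracy loop. The documents, questions, retriever, and scoring rule are all illustrative placeholders, not the benchmark's actual data, models, or metric.

```python
# Minimal sketch of a RAG-style personal-question evaluation loop.
# All data and scoring below are illustrative, not PersonaBench's own.

def retrieve(query, documents, k=2):
    """Rank documents by naive word-overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def answer(query, context):
    """Stand-in for an LLM call: echo the retrieved context."""
    return " ".join(context)

def evaluate(questions, documents):
    """Score each question by whether the gold answer string
    appears in the retrieval-grounded response."""
    correct = 0
    for q in questions:
        context = retrieve(q["question"], documents)
        response = answer(q["question"], context)
        if q["gold"].lower() in response.lower():
            correct += 1
    return correct / len(questions)

# Synthetic private documents simulating user activity.
docs = [
    "Calendar: dinner with Maria every Friday at 7pm.",
    "Chat log: I moved to Lisbon last spring for a new job.",
]
questions = [
    {"question": "Which city does the user live in?", "gold": "Lisbon"},
    {"question": "Who does the user have dinner with on Fridays?", "gold": "Maria"},
]

print(evaluate(questions, docs))  # fraction of questions answered correctly
```

In the benchmark proper, the `answer` step would be a real language model and the metric more robust than substring matching; the sketch only shows how retrieval quality directly bounds what personal details the model can recover.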
Related papers
- Reimagining Personal Data: Unlocking the Potential of AI-Generated Images in Personal Data Meaning-Making [7.8651914932018405]
Image-generative AI provides new opportunities to transform personal data into alternative visual forms. In this paper, we illustrate the potential of AI-generated images in facilitating meaningful engagement with personal data.
arXiv Detail & Related papers (2025-02-26T05:50:57Z) - Personalized Graph-Based Retrieval for Large Language Models [51.7278897841697]
We propose a framework that leverages user-centric knowledge graphs to enrich personalization. By directly integrating structured user knowledge into the retrieval process and augmenting prompts with user-relevant context, PGraph enhances contextual understanding and output quality. We also introduce the Personalized Graph-based Benchmark for Text Generation, designed to evaluate personalized text generation tasks in real-world settings where user history is sparse or unavailable.
arXiv Detail & Related papers (2025-01-04T01:46:49Z) - Bridging Personalization and Control in Scientific Personalized Search [53.7152408217116]
We introduce a model for personalized search that enables users to control personalized rankings proactively.
Our model, CtrlCE, is a novel cross-encoder model augmented with an editable memory built from users' historical interactions.
arXiv Detail & Related papers (2024-11-05T03:55:25Z) - Survey of User Interface Design and Interaction Techniques in Generative AI Applications [79.55963742878684]
We aim to create a compendium of different user-interaction patterns that can be used as a reference for designers and developers alike.
We also strive to lower the entry barrier for those attempting to learn more about the design of generative AI applications.
arXiv Detail & Related papers (2024-10-28T23:10:06Z) - CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data [7.357348564300953]
CI-Bench is a comprehensive benchmark for evaluating the ability of AI assistants to protect personal information during model inference.
We present a novel, scalable, multi-step data pipeline for generating natural communications, including dialogues and emails.
We formulate and evaluate a naive AI assistant to demonstrate the need for further study and careful training towards personal assistant tasks.
arXiv Detail & Related papers (2024-09-20T21:14:36Z) - Chatting Up Attachment: Using LLMs to Predict Adult Bonds [0.0]
We use GPT-4 and Claude 3 Opus to create agents that simulate adults with varying profiles, childhood memories, and attachment styles.
We evaluate our models using a transcript dataset from 9 humans who underwent the same interview protocol, analyzed and labeled by mental health professionals.
Our findings indicate that training the models using only synthetic data achieves performance comparable to training the models on human data.
arXiv Detail & Related papers (2024-08-31T04:29:19Z) - Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z) - Participatory Personalization in Classification [8.234011679612436]
We introduce a family of classification models, called participatory systems, that let individuals opt into personalization at prediction time.
We conduct a comprehensive empirical study of participatory systems in clinical prediction tasks, benchmarking them with common approaches for personalization and imputation.
Our results demonstrate that participatory systems can facilitate and inform consent while improving performance and data use across all groups who report personal data.
arXiv Detail & Related papers (2023-02-08T04:24:19Z) - Privacy-Preserving Machine Learning for Collaborative Data Sharing via Auto-encoder Latent Space Embeddings [57.45332961252628]
Privacy-preserving machine learning in data-sharing processes is an ever-critical task.
This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data.
arXiv Detail & Related papers (2022-11-10T17:36:58Z) - Differentially Private Language Models for Secure Data Sharing [19.918137395199224]
In this paper, we show how to train a generative language model in a differentially private manner and subsequently sample data from it.
Using natural language prompts and a new prompt-mismatch loss, we are able to create highly accurate and fluent textual datasets.
We perform thorough experiments indicating that our synthetic datasets do not leak information from our original data and are of high language quality.
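The core mechanism behind differentially private training of this kind is per-example gradient clipping followed by calibrated Gaussian noise (DP-SGD). The sketch below implements only that aggregation step in pure Python, with illustrative function names and parameter values; it is not the paper's actual training procedure.

```python
import math
import random

def clip(v, c):
    """Clip vector v to have L2 norm at most c."""
    norm = math.sqrt(sum(x * x for x in v))
    scale = min(1.0, c / norm) if norm > 0 else 1.0
    return [x * scale for x in v]

def dp_average(per_sample_grads, clip_norm, noise_multiplier):
    """One DP-SGD aggregation step: clip each per-sample gradient,
    sum, add Gaussian noise scaled to the clipping norm, average."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    clipped = [clip(g, clip_norm) for g in per_sample_grads]
    total = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    noisy = [t + random.gauss(0.0, sigma) for t in total]
    return [x / n for x in noisy]

# With noise_multiplier = 0 this reduces to a plain clipped mean;
# real DP guarantees require a positive multiplier plus privacy
# accounting across training steps, which is omitted here.
grads = [[3.0, 4.0], [0.0, 0.0]]
print(dp_average(grads, clip_norm=5.0, noise_multiplier=0.0))
```

Because each example's contribution is bounded by the clipping norm, the added noise masks any single example's influence, which is what limits memorization of the original training data.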
arXiv Detail & Related papers (2022-10-25T11:12:56Z) - Towards Personalized Answer Generation in E-Commerce via Multi-Perspective Preference Modeling [62.049330405736406]
Product Question Answering (PQA) on E-Commerce platforms has attracted increasing attention as it can act as an intelligent online shopping assistant.
Providing the same "completely summarized" answer to all customers is insufficient, since many customers prefer personalized answers containing information customized for themselves.
We propose a novel multi-perspective user preference model for generating personalized answers in PQA.
arXiv Detail & Related papers (2021-12-27T07:51:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.