Step-Back Profiling: Distilling User History for Personalized Scientific Writing
- URL: http://arxiv.org/abs/2406.14275v2
- Date: Thu, 11 Jul 2024 07:29:12 GMT
- Title: Step-Back Profiling: Distilling User History for Personalized Scientific Writing
- Authors: Xiangru Tang, Xingyao Zhang, Yanjun Shao, Jie Wu, Yilun Zhao, Arman Cohan, Ming Gong, Dongmei Zhang, Mark Gerstein,
- Abstract summary: Large language models (LLMs) excel at a variety of natural language processing tasks, yet they struggle to generate personalized content for individuals.
We introduce STEP-BACK PROFILING to personalize LLMs by distilling user history into concise profiles.
Our approach outperforms the baselines by up to 3.6 points on the general personalization benchmark.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) excel at a variety of natural language processing tasks, yet they struggle to generate personalized content for individuals, particularly in real-world scenarios like scientific writing. To address this challenge, we introduce STEP-BACK PROFILING to personalize LLMs by distilling user history into concise profiles that capture essential traits and preferences of users. To conduct our experiments, we construct a Personalized Scientific Writing (PSW) dataset to study multi-user personalization. PSW requires models to write scientific papers for specified author groups with diverse academic backgrounds. Our results demonstrate the effectiveness of capturing user characteristics via STEP-BACK PROFILING for collaborative writing. Moreover, our approach outperforms the baselines by up to 3.6 points on the general personalization benchmark (LaMP), which comprises 7 personalized LLM tasks. Our ablation studies validate the contributions of the different components of our method and provide insights into our task definition. Our dataset and code are available at https://github.com/gersteinlab/step-back-profiling.
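The two-stage idea in the abstract — first distill a user's document history into a concise profile, then condition generation on that profile — can be illustrated with a minimal sketch. This is not the authors' implementation: the paper prompts an LLM to abstract traits and preferences, whereas here a simple keyword summary stands in for the distilled profile, and the function names (`distill_profile`, `build_personalized_prompt`) are hypothetical.

```python
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "of", "and", "for", "to", "in", "on",
             "with", "we", "our", "is", "are"}

def distill_profile(history, top_k=5):
    """Distill a user's document history into a concise profile.

    Here the 'profile' is just the most frequent content words; the
    paper instead prompts an LLM to abstract user traits and preferences.
    """
    words = []
    for doc in history:
        words += [w for w in re.findall(r"[a-z]+", doc.lower())
                  if w not in STOPWORDS]
    return [w for w, _ in Counter(words).most_common(top_k)]

def build_personalized_prompt(task, profile):
    """Prepend the distilled profile to the task instruction, standing in
    for conditioning the LLM on the user profile."""
    return f"User profile: {', '.join(profile)}.\nTask: {task}"

history = [
    "We study graph neural networks for molecule property prediction.",
    "Graph neural networks scale to large molecule datasets.",
]
profile = distill_profile(history, top_k=3)
prompt = build_personalized_prompt("Draft a related-work paragraph.", profile)
print(prompt)
```

The design point the sketch preserves is that the profile, not the full history, enters the prompt — keeping the personalization context short regardless of how much history a user has.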
Related papers
- Optimizing Data Delivery: Insights from User Preferences on Visuals, Tables, and Text [59.68239795065175]
We conduct a user study where users are shown a question and asked what they would prefer to see.
We use the data to establish that a user's personal traits do influence the data outputs they prefer.
arXiv Detail & Related papers (2024-11-12T00:24:31Z) - Personalization of Large Language Models: A Survey [131.00650432814268]
Personalization of Large Language Models (LLMs) has recently become increasingly important with a wide range of applications.
Most existing works on personalized LLMs have focused either entirely on (a) personalized text generation or (b) leveraging LLMs for personalization-related downstream applications, such as recommendation systems.
We introduce a taxonomy for personalized LLM usage and summarize the key differences and challenges.
arXiv Detail & Related papers (2024-10-29T04:01:11Z) - PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models [3.516029765200171]
We propose a high-quality, personalized, manually annotated abstractive summarization dataset called PersonalSum.
This dataset is the first to investigate whether the focus of public readers differs from the generic summaries generated by Large Language Models.
arXiv Detail & Related papers (2024-10-04T20:12:39Z) - Guided Profile Generation Improves Personalization with LLMs [3.2685922749445617]
In modern commercial systems, including Recommendation, Ranking, and E-Commerce platforms, there is a trend towards incorporating personalization context as input into Large Language Models (LLMs).
We propose Guided Profile Generation (GPG), a general method designed to generate personal profiles in natural language.
Our experimental results show that GPG improves LLMs' personalization ability across different tasks; for example, it yields a 37% accuracy improvement in predicting personal preferences compared to directly feeding the LLMs raw personal context.
arXiv Detail & Related papers (2024-09-19T21:29:56Z) - LLMs + Persona-Plug = Personalized LLMs [41.60364110693824]
Personalization plays a critical role in numerous language tasks and applications, since users with the same requirements may prefer diverse outputs based on their individual interests.
This has led to the development of various personalized approaches aimed at adapting large language models (LLMs) to generate customized outputs aligned with user preferences.
We propose a novel personalized LLM approach, Persona-Plug, which constructs a user-specific embedding for each individual by modeling all of their historical contexts through a lightweight plug-in user embedder module.
arXiv Detail & Related papers (2024-09-18T11:54:45Z) - Evaluating Large Language Model based Personal Information Extraction and Countermeasures [63.91918057570824]
Large language model (LLM) can be misused by attackers to accurately extract various personal information from personal profiles.
LLMs outperform conventional methods at such extraction.
Prompt injection can mitigate such risks to a large extent and outperforms conventional countermeasures.
arXiv Detail & Related papers (2024-08-14T04:49:30Z) - PerPLM: Personalized Fine-tuning of Pretrained Language Models via Writer-specific Intermediate Learning and Prompts [16.59511985633798]
Pretrained language models (PLMs) are powerful tools for capturing context.
PLMs are typically pretrained and fine-tuned for universal use across different writers.
This study aims to improve the accuracy of text understanding tasks by personalizing the fine-tuning of PLMs for specific writers.
arXiv Detail & Related papers (2023-09-14T14:03:48Z) - LaMP: When Large Language Models Meet Personalization [35.813652110400064]
This paper highlights the importance of personalization in large language models and introduces the LaMP benchmark.
LaMP is a novel benchmark for training and evaluating language models for producing personalized outputs.
arXiv Detail & Related papers (2023-04-22T13:42:04Z) - Unsupervised Neural Stylistic Text Generation using Transfer learning and Adapters [66.17039929803933]
We propose a novel transfer-learning framework that updates only 0.3% of model parameters to learn style-specific attributes for response generation.
We learn style-specific attributes from the PERSONALITY-CAPTIONS dataset.
arXiv Detail & Related papers (2022-10-07T00:09:22Z) - PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, register, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model fit to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.