Instruction-aware User Embedding via Synergistic Language and Representation Modeling
- URL: http://arxiv.org/abs/2510.11016v1
- Date: Mon, 13 Oct 2025 05:15:34 GMT
- Title: Instruction-aware User Embedding via Synergistic Language and Representation Modeling
- Authors: Ziyi Gao, Yike Xu, Jiahao Yuan, Baokun Wang, Jinyong Wen, Xiaotong Lin, Yun Liu, Xing Fu, Yu Cheng, Yongchao Liu, Weiqiang Wang, Zhongle Xie
- Abstract summary: InstructUE is an instruction-aware user embedding foundation model that generates general and instruction-aware user representations. We show that InstructUE significantly outperforms existing methods across multiple domains including user prediction, marketing, and recommendation scenarios.
- Score: 28.30329175937291
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: User representation modeling has become increasingly crucial for personalized applications, yet existing approaches struggle with generalizability across domains and sensitivity to noisy behavioral signals. We present InstructUE, an instruction-aware user embedding foundation model that leverages large language models (LLMs) to generate general and instruction-aware user representations. InstructUE introduces a multi-encoder architecture with a lightweight adapter that efficiently processes heterogeneous data from six different sources while preserving their structural characteristics. Additionally, it proposes a novel contrastive-autoregressive training framework that bridges language and representation spaces through a curated UserQA dataset. The contrastive-autoregressive training framework simultaneously leverages autoregressive learning to capture domain knowledge in language space and contrastive learning to align user-text embeddings in representation space, thereby enhancing the instruction-awareness and noise-robustness of user embeddings. Through extensive experiments on real-world applications, we demonstrate that InstructUE significantly outperforms existing methods across multiple domains including user prediction, marketing, and recommendation scenarios. Our results show that instruction-aware user modeling can effectively achieve instruction-guided denoising of user information in specific scenarios, paving the way for more generalizable and robust user representation learning.
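The abstract describes a training objective that pairs autoregressive learning in language space with contrastive learning in representation space. The abstract does not give the exact formulation, so the following is only a minimal sketch of how such a joint objective is commonly built: a next-token negative log-likelihood plus a symmetric in-batch InfoNCE term. The function names, the weight `alpha`, and the `temperature` default are illustrative assumptions, not details taken from the paper.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(user_embs, text_embs, temperature=0.07):
    """Symmetric in-batch contrastive loss.

    Row i of user_embs is assumed to be the positive match for row i of
    text_embs; every other row in the batch acts as a negative.
    """
    users = [normalize(u) for u in user_embs]
    texts = [normalize(t) for t in text_embs]
    n = len(users)
    sims = [[sum(a * b for a, b in zip(u, t)) / temperature for t in texts]
            for u in users]
    user_to_text = -sum(log_softmax(sims[i])[i] for i in range(n)) / n
    cols = [[sims[i][j] for i in range(n)] for j in range(n)]
    text_to_user = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (user_to_text + text_to_user)


def next_token_nll(token_logits, target_ids):
    """Mean negative log-likelihood of the target token at each position."""
    total = sum(-log_softmax(logits)[t]
                for logits, t in zip(token_logits, target_ids))
    return total / len(target_ids)


def joint_loss(user_embs, text_embs, token_logits, target_ids, alpha=1.0):
    """Autoregressive term plus alpha times the contrastive term.

    alpha is a hypothetical balancing weight chosen for illustration.
    """
    return (next_token_nll(token_logits, target_ids)
            + alpha * info_nce(user_embs, text_embs))
```

In this sketch, a batch whose user and text embeddings are already aligned yields a contrastive term near zero, while a batch with mismatched pairs is penalized heavily, which is the behavior an embedding-alignment objective is meant to have.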
Related papers
- RecoWorld: Building Simulated Environments for Agentic Recommender Systems [55.979427290369216]
We present RecoWorld, a blueprint for building simulated environments tailored to agentic recommender systems. A user simulator reviews recommended items, updates its mindset, and when sensing potential user disengagement, generates reflective instructions. The agentic recommender adapts its recommendations by incorporating these user instructions and reasoning traces, creating a dynamic feedback loop.
arXiv Detail & Related papers (2025-09-12T16:44:34Z)
- Spotlight Your Instructions: Instruction-following with Dynamic Attention Steering [5.160554120418462]
We present an inference-time method that enables users to emphasize specific parts of their prompt by steering the model's attention toward them. Unlike prior approaches, we dynamically update the proportion of model attention given to the user-specified parts, ensuring improved instruction following without performance degradation.
arXiv Detail & Related papers (2025-05-17T14:28:53Z)
- ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction [15.220300812671494]
We introduce ScreenLLM, a set of multimodal large language models (MLLMs) tailored for advanced UI understanding and action prediction. Our work lays the foundation for scalable, robust, and intelligent GUI agents that enhance user interaction in diverse software environments.
arXiv Detail & Related papers (2025-03-26T20:41:24Z)
- ASIDE: Architectural Separation of Instructions and Data in Language Models [87.16417239344285]
ASIDE allows language models to clearly separate instructions and data at the level of embeddings. We demonstrate experimentally that, across a range of models, instruction-tuning LLMs with ASIDE leads to highly increased instruction-data separation without a loss in model utility. We provide insights into the mechanism underlying our method through an analysis of the model representations.
arXiv Detail & Related papers (2025-03-13T17:17:17Z)
- Flex: End-to-End Text-Instructed Visual Navigation from Foundation Model Features [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies. Our findings are synthesized in Flex (Fly lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors. We demonstrate the effectiveness of this approach on a quadrotor fly-to-target task, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Language Representations Can be What Recommenders Need: Findings and Potentials [57.90679739598295]
We show that item representations, when linearly mapped from advanced LM representations, yield superior recommendation performance. This outcome suggests the possible homomorphism between the advanced language representation space and an effective item representation space for recommendation. Our findings highlight the connection between language modeling and behavior modeling, which can inspire both natural language processing and recommender system communities.
arXiv Detail & Related papers (2024-07-07T17:05:24Z)
- Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control [73.6361029556484]
Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs.
We consider pre-trained text-to-image diffusion models, which are explicitly optimized to generate images from text prompts.
We show that Stable Control Representations enable learning policies that exhibit state-of-the-art performance on OVMM, a difficult open-vocabulary navigation benchmark.
arXiv Detail & Related papers (2024-05-09T15:39:54Z)
- Generalized User Representations for Transfer Learning [6.953653891411339]
We present a novel framework for user representation in large-scale recommender systems.
Our approach employs a two-stage methodology combining representation learning and transfer learning.
We show how the proposed framework can significantly reduce infrastructure costs compared to alternative approaches.
arXiv Detail & Related papers (2024-03-01T15:05:21Z)
- RecExplainer: Aligning Large Language Models for Explaining Recommendation Models [50.74181089742969]
Large language models (LLMs) have demonstrated remarkable intelligence in understanding, reasoning, and instruction following.
This paper presents the initial exploration of using LLMs as surrogate models to explain black-box recommender models.
To facilitate an effective alignment, we introduce three methods: behavior alignment, intention alignment, and hybrid alignment.
arXiv Detail & Related papers (2023-11-18T03:05:43Z)
- Representation Learning with Large Language Models for Recommendation [33.040389989173825]
We propose a model-agnostic framework RLMRec to enhance recommenders with large language models (LLMs)-empowered representation learning. RLMRec incorporates auxiliary textual signals, develops a user/item profiling paradigm empowered by LLMs, and aligns the semantic space of LLMs with the representation space of collaborative relational signals.
arXiv Detail & Related papers (2023-10-24T15:51:13Z)
- SimCURL: Simple Contrastive User Representation Learning from Command Sequences [22.92215383896495]
We propose SimCURL, a contrastive self-supervised deep learning framework that learns user representation from unlabeled command sequences.
We train and evaluate our method on a real-world command sequence dataset of more than half a billion commands.
arXiv Detail & Related papers (2022-07-29T16:06:03Z)