PreferThinker: Reasoning-based Personalized Image Preference Assessment
- URL: http://arxiv.org/abs/2511.00609v2
- Date: Mon, 10 Nov 2025 07:47:50 GMT
- Title: PreferThinker: Reasoning-based Personalized Image Preference Assessment
- Authors: Shengqi Xu, Xinpeng Zhou, Yabo Zhang, Ming Liu, Tao Liang, Tianyu Zhang, Yalong Bai, Zuxuan Wu, Wangmeng Zuo
- Abstract summary: We propose a reasoning-based personalized image preference assessment framework. It first predicts a user's preference profile from reference images. It then provides interpretable, multi-dimensional scores and assessments of candidate images.
- Score: 83.66114370585976
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Personalized image preference assessment aims to evaluate an individual user's image preferences by relying only on a small set of reference images as prior information. Existing methods mainly focus on general preference assessment, training models with large-scale data to tackle well-defined tasks such as text-image alignment. However, these approaches struggle to handle personalized preference because user-specific data are scarce and not easily scalable, and individual tastes are often diverse and complex. To overcome these challenges, we introduce a common preference profile that serves as a bridge across users, allowing large-scale user data to be leveraged for training profile prediction and capturing complex personalized preferences. Building on this idea, we propose a reasoning-based personalized image preference assessment framework that follows a \textit{predict-then-assess} paradigm: it first predicts a user's preference profile from reference images, and then provides interpretable, multi-dimensional scores and assessments of candidate images based on the predicted profile. To support this, we first construct a large-scale Chain-of-Thought (CoT)-style personalized assessment dataset annotated with diverse user preference profiles and high-quality CoT-style reasoning, enabling explicit supervision of structured reasoning. Next, we adopt a two-stage training strategy: a cold-start supervised fine-tuning phase to empower the model with structured reasoning capabilities, followed by reinforcement learning to incentivize the model to explore more reasonable assessment paths and enhance generalization. Furthermore, we propose a similarity-aware prediction reward to encourage better prediction of the user's preference profile, which facilitates the exploration of more reasonable assessments. Extensive experiments demonstrate the superiority of the proposed method.
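The predict-then-assess paradigm and the similarity-aware prediction reward described in the abstract can be sketched in miniature. The sketch below is illustrative only: it assumes a preference profile is a weight vector over a few assessment dimensions and uses cosine similarity as the prediction reward. The paper's actual profile format, model, and reward function are not specified at this level of detail, so every name, dimension, and number here is hypothetical.

```python
import math

# Hypothetical assessment dimensions; the paper's actual dimensions may differ.
DIMS = ["composition", "color", "realism", "subject_fit"]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def predict_profile(reference_scores):
    """Stage 1 (predict): derive a user profile from reference images.
    Here a simple per-dimension average stands in for the model's prediction."""
    n = len(reference_scores)
    return [sum(img[i] for img in reference_scores) / n for i in range(len(DIMS))]

def assess(profile, candidate_scores):
    """Stage 2 (assess): profile-weighted multi-dimensional scores for a candidate,
    plus an overall score as their sum."""
    per_dim = {d: round(w * s, 3) for d, w, s in zip(DIMS, profile, candidate_scores)}
    return per_dim, sum(per_dim.values())

def similarity_reward(predicted, ground_truth):
    """One plausible instantiation of a similarity-aware prediction reward:
    cosine similarity between predicted and annotated profiles."""
    return cosine(predicted, ground_truth)

# Toy data: two reference images scored on the four dimensions.
refs = [[0.9, 0.2, 0.7, 0.5], [0.8, 0.3, 0.6, 0.4]]
profile = predict_profile(refs)
per_dim, total = assess(profile, [0.6, 0.9, 0.5, 0.7])
reward = similarity_reward(profile, [0.85, 0.25, 0.65, 0.45])
print(profile, per_dim, round(total, 3), round(reward, 3))
```

In the paper's framing, this reward would be added to the reinforcement-learning objective so that trajectories whose predicted profile better matches the annotated one are reinforced; the averaging and weighting above are placeholders for the learned model.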
Related papers
- Synthetic Interaction Data for Scalable Personalization in Large Language Models [67.31884245564086]
We introduce a high-fidelity synthetic data generation framework called PersonaGym. Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process. We release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-turn personalized interaction trajectories.
arXiv Detail & Related papers (2026-02-12T20:41:22Z)
- A Framework for Personalized Persuasiveness Prediction via Context-Aware User Profiling [21.531813748944383]
Estimating the persuasiveness of messages is critical in various applications. However, there is no established framework for leveraging a persuadee's past activities to benefit a persuasiveness prediction model. We propose a context-aware user profiling framework with two trainable components.
arXiv Detail & Related papers (2026-01-09T09:22:31Z)
- Personalized Reward Modeling for Text-to-Image Generation [9.780251969338044]
We present PIGReward, a personalized reward model that dynamically generates user-conditioned evaluation dimensions and assesses images through CoT reasoning. PIGReward provides personalized feedback that drives user-specific prompt optimization, improving alignment between generated images and individual intent. Extensive experiments demonstrate that PIGReward surpasses existing methods in both accuracy and interpretability.
arXiv Detail & Related papers (2025-11-21T12:04:24Z)
- Personalized Recommendations via Active Utility-based Pairwise Sampling [1.704905100460915]
We propose a utility-based framework that learns preferences from simple and intuitive pairwise comparisons. A central contribution of our work is a novel utility-based active sampling strategy for preference elicitation.
arXiv Detail & Related papers (2025-08-12T19:09:33Z)
- Learning User Preferences for Image Generation Model [15.884017849539754]
We propose an approach built upon Multimodal Large Language Models to learn personalized user preferences. The contrastive preference loss is designed to effectively distinguish between user "likes" and "dislikes". The learnable preference tokens capture shared interest representations among existing users, enabling the model to activate group-specific preferences and enhance consistency across similar users.
arXiv Detail & Related papers (2025-08-11T17:39:42Z)
- LOTUS: A Leaderboard for Detailed Image Captioning from Quality to Societal Bias and User Preferences [91.13704541413551]
LOTUS is a leaderboard for evaluating detailed captions. It comprehensively evaluates various aspects, including caption quality. It enables preference-oriented evaluations by tailoring criteria to diverse user preferences.
arXiv Detail & Related papers (2025-07-25T15:12:42Z)
- NextQuill: Causal Preference Modeling for Enhancing LLM Personalization [82.15961484963256]
We introduce NextQuill, a novel personalization framework grounded in causal preference modeling. Building on this insight, NextQuill introduces two complementary alignment strategies. Experiments across multiple personalization benchmarks demonstrate that NextQuill significantly improves personalization quality.
arXiv Detail & Related papers (2025-06-03T02:08:55Z)
- Is Active Persona Inference Necessary for Aligning Small Models to Personal Preferences? [16.12440288407791]
A popular trend is adding a prefix to the current user's conversation to steer the preference distribution. Most methods passively model personal preferences with prior example preference pairs. We ask whether models benefit from actively inferring preference descriptions. We then test how effectively fine-tuned 1-8B-size models infer and align to personal preferences.
arXiv Detail & Related papers (2025-05-19T15:39:48Z)
- Interactive Visualization Recommendation with Hier-SUCB [52.11209329270573]
We propose an interactive personalized visualization recommendation (PVisRec) system that learns from user feedback on previous interactions. For more interactive and accurate recommendations, we propose Hier-SUCB, a contextual semi-bandit in the PVisRec setting.
arXiv Detail & Related papers (2025-02-05T17:14:45Z)
- Personalized Preference Fine-tuning of Diffusion Models [75.22218338096316]
We introduce PPD, a multi-reward optimization objective that aligns diffusion models with personalized preferences. With PPD, a diffusion model learns the individual preferences of a population of users in a few-shot way. Our approach achieves an average win rate of 76% over Stable Cascade, generating images that more accurately reflect specific user preferences.
arXiv Detail & Related papers (2025-01-11T22:38:41Z)
- Scaling Up Personalized Image Aesthetic Assessment via Task Vector Customization [37.66059382315255]
We present a unique approach that leverages readily available databases for general image aesthetic assessment and image quality assessment.
By determining optimal combinations of task vectors, known to represent specific traits of each database, we successfully create personalized models for individuals.
arXiv Detail & Related papers (2024-07-09T18:42:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.