Pathways of Thoughts: Multi-Directional Thinking for Long-form Personalized Question Answering
- URL: http://arxiv.org/abs/2509.19094v1
- Date: Tue, 23 Sep 2025 14:44:46 GMT
- Title: Pathways of Thoughts: Multi-Directional Thinking for Long-form Personalized Question Answering
- Authors: Alireza Salemi, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Zhuowan Li, Spurthi Amba Hombaiah, Weize Kong, Tao Chen, Hamed Zamani, Michael Bendersky
- Abstract summary: Personalization is essential for adapting question answering systems to user-specific information needs. We propose Pathways of Thoughts (PoT), an inference-stage method that applies to any large language model (LLM) without requiring task-specific fine-tuning. PoT consistently outperforms competitive baselines, achieving up to a 13.1% relative improvement.
- Score: 57.12316804290369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Personalization is essential for adapting question answering (QA) systems to user-specific information needs, thereby improving both accuracy and user satisfaction. However, personalized QA remains relatively underexplored due to challenges such as inferring preferences from long, noisy, and implicit contexts, and generating responses that are simultaneously correct, contextually appropriate, and aligned with user expectations and background knowledge. To address these challenges, we propose Pathways of Thoughts (PoT), an inference-stage method that applies to any large language model (LLM) without requiring task-specific fine-tuning. The approach models the reasoning of an LLM as an iterative decision process, where the model dynamically selects among cognitive operations such as reasoning, revision, personalization, and clarification. This enables exploration of multiple reasoning trajectories, producing diverse candidate responses that capture different perspectives. PoT then aggregates and reweights these candidates according to inferred user preferences, yielding a final personalized response that benefits from the complementary strengths of diverse reasoning paths. Experiments on the LaMP-QA benchmark for personalized QA show that PoT consistently outperforms competitive baselines, achieving up to a 13.1% relative improvement. Human evaluation corroborates these results, with annotators preferring outputs from PoT in 66% of cases and reporting ties in only 15% of cases.
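The loop the abstract describes (iteratively choosing among cognitive operations, exploring several reasoning trajectories, then reweighting the candidates by inferred user preferences) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `ToyLLM` and all of its methods are hypothetical stand-ins for real LLM calls, and only the four operation names come from the abstract.

```python
import random

# Cognitive operations named in the abstract; how they are chosen and applied
# below is purely illustrative.
OPERATIONS = ["reason", "revise", "personalize", "clarify"]

class ToyLLM:
    """Stand-in for a real LLM so the control flow can run end to end."""
    def choose(self, state):
        return random.choice(OPERATIONS)          # pick the next operation
    def apply(self, op, state, ctx):
        return f"{state} -> {op}"                 # record the step taken
    def score(self, candidate, ctx):
        # Toy preference score: reward trajectories that personalized more.
        return candidate.count("personalize")
    def aggregate(self, ranked_candidates, ctx):
        return ranked_candidates[0]               # keep the top-scored candidate

def run_pathway(llm, question, ctx, max_steps=4):
    """Follow one reasoning trajectory by iteratively selecting an operation."""
    state = question
    for _ in range(max_steps):
        op = llm.choose(state)
        state = llm.apply(op, state, ctx)
    return state

def pathways_of_thoughts(llm, question, ctx, n_paths=5):
    """Explore several trajectories, then reweight and aggregate the candidates."""
    candidates = [run_pathway(llm, question, ctx) for _ in range(n_paths)]
    ranked = sorted(candidates, key=lambda c: llm.score(c, ctx), reverse=True)
    return llm.aggregate(ranked, ctx)

random.seed(0)
answer = pathways_of_thoughts(ToyLLM(), "Q", {"likes": "jazz"})
print(answer)
```

With a real LLM, each `apply` call would produce text rather than an operation trace, and the scoring step would compare candidates against preferences inferred from the user's context.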
Related papers
- Learning to Reason for Multi-Step Retrieval of Personal Context in Personalized Question Answering [39.08300602619814]
Personalization in Question Answering (QA) requires answers that are both accurate and aligned with users' background, preferences, and historical context. We propose PR2, a reinforcement learning framework that integrates reasoning and retrieval from personal context for personalization.
arXiv Detail & Related papers (2026-02-22T19:43:43Z) - Personalized Reasoning: Just-In-Time Personalization and Why LLMs Fail At It [81.50711040539566]
Current large language model (LLM) development treats task-solving and preference alignment as separate challenges. We introduce PREFDISCO, an evaluation methodology that transforms static benchmarks into interactive personalization tasks. Our framework creates scenarios where identical questions require different reasoning chains depending on user context.
arXiv Detail & Related papers (2025-09-30T18:55:28Z) - LaMP-QA: A Benchmark for Personalized Long-form Question Answering [37.909483957959715]
We introduce LaMP-QA -- a benchmark designed for evaluating personalized long-form answer generation. The benchmark covers questions from three major categories: (1) Arts & Entertainment, (2) Lifestyle & Personal Development, and (3) Society & Culture, encompassing over 45 subcategories in total. Our results show that incorporating the personalized context provided leads to up to 39% performance improvements.
arXiv Detail & Related papers (2025-05-30T18:16:03Z) - Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique [66.94905631175209]
We propose a novel inference-time scaling approach -- stepwise natural language self-critique (PANEL). It employs self-generated natural language critiques as feedback to guide the step-level search process. This approach bypasses the need for task-specific verifiers and the associated training overhead.
arXiv Detail & Related papers (2025-03-21T17:59:55Z) - From Guessing to Asking: An Approach to Resolving the Persona Knowledge Gap in LLMs during Multi-Turn Conversations [11.958380211411386]
This study introduces the persona knowledge gap, the discrepancy between a model's internal understanding and the knowledge required for coherent, personalized conversations. We propose Conversation Preference Elicitation and Recommendation (CPER), a novel framework that dynamically detects and resolves persona knowledge gaps. CPER consists of three key modules: a Contextual Understanding Module for preference extraction, a Dynamic Feedback Module for measuring uncertainty and refining persona alignment, and a Persona-Driven Response Generation module for adapting responses based on accumulated user context.
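The three-module pipeline this summary describes (extract preferences, measure uncertainty about the persona, then either clarify or answer) can be sketched as a toy loop. Everything below is a hypothetical illustration: the `key=value` extraction, the uncertainty threshold, and all method names are invented for the sketch, not taken from the CPER paper.

```python
class CPERSketch:
    """Toy pipeline mirroring the three modules named in the summary."""
    def __init__(self):
        self.persona = {}

    def extract_preferences(self, utterance):
        # Contextual Understanding Module: pull simple key=value preferences.
        for token in utterance.split():
            if "=" in token:
                key, value = token.split("=", 1)
                self.persona[key] = value

    def uncertainty(self, required_keys):
        # Dynamic Feedback Module: fraction of required persona slots still unknown.
        missing = [k for k in required_keys if k not in self.persona]
        return len(missing) / len(required_keys), missing

    def respond(self, question, required_keys):
        # Persona-Driven Response Generation: ask a clarifying question when
        # uncertainty is high, otherwise answer using the accumulated context.
        u, missing = self.uncertainty(required_keys)
        if u > 0.5:
            return f"Could you tell me your {missing[0]}?"
        return f"Answer to '{question}' tailored to {self.persona}"

bot = CPERSketch()
bot.extract_preferences("genre=jazz")
print(bot.respond("recommend music", ["genre", "era"]))
```

A real system would replace the string matching with LLM-based preference extraction and use a calibrated uncertainty estimate rather than a fixed slot-counting threshold.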
arXiv Detail & Related papers (2025-03-16T15:55:29Z) - Query Performance Prediction using Relevance Judgments Generated by Large Language Models [53.97064615557883]
We propose a new query performance prediction (QPP) framework using automatically generated relevance judgments (QPP-GenRE). QPP-GenRE decomposes QPP into independent subtasks of predicting the relevance of each item in a ranked list to a given query. We predict an item's relevance by using open-source large language models (LLMs) to ensure scientific reproducibility.
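The decomposition described above (predict a relevance label per ranked item, then compute an IR metric from those labels) can be sketched in a few lines. The `predict_relevance` stub is a hypothetical stand-in for the open-source LLM judgment call; the word-overlap heuristic is only there to make the sketch runnable.

```python
def predict_relevance(query, item):
    # Hypothetical stand-in for an LLM relevance judgment (returns 0 or 1).
    return 1 if any(w in item.lower() for w in query.lower().split()) else 0

def predicted_precision_at_k(query, ranked_list, k=10):
    """Estimate Precision@k from per-item predicted relevance labels."""
    judged = [predict_relevance(query, item) for item in ranked_list[:k]]
    return sum(judged) / min(k, len(ranked_list))

p = predicted_precision_at_k("jazz history", ["history of jazz", "rock guide"], k=2)
print(p)  # 0.5
```

Because each item is judged independently, the same predicted labels can feed any ranking metric (MAP, nDCG with graded labels, etc.), which is the point of decomposing QPP this way.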
arXiv Detail & Related papers (2024-04-01T09:33:05Z) - Representing and Reasoning with Multi-Stakeholder Qualitative Preference Queries [9.768677073327423]
We offer the first formal treatment of reasoning with multi-stakeholder qualitative preferences.
We introduce a query language for expressing queries against such preferences over sets of outcomes that satisfy specified criteria.
We present experimental results that demonstrate the feasibility of our approach.
arXiv Detail & Related papers (2023-07-30T19:52:59Z) - Improving Selective Visual Question Answering by Learning from Your Peers [74.20167944693424]
Visual Question Answering (VQA) models can have difficulties abstaining from answering when they are wrong.
We propose the Learning from Your Peers (LYP) approach for training multimodal selection functions that make abstention decisions.
Our approach uses predictions from models trained on distinct subsets of the training data as targets for optimizing a Selective VQA model.
arXiv Detail & Related papers (2023-06-14T21:22:01Z) - Generative Context Pair Selection for Multi-hop Question Answering [60.74354009152721]
We propose a generative context selection model for multi-hop question answering.
Our proposed generative passage selection model achieves better performance (4.9% higher than the baseline) on an adversarial held-out set.
arXiv Detail & Related papers (2021-04-18T07:00:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.