PAARS: Persona Aligned Agentic Retail Shoppers
- URL: http://arxiv.org/abs/2503.24228v1
- Date: Mon, 31 Mar 2025 15:41:51 GMT
- Title: PAARS: Persona Aligned Agentic Retail Shoppers
- Authors: Saab Mansour, Leonardo Perelli, Lorenzo Mainetti, George Davidson, Stefano D'Amato,
- Abstract summary: In e-commerce, behavioral data is collected for decision making which can be costly and slow.<n>We propose a framework that creates synthetic shopping agents by automatically mining anonymised historical shopping data.<n>We showcase an initial application of our framework for automated agentic A/B testing and compare the findings to human results.
- Score: 2.8737584376365355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In e-commerce, behavioral data is collected for decision making which can be costly and slow. Simulation with LLM powered agents is emerging as a promising alternative for representing human population behavior. However, LLMs are known to exhibit certain biases, such as brand bias, review rating bias and limited representation of certain groups in the population, hence they need to be carefully benchmarked and aligned to user behavior. Ultimately, our goal is to synthesise an agent population and verify that it collectively approximates a real sample of humans. To this end, we propose a framework that: (i) creates synthetic shopping agents by automatically mining personas from anonymised historical shopping data, (ii) equips agents with retail-specific tools to synthesise shopping sessions and (iii) introduces a novel alignment suite measuring distributional differences between humans and shopping agents at the group (i.e. population) level rather than the traditional "individual" level. Experimental results demonstrate that using personas improves performance on the alignment suite, though a gap remains to human behaviour. We showcase an initial application of our framework for automated agentic A/B testing and compare the findings to human results. Finally, we discuss applications, limitations and challenges setting the stage for impactful future work.
Related papers
- Higher-Order Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions [4.234771450043289]
Large language models (LLMs) are increasingly capable of simulating human behavior.
We propose a novel methodology for constructing virtual personas with synthetic user backstories" generated as extended, multi-turn interview transcripts.
Our generated backstories are longer, rich in detail, and consistent in authentically describing a singular individual, compared to previous methods.
arXiv Detail & Related papers (2025-04-16T00:10:34Z) - AgentA/B: Automated and Scalable Web A/BTesting with Interactive LLM Agents [28.20409050985182]
A/B testing remains constrained by its dependence on the large-scale and live traffic of human participants.
We present AgentA/B, a novel system that automatically simulate user interaction behaviors with real webpages.
Our findings suggest AgentA/B can emulate human-like behavior patterns.
arXiv Detail & Related papers (2025-04-13T21:10:56Z) - LLM Generated Persona is a Promise with a Catch [18.45442859688198]
Persona-based simulations hold promise for transforming disciplines that rely on population-level feedback.<n>Traditional methods to collect realistic persona data face challenges.<n>They are prohibitively expensive and logistically challenging due to privacy constraints.
arXiv Detail & Related papers (2025-03-18T03:11:27Z) - Generative Agent Simulations of 1,000 People [56.82159813294894]
We present a novel agent architecture that simulates the attitudes and behaviors of 1,052 real individuals.
The generative agents replicate participants' responses on the General Social Survey 85% as accurately as participants replicate their own answers.
Our architecture reduces accuracy biases across racial and ideological groups compared to agents given demographic descriptions.
arXiv Detail & Related papers (2024-11-15T11:14:34Z) - Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance [95.03771007780976]
We tackle the challenge of developing proactive agents capable of anticipating and initiating tasks without explicit human instructions.<n>First, we collect real-world human activities to generate proactive task predictions.<n>These predictions are labeled by human annotators as either accepted or rejected.<n>The labeled data is used to train a reward model that simulates human judgment.
arXiv Detail & Related papers (2024-10-16T08:24:09Z) - Chatting Up Attachment: Using LLMs to Predict Adult Bonds [0.0]
We use GPT-4 and Claude 3 Opus to create agents that simulate adults with varying profiles, childhood memories, and attachment styles.
We evaluate our models using a transcript dataset from 9 humans who underwent the same interview protocol, analyzed and labeled by mental health professionals.
Our findings indicate that training the models using only synthetic data achieves performance comparable to training the models on human data.
arXiv Detail & Related papers (2024-08-31T04:29:19Z) - PersonaGym: Evaluating Persona Agents and LLMs [47.75926334294358]
We introduce PersonaGym, the first dynamic evaluation framework for assessing persona agents, and PersonaScore, the first automated human-aligned metric grounded in decision theory.<n>Our evaluation of 6 open and closed-source LLMs, using a benchmark encompassing 200 personas and 10,000 questions, reveals significant opportunities for advancement in persona agent capabilities.
arXiv Detail & Related papers (2024-07-25T22:24:45Z) - Select to Perfect: Imitating desired behavior from large multi-agent data [28.145889065013687]
Desired characteristics for AI agents can be expressed by assigning desirability scores.
We first assess the effect of each individual agent's behavior on the collective desirability score.
We propose the concept of an agent's Exchange Value, which quantifies an individual agent's contribution to the collective desirability score.
arXiv Detail & Related papers (2024-05-06T15:48:24Z) - Unlocking the `Why' of Buying: Introducing a New Dataset and Benchmark for Purchase Reason and Post-Purchase Experience [24.949929747493204]
We propose purchase reason prediction as a novel task for modern AI models.
We first generate a dataset that consists of real-world explanations of why users make certain purchase decisions for various products.
Our approach induces LLMs to explicitly distinguish between the reasons behind purchasing a product and the experience after the purchase in a user review.
arXiv Detail & Related papers (2024-02-20T23:04:06Z) - AgentCF: Collaborative Learning with Autonomous Language Agents for
Recommender Systems [112.76941157194544]
We propose AgentCF for simulating user-item interactions in recommender systems through agent-based collaborative filtering.
We creatively consider not only users but also items as agents, and develop a collaborative learning approach that optimize both kinds of agents together.
Overall, the optimized agents exhibit diverse interaction behaviors within our framework, including user-item, user-user, item-item, and collective interactions.
arXiv Detail & Related papers (2023-10-13T16:37:14Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs)
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting
Pot [71.28884625011987]
Melting Pot is a MARL evaluation suite that uses reinforcement learning to reduce the human labor required to create novel test scenarios.
We have created over 80 unique test scenarios covering a broad range of research topics.
We apply these test scenarios to standard MARL training algorithms, and demonstrate how Melting Pot reveals weaknesses not apparent from training performance alone.
arXiv Detail & Related papers (2021-07-14T17:22:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.