Evaluating Contrastive Feedback for Effective User Simulations
- URL: http://arxiv.org/abs/2505.02560v2
- Date: Tue, 06 May 2025 11:44:32 GMT
- Title: Evaluating Contrastive Feedback for Effective User Simulations
- Authors: Andreas Konstantin Kruff, Timo Breuer, Philipp Schaer
- Abstract summary: This study explores whether the underlying principles of contrastive training techniques can be applied beneficially in the area of prompt engineering for user simulations. The primary objective of this study is to analyze how different modalities of contextual information influence the effectiveness of user simulations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of Large Language Models (LLMs) for simulating user behavior in the domain of Interactive Information Retrieval has recently gained significant popularity. However, their application and capabilities remain highly debated and understudied. This study explores whether the underlying principles of contrastive training techniques, which have been effective for fine-tuning LLMs, can also be applied beneficially in the area of prompt engineering for user simulations. Previous research has shown that LLMs possess comprehensive world knowledge, which can be leveraged to provide accurate estimates of relevant documents. This study attempts to simulate a knowledge state by enhancing the model with additional implicit contextual information gained during the simulation. This approach enables the model to refine the scope of desired documents further. The primary objective of this study is to analyze how different modalities of contextual information influence the effectiveness of user simulations. Various user configurations were tested, where models are provided with summaries of already judged relevant, irrelevant, or both types of documents in a contrastive manner. The focus of this study is the assessment of the impact of the prompting techniques on the simulated user agent performance. We hereby lay the foundations for leveraging LLMs as part of more realistic simulated users.
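The contrastive setup described in the abstract — prompting a simulated user with summaries of documents it has already judged relevant, irrelevant, or both — can be sketched as a simple prompt builder. The section headers, wording, and function name below are illustrative assumptions, not the paper's actual prompt templates.

```python
def build_contrastive_prompt(topic, relevant_summaries, irrelevant_summaries):
    """Assemble a user-simulation prompt that contrasts already-judged documents.

    Either summary list may be empty, which recovers the single-modality
    configurations (relevant-only or irrelevant-only) tested in the study.
    """
    lines = [f"You are simulating a user searching for: {topic}", ""]
    if relevant_summaries:
        lines.append("Summaries of documents you already judged RELEVANT:")
        lines += [f"- {s}" for s in relevant_summaries]
        lines.append("")
    if irrelevant_summaries:
        lines.append("Summaries of documents you already judged IRRELEVANT:")
        lines += [f"- {s}" for s in irrelevant_summaries]
        lines.append("")
    lines.append("Judge the relevance of the next document, refining your "
                 "notion of relevance using the contrast above.")
    return "\n".join(lines)


# Usage: the accumulated summaries would grow as the simulation progresses.
prompt = build_contrastive_prompt(
    "contrastive feedback for user simulation",
    relevant_summaries=["Study A applies contrastive prompts to IR agents."],
    irrelevant_summaries=["Study B benchmarks GPU kernels."],
)
print(prompt)
```

The point of the contrast is that negative (irrelevant) summaries let the model narrow the scope of desired documents, rather than only reinforcing what is already known to be relevant.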
Related papers
- Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric [99.56567010306807]
Large Language Models (LLMs) have become indispensable across academia, industry, and daily applications. One core challenge of evaluation in the large language model (LLM) era is the generalization issue. We propose Model Utilization Index (MUI), a mechanism-interpretability-enhanced metric that complements traditional performance scores.
arXiv Detail & Related papers (2025-04-10T04:09:47Z) - Dynamic benchmarking framework for LLM-based conversational data capture [0.0]
This paper introduces a benchmarking framework to assess large language models (LLMs). It integrates generative agent simulation to evaluate performance on key dimensions: information extraction, context awareness, and adaptive engagement. Results show that adaptive strategies improve data extraction accuracy, especially when handling ambiguous responses.
arXiv Detail & Related papers (2025-02-04T15:47:47Z) - LLM-Powered User Simulator for Recommender System [29.328839982869923]
We introduce an LLM-powered user simulator to simulate user engagement with items in an explicit manner. Specifically, we identify the explicit logic of user preferences, leverage LLMs to analyze item characteristics, and distill user sentiments. We propose an ensemble model that synergizes logical and statistical insights for user interaction simulations.
arXiv Detail & Related papers (2024-12-22T12:00:04Z) - LLM-assisted Explicit and Implicit Multi-interest Learning Framework for Sequential Recommendation [50.98046887582194]
We propose an explicit and implicit multi-interest learning framework to model user interests on two levels: behavior and semantics.
The proposed EIMF framework effectively and efficiently combines small models with LLM to improve the accuracy of multi-interest modeling.
arXiv Detail & Related papers (2024-11-14T13:00:23Z) - Towards a Formal Characterization of User Simulation Objectives in Conversational Information Access [15.54070473873364]
User simulation is a promising approach for automatically training and evaluating conversational information access agents.
We define the distinct objectives for user simulators: training aims to maximize behavioral similarity to real users, while evaluation focuses on the accurate prediction of real-world conversational agent performance.
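The two objectives distinguished above can be sketched formally. The notation here is an assumption for illustration, not the paper's: let $S$ be a simulator, $U$ the real user population, $b(\cdot)$ an induced behavior distribution, and $\pi(A \mid \cdot)$ the performance of conversational agent $A$ when interacting with a given user population.

```latex
\text{Training:}\quad
  \max_{S}\; \mathrm{sim}\big(b(S),\, b(U)\big)
\qquad\qquad
\text{Evaluation:}\quad
  \min_{S}\; \big|\, \pi(A \mid S) - \pi(A \mid U) \,\big|
```

That is, a training simulator should reproduce real behavior, whereas an evaluation simulator only needs to yield the same ranking and magnitude of agent performance, even if its surface behavior differs.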
arXiv Detail & Related papers (2024-06-27T08:46:41Z) - CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems. Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs). We propose Cost-Efficient Language Model Alignment (CELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z) - LLM In-Context Recall is Prompt Dependent [0.0]
A model's ability to do this significantly influences its practical efficacy and dependability in real-world applications.
This study demonstrates that an LLM's recall capability is not only contingent upon the prompt's content but also may be compromised by biases in its training data.
arXiv Detail & Related papers (2024-04-13T01:13:59Z) - BASES: Large-scale Web Search User Simulation with Large Language Model based Agents [108.97507653131917]
BASES is a novel user simulation framework with large language models (LLMs).
Our simulation framework can generate unique user profiles at scale, which subsequently leads to diverse search behaviors.
WARRIORS is a new large-scale dataset encompassing web search user behaviors, including both Chinese and English versions.
arXiv Detail & Related papers (2024-02-27T13:44:09Z) - C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
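The core idea of c-ICL — pairing correct extractions with plausible-but-wrong ones so the model sees both sides of the decision boundary — can be sketched as a demonstration formatter. The example triples and the template wording are illustrative assumptions, not taken from the paper.

```python
# Hypothetical demonstration set for a relation-extraction task: each entry
# carries one correct extraction and one incorrect construction (here, a
# reversed relation direction) to be shown contrastively in context.
demos = [
    {"text": "Apple was founded by Steve Jobs.",
     "correct": "(Apple, founded_by, Steve Jobs)",
     "incorrect": "(Steve Jobs, founded_by, Apple)"},
]

def format_c_icl(demos, query):
    """Build a few-shot prompt that includes both correct and incorrect samples."""
    parts = []
    for d in demos:
        parts.append(f"Text: {d['text']}")
        parts.append(f"Correct extraction: {d['correct']}")
        parts.append(f"Incorrect extraction (do not produce): {d['incorrect']}")
        parts.append("")
    # The query is left open-ended for the model to complete.
    parts.append(f"Text: {query}")
    parts.append("Correct extraction:")
    return "\n".join(parts)


print(format_c_icl(demos, "Microsoft was founded by Bill Gates."))
```

Showing the incorrect variant alongside the correct one is what distinguishes this from standard few-shot prompting, which only demonstrates positives.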
arXiv Detail & Related papers (2024-02-17T11:28:08Z) - User Behavior Simulation with Large Language Model based Agents [116.74368915420065]
We propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors.
Based on extensive experiments, we find that the simulated behaviors of our method are very close to the ones of real humans.
arXiv Detail & Related papers (2023-06-05T02:58:35Z) - A Case Study on Designing Evaluations of ML Explanations with Simulated User Studies [6.2511886555343805]
We conduct the first SimEvals on a real-world use case to evaluate whether explanations can better support ML-assisted decision-making in e-commerce fraud detection.
The SimEvals suggest that all considered explainers perform equally well, and that none beats a baseline without explanations.
arXiv Detail & Related papers (2023-02-15T03:27:55Z) - Synthetic Data-Based Simulators for Recommender Systems: A Survey [55.60116686945561]
This survey aims at providing a comprehensive overview of the recent trends in the field of modeling and simulation.
We start with the motivation behind the development of frameworks implementing the simulations -- simulators.
We provide a new consistent classification of existing simulators based on their functionality, approbation, and industrial effectiveness.
arXiv Detail & Related papers (2022-06-22T19:33:21Z) - A User's Guide to Calibrating Robotics Simulators [54.85241102329546]
This paper proposes a set of benchmarks and a framework for the study of various algorithms aimed to transfer models and policies learnt in simulation to the real world.
We conduct experiments on a wide range of well known simulated environments to characterize and offer insights into the performance of different algorithms.
Our analysis can be useful for practitioners working in this area and can help make informed choices about the behavior and main properties of sim-to-real algorithms.
arXiv Detail & Related papers (2020-11-17T22:24:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.