SimRPD: Optimizing Recruitment Proactive Dialogue Agents through Simulator-Based Data Evaluation and Selection
- URL: http://arxiv.org/abs/2601.02871v2
- Date: Thu, 08 Jan 2026 04:14:02 GMT
- Title: SimRPD: Optimizing Recruitment Proactive Dialogue Agents through Simulator-Based Data Evaluation and Selection
- Authors: Zhiyong Cao, Dunqiang Liu, Qi Dai, Haojun Xu, Huaiyan Xu, Huan He, Yafei Liu, Siyuan Liu, XiaoLin Lin, Ke Ma, Ruqian Shi, Sijia Yao, Hao Wang, Sicheng Zhou,
- Abstract summary: SimRPD is a three-stage framework for training recruitment proactive dialogue agents.<n>First, we develop a high-fidelity user simulator to synthesize large-scale conversational data.<n>Then, we introduce a multi-dimensional evaluation framework based on Chain-of-Intention.<n>Finally, we train the recruitment proactive dialogue agent on the selected dataset.
- Score: 19.80033581334685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Task-oriented proactive dialogue agents play a pivotal role in recruitment, particularly for steering conversations towards specific business outcomes, such as acquiring social-media contacts for private-channel conversion. Although supervised fine-tuning and reinforcement learning have proven effective for training such agents, their performance is heavily constrained by the scarcity of high-quality, goal-oriented domain-specific training data. To address this challenge, we propose SimRPD, a three-stage framework for training recruitment proactive dialogue agents. First, we develop a high-fidelity user simulator to synthesize large-scale conversational data through multi-turn online dialogue. Then we introduce a multi-dimensional evaluation framework based on Chain-of-Intention (CoI) to comprehensively assess the simulator and effectively select high-quality data, incorporating both global-level and instance-level metrics. Finally, we train the recruitment proactive dialogue agent on the selected dataset. Experiments in a real-world recruitment scenario demonstrate that SimRPD outperforms existing simulator-based data selection strategies, highlighting its practical value for industrial deployment and its potential applicability to other business-oriented dialogue scenarios.
Related papers
- AI-Salesman: Towards Reliable Large Language Model Driven Telemarketing [79.0112532518727]
We release TeleSalesCorpus, the first real-world-grounded dialogue dataset for this domain.<n>We then propose AI-Salesman, a novel framework featuring a dual-stage architecture.<n>We show that our proposed AI-Salesman significantly outperforms baseline models in both automatic metrics and comprehensive human evaluations.
arXiv Detail & Related papers (2025-11-15T09:44:42Z) - LAM SIMULATOR: Advancing Data Generation for Large Action Model Training via Online Exploration and Trajectory Feedback [121.78866929908871]
Large Action Models (LAMs) for AI Agents offer incredible potential but face challenges due to the need for high-quality training data.<n>We present LAM SIMULATOR, a comprehensive framework designed for online exploration of agentic tasks with high-quality feedback.<n>Our framework features a dynamic task query generator, an extensive collection of tools, and an interactive environment where Large Language Model (LLM) Agents can call tools and receive real-time feedback.
arXiv Detail & Related papers (2025-06-02T22:36:02Z) - From Reviews to Dialogues: Active Synthesis for Zero-Shot LLM-based Conversational Recommender System [49.57258257916805]
Large Language Models (LLMs) demonstrate strong zero-shot recommendation capabilities.<n>Practical applications often favor smaller, internally managed recommender models due to scalability, interpretability, and data privacy constraints.<n>We propose an active data augmentation framework that synthesizes conversational training data by leveraging black-box LLMs guided by active learning techniques.
arXiv Detail & Related papers (2025-04-21T23:05:47Z) - Towards Personalized Conversational Sales Agents: Contextual User Profiling for Strategic Action [12.637812936971049]
We present Conversational Sales (CSALES), a novel task that integrates preference elicitation, recommendation and persuasion within a unified conversational framework.<n>We also propose CSI, a conversational sales agent that proactively infers contextual user profiles and strategically selects actions through conversation.
arXiv Detail & Related papers (2025-03-28T15:49:52Z) - Dynamic benchmarking framework for LLM-based conversational data capture [0.0]
This paper introduces a benchmarking framework to assess large language models (LLMs)<n>It integrates generative agent simulation to evaluate performance on key dimensions: information extraction, context awareness, and adaptive engagement.<n>Results show that adaptive strategies improve data extraction accuracy, especially when handling ambiguous responses.
arXiv Detail & Related papers (2025-02-04T15:47:47Z) - MockLLM: A Multi-Agent Behavior Collaboration Framework for Online Job Seeking and Recruiting [29.676163697160945]
We propose textbfMockLLM, a novel framework to generate and evaluate mock interview interactions.<n>By simulating both interviewer and candidate roles, MockLLM enables consistent and collaborative interactions for real-time and two-sided matching.<n>We evaluate MockLLM on real-world data Boss Zhipin, a major Chinese recruitment platform.
arXiv Detail & Related papers (2024-05-28T12:23:16Z) - Improving Conversational Recommendation Systems via Counterfactual Data
Simulation [73.4526400381668]
Conversational recommender systems (CRSs) aim to provide recommendation services via natural language conversations.
Existing CRS approaches often suffer from the issue of insufficient training due to the scarcity of training data.
We propose a CounterFactual data simulation approach for CRS, named CFCRS, to alleviate the issue of data scarcity in CRSs.
arXiv Detail & Related papers (2023-06-05T12:48:56Z) - Learning towards Selective Data Augmentation for Dialogue Generation [52.540330534137794]
We argue that not all cases are beneficial for augmentation task, and the cases suitable for augmentation should obey the following two attributes.
We propose a Selective Data Augmentation framework (SDA) for the response generation task.
arXiv Detail & Related papers (2023-03-17T01:26:39Z) - Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with
User Simulator [37.590563896382456]
We propose an interactive evaluation framework for Task-Oriented Dialogue (TOD) systems.
We first build a goal-oriented user simulator based on pre-trained models and then use the user simulator to interact with the dialogue system to generate dialogues.
Experimental results show that RL-based TOD systems trained by our proposed user simulator can achieve nearly 98% inform and success rates.
arXiv Detail & Related papers (2022-10-26T07:41:32Z) - Learning an Effective Context-Response Matching Model with
Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.