Mix-Ecom: Towards Mixed-Type E-Commerce Dialogues with Complex Domain Rules
- URL: http://arxiv.org/abs/2509.23836v1
- Date: Sun, 28 Sep 2025 12:19:27 GMT
- Title: Mix-Ecom: Towards Mixed-Type E-Commerce Dialogues with Complex Domain Rules
- Authors: Chenyu Zhou, Xiaoming Shi, Hui Qiu, Xiawu Zheng, Haitao Leng, Yankai Jiang, Shaoguo Liu, Tingting Gao, Rongrong Ji
- Abstract summary: This work first introduces a novel corpus, called Mix-ECom, which is constructed from real-world customer-service dialogues, with post-processing to remove private user information and add a CoT process. Specifically, Mix-ECom contains 4,799 samples with multiple dialogue types in each e-commerce dialogue, covering four dialogue types (QA, recommendation, task-oriented dialogue, and chit-chat), three e-commerce task types (pre-sales, logistics, after-sales), and 82 e-commerce rules. Results show that current e-commerce agents lack sufficient capabilities to handle e-commerce dialogues.
- Score: 68.95862698403302
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: E-commerce agents contribute greatly to helping users complete their e-commerce needs. To promote further research and application of e-commerce agents, benchmarking frameworks have been introduced for evaluating LLM agents in the e-commerce domain. Despite this progress, current benchmarks do not evaluate agents' capability to handle mixed-type e-commerce dialogues and complex domain rules. To address the issue, this work first introduces a novel corpus, termed Mix-ECom, which is constructed from real-world customer-service dialogues, with post-processing to remove private user information and add a CoT process. Specifically, Mix-ECom contains 4,799 samples with multiple dialogue types in each e-commerce dialogue, covering four dialogue types (QA, recommendation, task-oriented dialogue, and chit-chat), three e-commerce task types (pre-sales, logistics, after-sales), and 82 e-commerce rules. Furthermore, this work builds baselines on Mix-Ecom and proposes a dynamic framework to further improve performance. Results show that current e-commerce agents lack sufficient capabilities to handle e-commerce dialogues, due to the hallucination caused by complex domain rules. The dataset will be publicly available.
Related papers
- OneMall: One Architecture, More Scenarios -- End-to-End Generative Recommender Family at Kuaishou E-Commerce [68.7552227901176]
OneMall is an end-to-end generative recommendation framework tailored for e-commerce services at Kuaishou.
It unifies the e-commerce's multiple item distribution scenarios, such as Product-card, short-video and live-streaming.
OneMall has been deployed, serving over 400 million daily active users at Kuaishou.
arXiv Detail & Related papers (2026-01-29T14:22:39Z)
- ECom-Bench: Can LLM Agent Resolve Real-World E-commerce Customer Support Issues? [20.83383124467603]
We introduce ECom-Bench, the first benchmark framework for evaluating LLM agents with multimodal capabilities in the e-commerce customer support domain.
ECom-Bench features dynamic user simulation based on persona information collected from real e-commerce customer interactions and a realistic task dataset derived from authentic e-commerce dialogues.
arXiv Detail & Related papers (2025-07-08T03:35:48Z)
- EcomScriptBench: A Multi-task Benchmark for E-commerce Script Planning via Step-wise Intention-Driven Product Association [83.4879773429742]
This paper defines the task of E-commerce Script Planning (EcomScript) as three sequential subtasks.
We propose a novel framework that enables the scalable generation of product-enriched scripts by associating products with each step.
We construct the very first large-scale EcomScript dataset, EcomScriptBench, which includes 605,229 scripts sourced from 2.4 million products.
arXiv Detail & Related papers (2025-05-21T07:21:38Z)
- ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models [15.940958043509463]
We propose ChineseEcomQA, a scalable question-answering benchmark focused on fundamental e-commerce concepts.
Fundamental concepts are designed to be applicable across a diverse array of e-commerce tasks.
By carefully balancing generality and specificity, ChineseEcomQA effectively differentiates between broad e-commerce concepts.
arXiv Detail & Related papers (2025-02-27T15:36:00Z)
- Conversational Recommender System and Large Language Model Are Made for Each Other in E-commerce Pre-sales Dialogue [80.51690477289418]
Conversational recommender systems (CRSs) learn user representation and provide accurate recommendations based on dialogue context, but rely on external knowledge.
Large language models (LLMs) generate responses that mimic pre-sales dialogues after fine-tuning, but lack domain-specific knowledge for accurate recommendations.
This paper investigates the effectiveness of combining LLM and CRS in E-commerce pre-sales dialogues.
arXiv Detail & Related papers (2023-10-23T07:00:51Z)
- LLaMA-E: Empowering E-commerce Authoring with Object-Interleaved Instruction Following [16.800545001782037]
This paper proposes LLaMA-E, the unified e-commerce authoring models that address the contextual preferences of customers, sellers, and platforms.
We design the instruction set derived from tasks of ads generation, query-enhanced product title rewriting, product classification, purchase intent speculation, and general e-commerce Q&A.
The proposed LLaMA-E models achieve state-of-the-art evaluation performance and exhibit the advantage in zero-shot practical applications.
arXiv Detail & Related papers (2023-08-09T12:26:37Z)
- U-NEED: A Fine-grained Dataset for User Needs-Centric E-commerce Conversational Recommendation [59.81301478480005]
We construct a user needs-centric E-commerce conversational recommendation dataset (U-NEED) from real-world E-commerce scenarios.
U-NEED consists of 3 types of resources: (i) 7,698 fine-grained annotated pre-sales dialogues in 5 top categories, (ii) 333,879 user behaviors, and (iii) 332,148 product knowledges.
arXiv Detail & Related papers (2023-05-05T01:44:35Z)
- Automatic Controllable Product Copywriting for E-Commerce [58.97059802658354]
We deploy an E-commerce Prefix-based Controllable Copywriting Generation into the JD.com e-commerce recommendation platform.
We conduct experiments to validate the effectiveness of the proposed EPCCG.
We introduce the deployed architecture which cooperates with the EPCCG into the real-time JD.com e-commerce recommendation platform.
arXiv Detail & Related papers (2022-06-21T04:18:52Z)