Enhancing Customer Service Chatbots with Context-Aware NLU through Selective Attention and Multi-task Learning
- URL: http://arxiv.org/abs/2506.01781v1
- Date: Mon, 02 Jun 2025 15:24:28 GMT
- Title: Enhancing Customer Service Chatbots with Context-Aware NLU through Selective Attention and Multi-task Learning
- Authors: Subhadip Nandi, Neeraj Agrawal, Anshika Singh, Priyanka Bhatt,
- Abstract summary: We introduce a context-aware NLU model for predicting customer intent. A novel selective attention module is used to extract relevant context features. We have deployed our model to production for Walmart's customer care domain.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Customer service chatbots are conversational systems aimed at addressing customer queries, often by directing them to automated workflows. A crucial aspect of this process is the classification of the customer's intent. Presently, most intent classification models for customer care utilise only the customer query for intent prediction. This may result in low-accuracy models that cannot handle ambiguous queries. An ambiguous query like "I didn't receive my package" could indicate a delayed order, or an order that was delivered but that the customer failed to receive. Resolving each of these scenarios requires a very different sequence of steps. Utilizing additional information, such as the customer's order delivery status, in the right manner can help identify the intent of such ambiguous queries. In this paper, we introduce a context-aware NLU model that incorporates both the customer query and contextual information from the customer's order status for predicting customer intent. A novel selective attention module is used to extract relevant context features. We also propose a multi-task learning paradigm for the effective utilization of the different label types available in our training data. Our suggested method, Multi-Task Learning Contextual NLU with Selective Attention Weighted Context (MTL-CNLU-SAWC), yields a 4.8% increase in top-2 accuracy over the baseline model, which uses only user queries, and a 3.5% improvement over existing state-of-the-art models that combine query and context. We have deployed our model to production for Walmart's customer care domain. Accurate intent prediction through MTL-CNLU-SAWC helps to better direct customers to automated workflows, thereby significantly reducing escalations to human agents and leading to almost a million dollars in yearly savings for the company.
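The paper itself includes no code; the core idea of the selective attention module, scoring order-status context features against the encoded query and fusing the weighted context with the query before intent classification, can be sketched roughly as follows. All names, shapes, and the dot-product scoring are illustrative assumptions, not the authors' actual architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def selective_attention_context(query_emb, context_feats):
    # Score each context feature vector against the query embedding,
    # then return an attention-weighted sum of the features.
    scores = context_feats @ query_emb      # (n_ctx,)
    weights = softmax(scores)               # relevance of each feature
    return weights @ context_feats          # weighted context, shape (d,)

rng = np.random.default_rng(0)
d = 8
query_emb = rng.normal(size=d)              # encoded customer query
context_feats = rng.normal(size=(3, d))     # e.g. order-status signals

weighted_ctx = selective_attention_context(query_emb, context_feats)
fused = np.concatenate([query_emb, weighted_ctx])  # input to the intent head
```

In this sketch the fused vector would feed a shared encoder with multiple task-specific heads, which is one plausible reading of the multi-task setup described in the abstract.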
Related papers
- SessionIntentBench: A Multi-task Inter-session Intention-shift Modeling Benchmark for E-commerce Customer Behavior Understanding [64.45047674586671]
We introduce the concept of an intention tree and propose a dataset curation pipeline. We construct a sibling multimodal benchmark, SessionIntentBench, that evaluates L(V)LMs' capability on understanding inter-session intention shift. With 1,952,177 intention entries, 1,132,145 session intention trajectories, and 13,003,664 available tasks mined using 10,905 sessions, we provide a scalable way to exploit the existing session data.
arXiv Detail & Related papers (2025-07-27T09:04:17Z) - You Are What You Bought: Generating Customer Personas for E-commerce Applications [22.012818753574905]
This paper introduces the concept of the customer persona. A customer persona provides a multi-faceted and human-readable characterization of specific purchase behaviors and preferences. We evaluate the performance of our persona-based representation in terms of accuracy and robustness for recommendation and customer segmentation tasks. Most notably, we find that integrating customer persona representations improves the state-of-the-art graph-based recommendation model by up to 12% in terms of NDCG@K and F1-Score@K.
arXiv Detail & Related papers (2025-04-24T06:59:16Z) - Robust Uplift Modeling with Large-Scale Contexts for Real-time Marketing [6.511772664252086]
Uplift modeling is proposed to solve the problem, which applies different treatments (e.g., discounts, bonuses) to satisfy corresponding users. In real-world scenarios, there are rich contexts available on the online platform (e.g., short videos, news) and the uplift model needs to infer an incentive for each user. We propose a novel model-agnostic Robust Uplift Modeling with Large-Scale Contexts (UMLC) framework for real-time marketing.
arXiv Detail & Related papers (2025-01-04T08:55:50Z) - MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification [5.666070277424383]
MAG-V is a framework to generate a dataset of questions that mimic customer queries. Our synthetic data can improve agent performance on actual customer queries.
arXiv Detail & Related papers (2024-11-28T19:36:11Z) - Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z) - Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications.
The quality of these exemplars in the prompt greatly impacts performance.
Existing methods fail to adequately account for the impact of exemplar ordering on the performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z) - Cache & Distil: Optimising API Calls to Large Language Models [82.32065572907125]
Large-scale deployment of generative AI tools often depends on costly API calls to a Large Language Model (LLM) to fulfil user queries.
To curtail the frequency of these calls, one can employ a smaller language model -- a student.
This student gradually gains proficiency in independently handling an increasing number of user requests.
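The routing idea in this blurb, serve cheap answers locally and fall back to the costly LLM only when the student is unsure, can be sketched as below. The cache, confidence threshold, and model interfaces are all illustrative assumptions, not the paper's actual API:

```python
def route_query(query, student, llm, cache, threshold=0.8):
    """Serve from cache, then the student model, then the costly LLM."""
    if query in cache:
        return cache[query]
    answer, confidence = student(query)
    if confidence < threshold:      # student unsure: fall back to the LLM;
        answer = llm(query)         # its output can later train the student
    cache[query] = answer
    return answer

# Toy stand-ins for the two models
student = lambda q: ("refund", 0.9) if "refund" in q else ("unknown", 0.1)
llm = lambda q: "escalate"
cache = {}

a1 = route_query("refund status", student, llm, cache)   # handled by student
a2 = route_query("weird question", student, llm, cache)  # falls back to LLM
```

In a real deployment the LLM's answers would also be logged as training labels, which is how the student "gradually gains proficiency" over time.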
arXiv Detail & Related papers (2023-10-20T15:01:55Z) - Intent Detection at Scale: Tuning a Generic Model using Relevant Intents [0.5461938536945723]
This work proposes a system to scale intent predictions to various clients effectively, by combining a single generic model with a per-client list of relevant intents.
Our approach minimizes training and maintenance costs while providing a personalized experience for clients, allowing for seamless adaptation to changes in their relevant intents.
The final system exhibits significantly superior performance compared to industry-specific models, showcasing its flexibility and ability to cater to diverse client needs.
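One plausible reading of "combining a single generic model with a per-client list of relevant intents" is to mask the generic model's logits down to the client's intent list and renormalise. This is a hypothetical sketch, with all names and numbers invented for illustration:

```python
import math

def client_intent_probs(logits, all_intents, client_intents):
    # Keep only the logits for intents relevant to this client,
    # then renormalise with a softmax over that subset.
    kept = {i: l for i, l in zip(all_intents, logits) if i in client_intents}
    m = max(kept.values())                      # for numerical stability
    exps = {i: math.exp(l - m) for i, l in kept.items()}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}

logits = [2.0, 0.5, 1.0, -1.0]                  # generic model outputs
all_intents = ["refund", "delivery", "cancel", "upgrade"]
probs = client_intent_probs(logits, all_intents, {"refund", "cancel"})
```

Here a client that only supports refund and cancellation flows never sees predictions for the other intents, matching the "personalized experience" the blurb describes.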
arXiv Detail & Related papers (2023-09-15T13:15:20Z) - AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z) - Federated Multi-Target Domain Adaptation [99.93375364579484]
Federated learning methods enable us to train machine learning models on distributed user data while preserving its privacy.
We consider a more practical scenario where the distributed client data is unlabeled, and a centralized labeled dataset is available on the server.
We propose an effective DualAdapt method to address the new challenges.
arXiv Detail & Related papers (2021-08-17T17:53:05Z) - A Semi-supervised Multi-task Learning Approach to Classify Customer Contact Intents [6.267558847860381]
We build text-based intent classification models for a customer support service on an E-commerce website.
We improve the performance significantly by evolving the model from multiclass classification to semi-supervised multi-task learning.
In the evaluation, the final model boosts the average AUC ROC by almost 20 points compared to the baseline finetuned multiclass classification ALBERT model.
arXiv Detail & Related papers (2021-06-10T16:13:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences of their use.