Improved Customer Transaction Classification using Semi-Supervised
Knowledge Distillation
- URL: http://arxiv.org/abs/2102.07635v1
- Date: Mon, 15 Feb 2021 16:16:42 GMT
- Title: Improved Customer Transaction Classification using Semi-Supervised
Knowledge Distillation
- Authors: Rohan Sukumaran
- Abstract summary: We propose a cost-effective transaction classification approach based on semi-supervision and knowledge distillation frameworks.
The approach identifies the category of a transaction using free text input given by the customer.
We use weak labelling and notice that the performance gains are similar to those obtained with human-annotated samples.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In pickup and delivery services, transaction classification based on
customer-provided free text is a challenging problem. It involves the association of a
wide variety of customer inputs to a fixed set of categories while adapting to
the various customer writing styles. This categorization is important for the
business: it helps understand the market needs and trends, and also assists in
building a personalized experience for different segments of the customers.
Hence, it is vital to capture these category information trends at scale, with
high precision and recall. In this paper, we focus on a specific use-case where
a single category drives each transaction. We propose a cost-effective
transaction classification approach based on semi-supervision and knowledge
distillation frameworks. The approach identifies the category of a transaction
using free text input given by the customer. We use weak labelling and notice
that the performance gains are similar to those obtained with human-annotated
samples. On a large internal dataset and on the 20 Newsgroups dataset, we see that
RoBERTa performs the best for the categorization tasks. Further, using an
ALBERT model (which has 33x fewer parameters than RoBERTa),
with RoBERTa as the Teacher, we see a performance similar to that of RoBERTa
and better performance over unadapted ALBERT. This framework, with ALBERT as a
student and RoBERTa as teacher, is further referred to as R-ALBERT in this
paper. The model is in production and is used by the business to understand
changing trends and take appropriate decisions.
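The abstract does not spell out the training objective, so the following is only a minimal sketch of a teacher-student distillation loss, assuming the common temperature-scaled formulation (a KL term on softened teacher outputs plus a cross-entropy term on the available label). The function names, the `temperature` and `alpha` parameters, and the example logits are illustrative assumptions, not details taken from the paper. In the R-ALBERT setting the teacher logits would come from RoBERTa and the student logits from ALBERT; `label` could be a human annotation or a weak label.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical distillation objective for a single example.

    alpha weights the soft (teacher-matching) term against the
    hard (label-matching) term. Both hyperparameters are assumptions.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # Soft term: KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)

    # Hard term: cross-entropy of the student against the given label.
    ce = -math.log(softmax(student_logits)[label])

    # The T^2 factor keeps the soft term's gradient scale comparable
    # across temperatures in the usual formulation.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


if __name__ == "__main__":
    teacher = [2.5, 0.0, -1.0]   # made-up logits for three categories
    student = [2.0, 0.5, -1.0]
    print(distillation_loss(student, teacher, label=0))
```

A student whose logits match the teacher's gets a zero soft term, so with `alpha=1.0` the loss vanishes; lowering `alpha` shifts weight toward the (possibly weak) label.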
Related papers
- Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models [50.370043676415875]
In smart retail applications, the large number of products and their frequent turnover necessitate reliable zero-shot object classification methods.
We introduce the MIMEX dataset, comprising 28 distinct product categories.
We benchmark the zero-shot object classification performance of state-of-the-art vision-language models (VLMs) on the proposed MIMEX dataset.
arXiv Detail & Related papers (2024-09-23T12:28:40Z)
- Federated Learning with Only Positive Labels by Exploring Label Correlations [78.59613150221597]
Federated learning aims to collaboratively learn a model by using the data from multiple users under privacy constraints.
In this paper, we study the multi-label classification problem under the federated learning setting.
We propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC).
arXiv Detail & Related papers (2024-04-24T02:22:50Z)
- Optimizing Multi-Class Text Classification: A Diverse Stacking Ensemble Framework Utilizing Transformers [0.0]
This study introduces a stacking ensemble-based multi-text classification method that leverages transformer models.
By combining multiple single transformers, including BERT, ELECTRA, and DistilBERT, an optimal predictive model is generated.
Experimental evaluations conducted on a real-world customer review dataset demonstrate the effectiveness and superiority of the proposed approach.
arXiv Detail & Related papers (2023-08-19T13:29:15Z)
- Ranking-based Group Identification via Factorized Attention on Social Tripartite Graph [68.08590487960475]
We propose a novel GNN-based framework named Contextualized Factorized Attention for Group identification (CFAG).
We devise tripartite graph convolution layers to aggregate information from different types of neighborhoods among users, groups, and items.
To cope with the data sparsity issue, we devise a novel propagation augmentation layer, which is based on our proposed factorized attention mechanism.
arXiv Detail & Related papers (2022-11-02T01:42:20Z)
- Association Graph Learning for Multi-Task Classification with Category Shifts [68.58829338426712]
We focus on multi-task classification, where related classification tasks share the same label space and are learned simultaneously.
We learn an association graph to transfer knowledge among tasks for missing classes.
Our method consistently performs better than representative baselines.
arXiv Detail & Related papers (2022-10-10T12:37:41Z)
- A Semi-supervised Multi-task Learning Approach to Classify Customer Contact Intents [6.267558847860381]
We build text-based intent classification models for a customer support service on an E-commerce website.
We improve the performance significantly by evolving the model from multiclass classification to semi-supervised multi-task learning.
In the evaluation, the final model boosts the average AUC ROC by almost 20 points compared to the baseline finetuned multiclass classification ALBERT model.
arXiv Detail & Related papers (2021-06-10T16:13:05Z)
- Multi-class Text Classification using BERT-based Active Learning [4.028503203417233]
Classifying customer transactions into multiple categories helps understand the market needs for different customer segments.
BERT-based models have proven to perform well in Natural Language Understanding.
We benchmark the performance of BERT across different Active Learning strategies in Multi-Class Text Classification.
arXiv Detail & Related papers (2021-04-27T19:49:39Z)
- Regularised Text Logistic Regression: Key Word Detection and Sentiment Classification for Online Reviews [8.036300326665538]
We propose a Regularized Text Logistic regression model to perform text analytics and sentiment classification on unstructured text data.
We apply the RTL model to two online review datasets, Restaurant and Hotel, from TripAdvisor.
arXiv Detail & Related papers (2020-09-09T22:37:53Z)
- Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage [65.7062363323781]
We propose a novel framework based on BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining).
We (i) introduce Label Embeddings for Self-Attention in each layer of BERT, which we call LESA-BERT, and (ii) distill LESA-BERT to smaller variants, aiming to reduce overfitting and model size when working on small datasets.
As an application, our framework is utilized to build a model for patient portal message triage that classifies the urgency of a message into three categories: non-urgent, medium and urgent.
arXiv Detail & Related papers (2020-06-22T03:39:00Z)
- Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)