Related papers: Hierarchical Query Classification in E-commerce Search

URL: http://arxiv.org/abs/2403.06021v1
Date: Sat, 9 Mar 2024 21:55:55 GMT
Title: Hierarchical Query Classification in E-commerce Search
Authors: Bing He, Sreyashi Nag, Limeng Cui, Suhang Wang, Zheng Li, Rahul Goutam, Zhen Li, Haiyang Zhang
Abstract summary: E-commerce platforms typically store and structure product information and search data in a hierarchy. Efficiently categorizing user search queries into a similar hierarchical structure is paramount in enhancing user experience on e-commerce platforms as well as news curation and academic research. The inherent complexity of hierarchical query classification is compounded by two primary challenges: (1) the pronounced class imbalance that skews towards dominant categories, and (2) the inherent brevity and ambiguity of search queries that hinder accurate classification.
Score: 38.67034103433015
License: http://creativecommons.org/licenses/by/4.0/
Abstract: E-commerce platforms typically store and structure product information and search data in a hierarchy. Efficiently categorizing user search queries into a similar hierarchical structure is paramount in enhancing user experience on e-commerce platforms as well as news curation and academic research. The significance of this task is amplified when dealing with sensitive query categorization or critical information dissemination, where inaccuracies can lead to considerable negative impacts. The inherent complexity of hierarchical query classification is compounded by two primary challenges: (1) the pronounced class imbalance that skews towards dominant categories, and (2) the inherent brevity and ambiguity of search queries that hinder accurate classification. To address these challenges, we introduce a novel framework that leverages hierarchical information through (i) enhanced representation learning that utilizes the contrastive loss to discern fine-grained instance relationships within the hierarchy, called "instance hierarchy", and (ii) a nuanced hierarchical classification loss that attends to the intrinsic label taxonomy, named "label hierarchy". Additionally, based on our observation that certain unlabeled queries share typographical similarities with labeled queries, we propose a neighborhood-aware sampling technique to intelligently select these unlabeled queries to boost the classification performance. Extensive experiments demonstrate that our proposed method is better than state-of-the-art (SOTA) on the proprietary Amazon dataset, and comparable to SOTA on the public datasets of Web of Science and RCV1-V2. These results underscore the efficacy of our proposed solution, and pave the path toward the next generation of hierarchy-aware query classification systems.
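The abstract does not say how typographical similarity is measured or how neighbors are chosen, so the following is only a rough sketch of the general idea of picking unlabeled queries that look like labeled ones. It assumes a character-level similarity from Python's standard `difflib.SequenceMatcher` and a cutoff of 0.8. The function names `typo_similarity` and `select_neighbors`, the threshold, and the toy queries are all hypothetical and are not taken from the paper.

```python
from difflib import SequenceMatcher


def typo_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an assumed stand-in for
    whatever typographical measure the paper actually uses."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def select_neighbors(labeled, unlabeled, threshold=0.8):
    """For each unlabeled query, find the most similar labeled query and
    keep the pair only if the similarity reaches the (assumed) threshold.

    labeled:   dict mapping query string -> hierarchical label path (tuple)
    unlabeled: iterable of query strings
    Returns a list of (unlabeled_query, nearest_labeled_query, label_path, score).
    """
    selected = []
    for u in unlabeled:
        best_query, best_score = None, 0.0
        for q in labeled:
            score = typo_similarity(u, q)
            if score > best_score:
                best_query, best_score = q, score
        if best_query is not None and best_score >= threshold:
            selected.append((u, best_query, labeled[best_query], best_score))
    return selected


# Toy data (hypothetical): labeled queries with two-level label paths.
labeled = {
    "running shoes": ("Clothing", "Shoes"),
    "usb c cable": ("Electronics", "Cables"),
}
unlabeled = ["runing shoes", "usb-c cable", "garden hose"]

picked = select_neighbors(labeled, unlabeled)
for query, neighbor, path, score in picked:
    print(f"{query!r} ~ {neighbor!r} -> {' > '.join(path)} ({score:.2f})")
```

In this sketch the misspelled and hyphenated variants are matched to their labeled neighbors, while the unrelated query falls below the cutoff and is left out. How the selected queries would then feed into training (for example as pseudo-labeled or contrastive examples) is not covered here.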

Related papers

TagRec++: Hierarchical Label Aware Attention Network for Question Categorization [0.3683202928838613]
Online learning systems organize content according to a well-defined hierarchical taxonomy. The task of categorizing inputs to the hierarchical labels is usually cast as a flat multi-class classification problem. We formulate the task as a dense retrieval problem to retrieve the appropriate hierarchical labels for each piece of content.
arXiv Detail & Related papers (2022-08-10T05:08:37Z)
Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework [75.79736930414715]
We present a hierarchical multi-label representation learning framework that can leverage all available labels and preserve the hierarchical relationship between classes. We introduce novel hierarchy preserving losses, which jointly apply a hierarchical penalty to the contrastive loss, and enforce the hierarchy constraint.
arXiv Detail & Related papers (2022-04-27T21:41:44Z)
The Overlooked Classifier in Human-Object Interaction Recognition [82.20671129356037]
We encode the semantic correlation among classes into the classification head by initializing the weights with language embeddings of HOIs. We propose a new loss named LSE-Sign to enhance multi-label learning on a long-tailed dataset. Our simple yet effective method enables detection-free HOI classification, outperforming state-of-the-art methods that require object detection and human pose by a clear margin.
arXiv Detail & Related papers (2022-03-10T23:35:00Z)
Label Hierarchy Transition: Delving into Class Hierarchies to Enhance Deep Classifiers [40.993137740456014]
We propose a unified probabilistic framework based on deep learning to address the challenges of hierarchical classification. The proposed framework can be readily adapted to any existing deep network with only minor modifications. We extend our proposed LHT framework to the skin lesion diagnosis task and validate its great potential in computer-aided diagnosis.
arXiv Detail & Related papers (2021-12-04T14:58:36Z)
Out-of-Category Document Identification Using Target-Category Names as Weak Supervision [64.671654559798]
Out-of-category detection aims to distinguish documents according to their semantic relevance to the inlier (or target) categories. We present an out-of-category detection framework, which effectively measures how confidently each document belongs to one of the target categories.
arXiv Detail & Related papers (2021-11-24T21:01:25Z)
QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction [57.56700153507383]
This paper proposes a unified query attribute value extraction system in e-commerce search named QUEACO. For the NER phase, QUEACO adopts a novel teacher-student network, where a teacher network that is trained on the strongly-labeled data generates pseudo-labels. For the AVN phase, we also leverage the weakly-labeled query-to-attribute behavior data to normalize surface form attribute values from queries into canonical forms from products.
arXiv Detail & Related papers (2021-08-19T03:24:23Z)
Inducing a hierarchy for multi-class classification problems [11.58041597483471]
In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not. In this paper, we investigate a class of methods that induce a hierarchy that can similarly improve classification performance over flat classifiers. We demonstrate the effectiveness of the class of methods both for discovering a latent hierarchy and for improving accuracy in principled simulation settings and three real data applications.
arXiv Detail & Related papers (2021-02-20T05:40:42Z)
Pitfalls of Assessing Extracted Hierarchies for Multi-Class Classification [4.89253144446913]
We identify some common pitfalls that may lead practitioners to make misleading conclusions about their methods. We show how the hierarchy's quality can become irrelevant depending on the experimental setup. Our results confirm that datasets with a high number of classes generally present complex structures in how these classes relate to each other.
arXiv Detail & Related papers (2021-01-26T21:50:57Z)
Joint Learning of Hyperbolic Label Embeddings for Hierarchical Multi-label Classification [9.996804039553858]
We consider the problem of multi-label classification where the labels lie in a hierarchy. We propose a novel formulation for the joint learning and empirically evaluate its efficacy.
arXiv Detail & Related papers (2021-01-13T10:58:54Z)
Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method can improve several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z)
Generating Categories for Sets of Entities [34.32017697099142]
Category systems are central components of knowledge bases, as they provide a hierarchical grouping of semantically related concepts and entities. This paper presents a method of generating categories for sets of entities using neural abstractive summarization models. We develop a test collection based on Wikipedia categories and demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-08-19T13:31:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
