DeepCAT: Deep Category Representation for Query Understanding in
E-commerce Search
- URL: http://arxiv.org/abs/2104.11760v1
- Date: Fri, 23 Apr 2021 18:04:44 GMT
- Title: DeepCAT: Deep Category Representation for Query Understanding in
E-commerce Search
- Authors: Ali Ahmadvand, Surya Kallumadi, Faizan Javed, and Eugene Agichtein
- Abstract summary: We propose a deep learning model, DeepCAT, which learns joint word-category representations to enhance the query understanding process.
Our results show that DeepCAT reaches a 10% improvement on minority classes and a 7.1% improvement on tail queries over a state-of-the-art label embedding model.
- Score: 15.041444067591007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mapping a search query to a set of relevant categories in the product
taxonomy is a significant challenge in e-commerce search for two reasons: 1)
training data exhibits a severe class imbalance problem due to biased click
behavior, and 2) queries with little customer feedback (e.g., tail
queries) are not well represented in the training set and cause difficulties
for query understanding. To address these problems, we propose a deep learning
model, DeepCAT, which learns joint word-category representations to enhance the
query understanding process. We believe learning category interactions helps to
improve the performance of category mapping on minority classes and on
tail and torso queries. DeepCAT contains a novel
word-category representation model that trains the category representations
based on word-category co-occurrences in the training set. The category
representation is then leveraged to introduce a new loss function that estimates
the category-category co-occurrences for refining joint word-category
embeddings. To demonstrate our model's effectiveness on minority
categories and tail queries, we conduct two sets of experiments. The
results show that DeepCAT reaches a 10% improvement on minority classes
and a 7.1% improvement on tail queries over a state-of-the-art label
embedding model. Our findings suggest a promising direction for improving
e-commerce search by semantic modeling of taxonomy hierarchies.
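The abstract refers to two kinds of statistics gathered from the training set: word-category co-occurrences and category-category co-occurrences. The sketch below only illustrates how such counts could be collected from labeled queries. It is not the DeepCAT model or its loss function. The function name `cooccurrence_counts`, the whitespace tokenization, and the toy queries and category names are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(examples):
    """Count word-category and category-category co-occurrences.

    `examples` is a list of (query, categories) pairs, where `categories`
    is the set of taxonomy labels attached to the query. This is an
    illustrative helper, not part of the paper.
    """
    word_cat = Counter()
    cat_cat = Counter()
    for query, categories in examples:
        # Assumption: simple lowercase whitespace tokenization.
        words = set(query.lower().split())
        cats = sorted(set(categories))
        # Every word in the query co-occurs with every label of the query.
        for word in words:
            for cat in cats:
                word_cat[(word, cat)] += 1
        # Every unordered pair of labels on the same query co-occurs once.
        for cat_a, cat_b in combinations(cats, 2):
            cat_cat[(cat_a, cat_b)] += 1
    return word_cat, cat_cat


# Hypothetical toy data, not taken from the paper.
toy_examples = [
    ("cordless drill", {"Tools", "Power Tools"}),
    ("drill bits", {"Tools", "Accessories"}),
    ("cordless vacuum", {"Appliances"}),
]
word_cat, cat_cat = cooccurrence_counts(toy_examples)
```

In this toy setting, a rarely seen label such as "Accessories" still shares a count with the frequent label "Tools", which gives a rough intuition for why category interactions might help minority classes.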
Related papers
- A Semi-supervised Multi-channel Graph Convolutional Network for Query Classification in E-commerce [10.870790183380517]
We propose a novel Semi-supervised Multi-channel Graph Convolutional Network (SMGCN) to address the above problems.
SMGCN extends category information and enhances the posterior label by utilizing the similarity score between the query and categories.
arXiv Detail & Related papers (2024-08-04T04:52:21Z)
- Category-Extensible Out-of-Distribution Detection via Hierarchical Context Descriptions [35.20091752343433]
This work introduces two hierarchical contexts, namely perceptual context and spurious context, to carefully describe the precise category boundary.
The two contexts hierarchically construct the precise description for a certain category, which is first roughly classifying a sample to the predicted category.
The precise descriptions for those categories within the vision-language framework present a novel application: CATegory-EXtensible OOD detection (CATEX).
arXiv Detail & Related papers (2024-07-23T12:53:38Z)
- Query-oriented Data Augmentation for Session Search [71.84678750612754]
We propose query-oriented data augmentation to enrich search logs and empower the modeling.
We generate supplemental training pairs by altering the most important part of a search context.
We develop several strategies to alter the current query, resulting in new training data with varying degrees of difficulty.
arXiv Detail & Related papers (2024-07-04T08:08:33Z)
- Hierarchical Query Classification in E-commerce Search [38.67034103433015]
E-commerce platforms typically store and structure product information and search data in a hierarchy.
Efficiently categorizing user search queries into a similar hierarchical structure is paramount in enhancing user experience on e-commerce platforms as well as news curation and academic research.
The inherent complexity of hierarchical query classification is compounded by two primary challenges: (1) the pronounced class imbalance that skews towards dominant categories, and (2) the inherent brevity and ambiguity of search queries that hinder accurate classification.
arXiv Detail & Related papers (2024-03-09T21:55:55Z)
- List-aware Reranking-Truncation Joint Model for Search and Retrieval-augmented Generation [80.12531449946655]
We propose a Reranking-Truncation joint model (GenRT) that can perform the two tasks concurrently.
GenRT integrates reranking and truncation via generative paradigm based on encoder-decoder architecture.
Our method achieves SOTA performance on both reranking and truncation tasks for web search and retrieval-augmented LLMs.
arXiv Detail & Related papers (2024-02-05T06:52:53Z)
- Investigating the Limitation of CLIP Models: The Worst-Performing Categories [53.360239882501325]
Contrastive Language-Image Pre-training (CLIP) provides a foundation model by integrating natural language into visual concepts.
It is usually expected that satisfactory overall accuracy can be achieved across numerous domains through well-designed textual prompts.
However, we found that their performance in the worst categories is significantly inferior to the overall performance.
arXiv Detail & Related papers (2023-10-05T05:37:33Z)
- Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation [119.51445225693382]
Few-shot semantic segmentation aims to segment the target objects in query under the condition of a few annotated support images.
We introduce an intermediate prototype for mining both deterministic category information from the support and adaptive category knowledge from the query.
In each IPMT layer, we propagate the object information in both support and query features to the prototype and then use it to activate the query feature map.
arXiv Detail & Related papers (2022-10-13T06:45:07Z)
- Query-Guided Networks for Few-shot Fine-grained Classification and Person Search [93.80556485668731]
Few-shot fine-grained classification and person search appear as distinct tasks and literature has treated them separately.
We propose a novel unified Query-Guided Network (QGN) applicable to both tasks.
QGN improves on a few recent few-shot fine-grained datasets, outperforming other techniques on CUB by a large margin.
arXiv Detail & Related papers (2022-09-21T10:25:32Z)
- TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy [0.0]
Online educational platforms organize academic questions based on a hierarchical learning taxonomy (subject-chapter-topic).
This paper formulates the problem as a similarity-based retrieval task where we optimize the semantic relatedness between the taxonomy and the questions.
We demonstrate that our method helps to handle the unseen labels and hence can be used for taxonomy tagging in the wild.
arXiv Detail & Related papers (2021-07-03T11:50:55Z)
- APRF-Net: Attentive Pseudo-Relevance Feedback Network for Query Categorization [12.634704014206294]
We propose a novel deep neural model named Attentive Pseudo-Relevance Feedback Network (APRF-Net) to enhance the representation of rare queries for query categorization.
Our results show that APRF-Net significantly improves query categorization by 5.9% on the F1@1 score over the baselines, which increases to an 8.2% improvement for rare queries.
arXiv Detail & Related papers (2021-04-23T02:34:08Z)
- Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)