Active Learning for Product Type Ontology Enhancement in E-commerce
- URL: http://arxiv.org/abs/2009.09143v2
- Date: Thu, 11 Mar 2021 22:41:09 GMT
- Title: Active Learning for Product Type Ontology Enhancement in E-commerce
- Authors: Yun Zhu, Sayyed M. Zahiri, Jiaqi Wang, Han-Yu Chen, Faizan Javed
- Abstract summary: We propose an active learning framework that efficiently utilizes domain experts' knowledge for PT discovery.
We also show the quality and coverage of the resulting PTs in the experiment results.
- Score: 16.170442845801183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entity-based semantic search has been widely adopted in modern search engines
to improve search accuracy by understanding users' intent. In e-commerce, an
accurate and complete product type (PT) ontology is essential for recognizing
product entities in queries and retrieving relevant products from catalog.
However, finding product types (PTs) to construct such an ontology is usually
expensive due to the considerable amount of human efforts it may involve. In
this work, we propose an active learning framework that efficiently utilizes
domain experts' knowledge for PT discovery. We also show the quality and
coverage of the resulting PTs in the experiment results.
Related papers
- Knowledge Navigator: LLM-guided Browsing Framework for Exploratory Search in Scientific Literature [48.572336666741194]
We present Knowledge Navigator, a system designed to enhance exploratory search abilities.
It organizes retrieved documents into a navigable, two-level hierarchy of named and descriptive scientific topics and subtopics.
(2024-08-28T14:48:37Z)
- STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STaRK, a large-scale Semi-structure retrieval benchmark on Textual and Relational Knowledge Bases.
Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine.
We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
(2024-04-19T22:54:54Z)
- Overview of the TREC 2023 Product Product Search Track [70.56592126043546]
This is the first year of the TREC Product search track.
The focus was the creation of a reusable collection.
We leverage the new product search corpus, which includes contextual metadata.
(2023-11-14T02:25:18Z)
- DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research [96.10765714077208]
Traditional keyword-based search engines fall short in assisting users who may not be familiar with specific terminologies.
We present a knowledge graph-based paper search engine for biomedical research to enhance the user experience.
The system, dubbed DiscoverPath, employs Named Entity Recognition (NER) and part-of-speech (POS) tagging to extract terminologies and relationships from article abstracts to create a KG.
(2023-09-04T20:52:33Z)
- Product Information Extraction using ChatGPT [69.12244027050454]
This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions.
Our results show that ChatGPT achieves a performance similar to a pre-trained language model but requires much smaller amounts of training data and computation for fine-tuning.
(2023-06-23T09:30:01Z)
- Intent-based Product Collections for E-commerce using Pretrained Language Models [8.847005669899703]
We use a pretrained language model (PLM) that leverages textual attributes of web-scale products to make intent-based product collections.
Our model significantly outperforms the search-based baseline model for intent-based product matching in offline evaluations.
Online experimental results on our e-commerce platform show that the PLM-based method can construct collections of products with increased CTR, CVR, and order-diversity compared to expert-crafted collections.
(2021-10-15T17:52:42Z)
- Query2Prod2Vec Grounded Word Embeddings for eCommerce [4.137464623395377]
We present a model that grounds lexical representations for product search in product embeddings.
We leverage shopping sessions to learn the underlying space and use merchandising annotations to build lexical analogies for evaluation.
(2021-04-02T21:32:43Z)
- Theoretical Understandings of Product Embedding for E-commerce Machine Learning [18.204325860752768]
We take an e-commerce-oriented view of the product embeddings and reveal a complete theoretical view from both the representation learning and the learning theory perspective.
We prove that product embeddings trained by the widely-adopted skip-gram negative sampling algorithm are sufficient dimension reduction regarding a critical product relatedness measure.
The generalization performance in the downstream machine learning task is controlled by the alignment between the embeddings and the product relatedness measure.
(2021-02-24T02:29:15Z)
- Exploiting Knowledge Graphs for Facilitating Product/Service Discovery [1.2691047660244332]
This work presents a cost-effective solution for e-commerce on the Data Web by employing an unsupervised approach for data classification.
The proposed architecture describes available products in web language OWL and stores them in a triple store.
User input specifications for certain products are matched against the available product categories to generate a knowledge graph.
(2020-10-11T10:22:10Z)
- E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce [63.333860695727424]
E-commerce tasks require accurate understanding of domain phrases, whereas such fine-grained phrase-level knowledge is not explicitly modeled by BERT's training objective.
To tackle the problem, we propose a unified pre-training framework, namely, E-BERT.
Specifically, to preserve phrase-level knowledge, we introduce Adaptive Hybrid Masking, which allows the model to adaptively switch from learning preliminary word knowledge to learning complex phrases.
To utilize product-level knowledge, we introduce Neighbor Product Reconstruction, which trains E-BERT to predict a product's associated neighbors with a denoising cross attention layer.
(2020-09-07T00:15:36Z)
- Modeling Product Search Relevance in e-Commerce [7.139647051098728]
We propose a robust way of predicting relevance scores given a search query and a product.
We compare conventional information retrieval models such as BM25 and Indri with deep learning models such as word2vec, sentence2vec and paragraph2vec.
(2020-01-14T21:17:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.