AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of
Types
- URL: http://arxiv.org/abs/2006.13473v1
- Date: Wed, 24 Jun 2020 04:35:17 GMT
- Title: AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of
Types
- Authors: Xin Luna Dong, Xiang He, Andrey Kan, Xian Li, Yan Liang, Jun Ma, Yifan
Ethan Xu, Chenwei Zhang, Tong Zhao, Gabriel Blanco Saldana, Saurabh
Deshpande, Alexandre Michetti Manduca, Jay Ren, Surender Pal Singh, Fan Xiao,
Haw-Shiuan Chang, Giannis Karamanolakis, Yuning Mao, Yaqing Wang, Christos
Faloutsos, Andrew McCallum, Jiawei Han
- Abstract summary: We describe AutoKnow, our automatic (self-driving) system that addresses the challenges of organizing information about products.
The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery.
AutoKnow has been operational in collecting product knowledge for over 11K product types.
- Score: 127.99385433323859
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Can one build a knowledge graph (KG) for all products in the world? Knowledge
graphs have firmly established themselves as valuable sources of information
for search and question answering, and it is natural to wonder if a KG can
contain information about products offered at online retail sites. There have
been several successful examples of generic KGs, but organizing information
about products poses many additional challenges, including sparsity and noise
of structured data for products, complexity of the domain with millions of
product types and thousands of attributes, heterogeneity across a large number of
categories, as well as a large and constantly growing number of products. We
describe AutoKnow, our automatic (self-driving) system that addresses these
challenges. The system includes a suite of novel techniques for taxonomy
construction, product property identification, knowledge extraction, anomaly
detection, and synonym discovery. AutoKnow is (a) automatic, requiring little
human intervention, (b) multi-scalable, scalable in multiple dimensions (many
domains, many products, and many attributes), and (c) integrative, exploiting
rich customer behavior logs. AutoKnow has been operational in collecting
product knowledge for over 11K product types.
Related papers
- Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph
Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z)
- Automated Extraction of Fine-Grained Standardized Product Information
from Unstructured Multilingual Web Data [66.21317300595483]
We show how recent advances in machine learning, combined with a recently published multilingual data set, enable robust product attribute extraction.
Our models can reliably predict product attributes across online shops, languages, or both.
arXiv Detail & Related papers (2023-02-23T16:26:11Z)
- Learning by Asking Questions for Knowledge-based Novel Object
Recognition [64.55573343404572]
In real-world object recognition, there are numerous object classes to be recognized. Conventional image recognition based on supervised learning can only recognize object classes that exist in the training data, and thus has limited applicability in the real world.
Inspired by this, we study a framework for acquiring external knowledge through question generation that would help the model instantly recognize novel objects.
Our pipeline consists of two components: the Object-based object recognition, and the Question Generator, which generates knowledge-aware questions to acquire novel knowledge.
arXiv Detail & Related papers (2022-10-12T02:51:58Z)
- A Survey on Knowledge Graph-based Methods for Automated Driving [0.0]
Knowledge graphs (KG) have gained significant attention from both industry and academia for applications that benefit by exploiting structured, dynamic, and relational data.
We discuss current research challenges and propose promising future research directions for KG-based solutions for automated driving.
arXiv Detail & Related papers (2022-09-30T15:47:19Z)
- eProduct: A Million-Scale Visual Search Benchmark to Address Product
Recognition Challenges [8.204924070199866]
eProduct is a benchmark dataset for training and evaluation on various visual search solutions in a real-world setting.
We present eProduct as a training set and an evaluation set, where the training set contains 1.3M+ listing images with titles and hierarchical category labels, for model development.
We will present eProduct's construction steps, provide analysis about its diversity and cover the performance of baseline models trained on it.
arXiv Detail & Related papers (2021-07-13T05:28:34Z)
- Billion-scale Pre-trained E-commerce Product Knowledge Graph Model [13.74839302948699]
The paper proposes a Pre-trained Knowledge Graph Model (PKGM) for an e-commerce product knowledge graph.
PKGM provides item knowledge services in a uniform way for embedding-based models without accessing triple data in the knowledge graph.
We test PKGM in three knowledge-related tasks including item classification, same item identification, and recommendation.
arXiv Detail & Related papers (2021-05-02T04:28:22Z)
- Exploiting Knowledge Graphs for Facilitating Product/Service Discovery [1.2691047660244332]
This work presents a cost-effective solution for e-commerce on the Data Web by employing an unsupervised approach for data classification.
The proposed architecture describes available products in web language OWL and stores them in a triple store.
User input specifications for certain products are matched against the available product categories to generate a knowledge graph.
arXiv Detail & Related papers (2020-10-11T10:22:10Z)
- Products-10K: A Large-scale Product Recognition Dataset [18.506656670737407]
In this paper, we construct a human-labeled product image dataset named "Products-10K".
The dataset contains 10,000 fine-grained SKU-level products frequently bought by online customers on JD.com.
Based on our new database, we also introduced several useful tips and tricks for fine-grained product recognition.
arXiv Detail & Related papers (2020-08-24T16:33:37Z)
- Automatic Validation of Textual Attribute Values in E-commerce Catalog
by Learning with Limited Labeled Data [61.789797281676606]
We propose a novel meta-learning latent variable approach, called MetaBridge.
It can learn transferable knowledge from a subset of categories with limited labeled data.
It can capture the uncertainty of never-seen categories with unlabeled data.
arXiv Detail & Related papers (2020-06-15T21:31:05Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)