Theoretical Understandings of Product Embedding for E-commerce Machine
Learning
- URL: http://arxiv.org/abs/2102.12029v1
- Date: Wed, 24 Feb 2021 02:29:15 GMT
- Title: Theoretical Understandings of Product Embedding for E-commerce Machine
Learning
- Authors: Da Xu, Chuanwei Ruan, Evren Korpeoglu, Sushant Kumar, Kannan Achan
- Abstract summary: We take an e-commerce-oriented view of product embeddings and develop a complete theoretical account from both the representation learning and the learning theory perspectives.
We prove that product embeddings trained by the widely adopted skip-gram negative sampling algorithm constitute a sufficient dimension reduction with respect to a critical product relatedness measure.
The generalization performance in the downstream machine learning task is controlled by the alignment between the embeddings and the product relatedness measure.
- Score: 18.204325860752768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Product embeddings have been heavily investigated in the past few years,
serving as the cornerstone for a broad range of machine learning applications
in e-commerce. Despite the empirical success of product embeddings, little is
known about how and why they work from a theoretical standpoint. Analogous
results from the natural language processing (NLP) often rely on
domain-specific properties that are not transferable to the e-commerce setting,
and the downstream tasks often focus on different aspects of the embeddings. We
take an e-commerce-oriented view of product embeddings and develop a
complete theoretical account from both the representation learning and the
learning theory perspectives. We prove that product embeddings trained by the
widely adopted skip-gram negative sampling algorithm and its variants constitute
a sufficient dimension reduction with respect to a critical product relatedness
measure. The generalization performance in the downstream machine learning task
is controlled by the alignment between the embeddings and the product
relatedness measure. Following the theoretical discoveries, we conduct
exploratory experiments that support our theoretical insights for the product
embeddings.
Related papers
- A Catalog of Fairness-Aware Practices in Machine Learning Engineering [13.012624574172863]
Machine learning's widespread adoption in decision-making processes raises concerns about fairness.
There remains a gap in understanding and categorizing practices for engineering fairness throughout the machine learning lifecycle.
This paper presents a novel catalog of practices for addressing fairness in machine learning derived from a systematic mapping study.
arXiv Detail & Related papers (2024-08-29T16:28:43Z)
- Coding for Intelligence from the Perspective of Category [66.14012258680992]
Coding targets compressing and reconstructing data, while intelligence targets learning and understanding.
Recent trends demonstrate the potential homogeneity of these two fields.
We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)
- Knowledge Graph Completion Models are Few-shot Learners: An Empirical Study of Relation Labeling in E-commerce with LLMs [16.700089674927348]
Large Language Models (LLMs) have shown surprising results in numerous natural language processing tasks.
This paper investigates their powerful learning capabilities in natural language and effectiveness in predicting relations between product types with limited labeled data.
Our results show that LLMs significantly outperform existing KG completion models in relation labeling for e-commerce KGs and exhibit performance strong enough to replace human labeling.
arXiv Detail & Related papers (2023-05-17T00:08:36Z)
- Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why? [18.192733659176806]
We investigate the use of pretrained embeddings in e-commerce applications.
We find that there is a lack of a thorough understanding of how pre-trained embeddings work.
We establish a principled perspective of pre-trained embeddings via the lens of kernel analysis.
arXiv Detail & Related papers (2023-04-09T23:55:47Z)
- Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior: From Theory to Practice [54.03076395748459]
A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We present a generalization bound for meta-learning, which was first derived by Rothfuss et al.
We provide a theoretical analysis and empirical case study under which conditions and to what extent these guarantees for meta-learning improve upon PAC-Bayesian per-task learning bounds.
arXiv Detail & Related papers (2022-11-14T08:51:04Z)
- A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning [55.048010996144036]
We show that, under a certain noise assumption, the linear spectral feature of the underlying Markov transition operator can be obtained in closed form for free.
We propose Spectral Dynamics Embedding (SPEDE), which breaks the trade-off and completes optimistic exploration for representation learning by exploiting the structure of the noise.
arXiv Detail & Related papers (2021-11-22T19:24:57Z)
- Panoramic Learning with A Standardized Machine Learning Formalism [116.34627789412102]
This paper presents a standardized equation of the learning objective that offers a unifying understanding of diverse ML algorithms.
It also provides guidance for mechanic design of new ML solutions, and serves as a promising vehicle towards panoramic learning with all experiences.
arXiv Detail & Related papers (2021-08-17T17:44:38Z)
- K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce [38.9878151656255]
K-PLUG is a knowledge-injected pre-trained language model based on the encoder-decoder transformer.
We propose five knowledge-aware self-supervised pre-training objectives to formulate the learning of domain-specific knowledge.
arXiv Detail & Related papers (2021-04-14T16:37:31Z)
- Scalable bundling via dense product embeddings [1.933681537640272]
Bundling is the practice of jointly selling two or more products at a discount.
We develop a new machine-learning-driven methodology for designing bundles in a large-scale, cross-category retail setting.
We find that our embedding-based features are strong predictors of bundle success, are robust across product categories, and generalize well to the retailer's entire assortment.
arXiv Detail & Related papers (2020-01-31T23:34:56Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.