Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?
- URL: http://arxiv.org/abs/2304.04330v1
- Date: Sun, 9 Apr 2023 23:55:47 GMT
- Title: Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?
- Authors: Da Xu, Bo Yang
- Abstract summary: We investigate the use of pretrained embeddings in e-commerce applications.
We find that there is a lack of a thorough understanding of how pre-trained embeddings work.
We establish a principled perspective of pre-trained embeddings via the lens of kernel analysis.
- Score: 18.192733659176806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of pretrained embeddings has become widespread in modern e-commerce
machine learning (ML) systems. In practice, however, we have encountered
several key issues when using pretrained embeddings in a real-world production
system, many of which cannot be fully explained by current knowledge.
Unfortunately, we find that there is a lack of a thorough understanding of how
pre-trained embeddings work, especially their intrinsic properties and
interactions with downstream tasks. Consequently, it becomes challenging to
make interactive and scalable decisions regarding the use of pre-trained
embeddings in practice.
Our investigation leads to two significant discoveries about using pretrained
embeddings in e-commerce applications. Firstly, we find that the design of the
pretraining and downstream models, particularly how they encode and decode
information via embedding vectors, can have a profound impact. Secondly, we
establish a principled perspective of pre-trained embeddings via the lens of
kernel analysis, which can be used to evaluate their predictability,
interactively and scalably. These findings help to address the practical
challenges we faced and offer valuable guidance for successful adoption of
pretrained embeddings in real-world production. Our conclusions are backed by
solid theoretical reasoning, benchmark experiments, and online testing.
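The kernel-analysis perspective can be made concrete with a small probe. The sketch below is a minimal illustration, not the authors' method: it assumes a kernel-target-alignment style check on a labeled downstream sample, and the file names, shapes, and helper functions are hypothetical.

```python
# Hypothetical sketch: estimating how predictable a downstream label is from
# pretrained embeddings, viewed through their induced kernel (Gram) matrix.
# This is an assumed stand-in for the paper's kernel analysis, not its code.
import numpy as np

def linear_kernel(emb: np.ndarray) -> np.ndarray:
    """Gram matrix of L2-normalized embedding vectors (cosine kernel)."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return emb @ emb.T

def kernel_target_alignment(K: np.ndarray, y: np.ndarray) -> float:
    """Centered alignment between the embedding kernel and the label kernel.
    Scores near 1 suggest the labels are easy to predict from the embeddings;
    scores near 0 suggest the embeddings carry little usable signal."""
    n = len(y)
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    Ky = np.equal.outer(y, y).astype(float)       # label-agreement kernel
    Kc, Kyc = H @ K @ H, H @ Ky @ H
    return float(np.sum(Kc * Kyc) /
                 (np.linalg.norm(Kc) * np.linalg.norm(Kyc) + 1e-12))

# Usage on a small labeled sample of items (hypothetical files):
# emb = np.load("item_embeddings.npy")   # pretrained vectors, shape (n, d)
# y = np.load("item_labels.npy")         # downstream labels, shape (n,)
# print(kernel_target_alignment(linear_kernel(emb), y))
```

Because such a score can be computed on a subsample without training a downstream model, it could serve as a cheap, interactive screen for candidate embeddings before committing to fine-tuning or deployment; the paper's actual evaluation procedure may differ.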
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Tensorflow Pretrained Models [17.372501468675303]
The book covers practical implementations of modern architectures like ResNet, MobileNet, and EfficientNet.
It compares linear probing and model fine-tuning, offering visualizations using techniques such as PCA, t-SNE, and UMAP.
By blending theoretical insights with hands-on practice, this book equips readers with the knowledge to confidently tackle various deep learning challenges.
arXiv Detail & Related papers (2024-09-20T15:07:14Z)
- Informed Meta-Learning [55.2480439325792]
Meta-learning and informed ML stand out as two approaches for incorporating prior knowledge into ML pipelines.
We formalise a hybrid paradigm, informed meta-learning, facilitating the incorporation of priors from unstructured knowledge representations.
We demonstrate the potential benefits of informed meta-learning in improving data efficiency, robustness to observational noise and task distribution shifts.
arXiv Detail & Related papers (2024-02-25T15:08:37Z)
- Machine Unlearning of Pre-trained Large Language Models [17.40601262379265]
This study investigates the concept of the 'right to be forgotten' within the context of large language models (LLMs).
We explore machine unlearning as a pivotal solution, with a focus on pre-trained models.
arXiv Detail & Related papers (2024-02-23T07:43:26Z)
- An Emulator for Fine-Tuning Large Language Models using Small Language Models [91.02498576056057]
We introduce emulated fine-tuning (EFT), a principled and practical method for sampling from a distribution that approximates the result of pre-training and fine-tuning at different scales.
We show that EFT enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training.
Finally, a special case of emulated fine-tuning, which we call LM up-scaling, avoids resource-intensive fine-tuning of large pre-trained models by ensembling them with small fine-tuned models.
arXiv Detail & Related papers (2023-10-19T17:57:16Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- A critical look at the current train/test split in machine learning [6.475859946760842]
We take a closer look at the split protocol itself and point out its weaknesses and limitations.
In many real-world problems, we must acknowledge that there are numerous situations where assumption (ii) does not hold.
We propose a new adaptive active learning architecture (AAL) which involves an adaptation policy.
arXiv Detail & Related papers (2021-06-08T17:07:20Z)
- Adversarial Training is Not Ready for Robot Learning [55.493354071227174]
Adversarial training is an effective method to train deep learning models that are resilient to norm-bounded perturbations.
We show theoretically and experimentally that neural controllers obtained via adversarial training suffer from three types of defects.
Our results suggest that adversarial training is not yet ready for robot learning.
arXiv Detail & Related papers (2021-03-15T07:51:31Z)
- Theoretical Understandings of Product Embedding for E-commerce Machine Learning [18.204325860752768]
We take an e-commerce-oriented view of the product embeddings and reveal a complete theoretical view from both the representation learning and the learning theory perspective.
We prove that product embeddings trained by the widely-adopted skip-gram negative sampling algorithm perform sufficient dimension reduction with respect to a critical product relatedness measure.
The generalization performance in the downstream machine learning task is controlled by the alignment between the embeddings and the product relatedness measure.
arXiv Detail & Related papers (2021-02-24T02:29:15Z)
- Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers [54.417299589288184]
We investigate models for complementing the distributional knowledge of BERT with conceptual knowledge from ConceptNet and its corresponding Open Mind Common Sense (OMCS) corpus.
Our adapter-based models substantially outperform BERT on inference tasks that require the type of conceptual knowledge explicitly present in ConceptNet and OMCS.
arXiv Detail & Related papers (2020-05-24T15:49:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.