Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?
- URL: http://arxiv.org/abs/2304.04330v1
- Date: Sun, 9 Apr 2023 23:55:47 GMT
- Title: Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why?
- Authors: Da Xu, Bo Yang
- Abstract summary: We investigate the use of pretrained embeddings in e-commerce applications.
We find that there is a lack of a thorough understanding of how pretrained embeddings work.
We establish a principled perspective on pretrained embeddings via the lens of kernel analysis.
- Score: 18.192733659176806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of pretrained embeddings has become widespread in modern e-commerce
machine learning (ML) systems. In practice, however, we have encountered
several key issues when using pretrained embeddings in a real-world production
system, many of which cannot be fully explained by current knowledge.
Unfortunately, we find that there is a lack of a thorough understanding of how
pretrained embeddings work, especially of their intrinsic properties and their
interactions with downstream tasks. Consequently, it becomes challenging to
make interactive and scalable decisions regarding the use of pretrained
embeddings in practice.
Our investigation leads to two significant discoveries about using pretrained
embeddings in e-commerce applications. First, the design of the
pretraining and downstream models, particularly how they encode and decode
information via embedding vectors, can have a profound impact. Second, we
establish a principled perspective on pretrained embeddings via the lens of
kernel analysis, which can be used to evaluate their predictability
interactively and scalably. These findings help address the practical
challenges we faced and offer valuable guidance for the successful adoption of
pretrained embeddings in real-world production. Our conclusions are backed by
solid theoretical reasoning, benchmark experiments, and online testing.
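As a rough illustration of this kernel perspective, a frozen embedding table induces a linear kernel over items, and one can ask how predictable a downstream target is from that kernel alone, for example via kernel ridge regression on a held-out split. The sketch below is a minimal stand-in, not the paper's actual procedure: the synthetic data, the linear-kernel choice, and all function names are assumptions made for illustration.

```python
import numpy as np

def embedding_kernel(E):
    """Linear kernel K = E E^T induced by a frozen embedding matrix E (n_items x d)."""
    return E @ E.T

def kernel_ridge_predictability(K, y, lam=1e-2, n_train=None):
    """Fit kernel ridge regression on a train split and score the rest.

    A crude proxy for how well the downstream target y can be decoded
    from the pretrained embedding geometry alone.
    """
    n = K.shape[0]
    n_train = n_train or n // 2
    K_tr = K[:n_train, :n_train]
    alpha = np.linalg.solve(K_tr + lam * np.eye(n_train), y[:n_train])
    y_hat = K[n_train:, :n_train] @ alpha
    resid = y[n_train:] - y_hat
    return 1.0 - resid.var() / y[n_train:].var()  # R^2 on held-out items

# Synthetic check: targets aligned with the embedding space are predictable.
rng = np.random.default_rng(0)
E = rng.normal(size=(200, 16))            # stand-in for pretrained item embeddings
w = rng.normal(size=16)
y_aligned = E @ w + 0.1 * rng.normal(size=200)
y_random = rng.normal(size=200)
K = embedding_kernel(E)
print(kernel_ridge_predictability(K, y_aligned))  # close to 1
print(kernel_ridge_predictability(K, y_random))   # close to 0, possibly below
```

On this toy data, targets aligned with the embedding geometry score near 1 while unrelated targets do not, which is the kind of interactive, scalable predictability check the abstract alludes to.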
Related papers
- What Really Matters for Learning-based LiDAR-Camera Calibration [50.2608502974106]
This paper revisits the development of learning-based LiDAR-Camera calibration.
We identify critical limitations of regression-based methods when combined with the widely used data generation pipeline.
We also investigate how the input data format and preprocessing operations impact network performance.
arXiv Detail & Related papers (2025-01-28T14:12:32Z)
- Learner Attentiveness and Engagement Analysis in Online Education Using Computer Vision [3.449808359602251]
This research presents a computer vision-based approach to analyze and quantify learners' attentiveness, engagement, and other affective states within online learning scenarios.
A machine learning-based algorithm, built on top of the classification model, outputs a comprehensive attentiveness index for the learners.
An end-to-end pipeline is proposed through which learners' live video feed is processed, providing detailed attentiveness analytics of the learners to the instructors.
arXiv Detail & Related papers (2024-11-30T10:54:08Z)
- Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Tensorflow Pretrained Models [17.372501468675303]
The study covers modern architectures, including ResNet, MobileNet, and EfficientNet.
A comparison of linear probing and model fine-tuning is presented, supplemented by visualizations using techniques like PCA, t-SNE, and UMAP.
arXiv Detail & Related papers (2024-09-20T15:07:14Z)
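In Keras terms, the linear-probing-versus-fine-tuning comparison in the entry above reduces to whether the pretrained backbone is frozen. A minimal sketch follows, assuming an ImageNet-pretrained MobileNetV2 and a placeholder 10-class head; the dataset, head, and hyperparameters are illustrative, not the study's.

```python
import tensorflow as tf

def build(base_trainable: bool, num_classes: int = 10):
    """Linear probe when base_trainable=False; full fine-tuning when True."""
    base = tf.keras.applications.MobileNetV2(
        include_top=False, pooling="avg", weights="imagenet")
    base.trainable = base_trainable
    model = tf.keras.Sequential([base, tf.keras.layers.Dense(num_classes)])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4 if base_trainable else 1e-3),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"])
    return model

probe = build(base_trainable=False)    # trains only the Dense head
finetune = build(base_trainable=True)  # updates the whole backbone
```

Freezing the backbone makes the head a linear probe of the pretrained features; unfreezing recovers full fine-tuning, typically with a smaller learning rate.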
- Towards Automated Knowledge Integration From Human-Interpretable Representations [55.2480439325792]
We introduce and motivate theoretically the principles of informed meta-learning, enabling automated and controllable inductive-bias selection.
We empirically demonstrate the potential benefits and limitations of informed meta-learning in improving data efficiency and generalisation.
arXiv Detail & Related papers (2024-02-25T15:08:37Z)
- Machine Unlearning of Pre-trained Large Language Models [17.40601262379265]
This study investigates the concept of the 'right to be forgotten' within the context of large language models (LLMs).
We explore machine unlearning as a pivotal solution, with a focus on pre-trained models.
arXiv Detail & Related papers (2024-02-23T07:43:26Z)
- An Emulator for Fine-Tuning Large Language Models using Small Language Models [91.02498576056057]
We introduce emulated fine-tuning (EFT), a principled and practical method for sampling from a distribution that approximates the result of pre-training and fine-tuning at different scales.
We show that EFT enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training.
Finally, a special case of emulated fine-tuning, which we call LM up-scaling, avoids resource-intensive fine-tuning of large pre-trained models by ensembling them with small fine-tuned models.
arXiv Detail & Related papers (2023-10-19T17:57:16Z)
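As we read the abstract above, the up-scaling case of emulated fine-tuning amounts to logit arithmetic at decoding time: take a large base model's logits and add the behavioral delta between a small fine-tuned model and its small base. A minimal numpy sketch follows; the toy vocabulary, model names, and hand-picked delta are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def eft_upscale_logits(large_base, small_ft, small_base):
    """Emulate fine-tuning the large model: large-scale pretraining
    knowledge plus the small model's fine-tuning behavior delta."""
    return large_base + (small_ft - small_base)

# Next-token logits over a toy vocabulary (shape: [vocab_size]).
vocab = 5
rng = np.random.default_rng(1)
large_base = rng.normal(size=vocab)
small_base = rng.normal(size=vocab)
small_ft = small_base + np.array([2.0, -1.0, 0.0, 0.0, -1.0])  # fine-tuning delta
p = softmax(eft_upscale_logits(large_base, small_ft, small_base))
print(p)  # large model's distribution, shifted by the small model's delta
```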
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [65.57123249246358]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- A critical look at the current train/test split in machine learning [6.475859946760842]
We take a closer look at the split protocol itself and point out its weaknesses and limitations.
In many real-world problems, assumption (ii) behind the split protocol simply does not hold.
We propose a new adaptive active learning architecture (AAL) which involves an adaptation policy.
arXiv Detail & Related papers (2021-06-08T17:07:20Z)
- Adversarial Training is Not Ready for Robot Learning [55.493354071227174]
Adversarial training is an effective method to train deep learning models that are resilient to norm-bounded perturbations.
We show theoretically and experimentally that neural controllers obtained via adversarial training are subject to three types of defects.
Our results suggest that adversarial training is not yet ready for robot learning.
arXiv Detail & Related papers (2021-03-15T07:51:31Z)
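For context, the "adversarial training" in the entry above refers to training on worst-case norm-bounded perturbations. Below is a generic PGD-style sketch in PyTorch; it shows only the standard formulation, not the paper's robot-learning controllers or the defects it identifies, and all names are illustrative.

```python
import torch

def pgd_attack(model, loss_fn, x, y, eps=0.03, alpha=0.01, steps=10):
    """Find an L-inf norm-bounded perturbation that maximizes the loss."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()  # gradient ascent step
            delta.clamp_(-eps, eps)             # project back into the eps-ball
        delta.grad.zero_()
    return delta.detach()

def adversarial_training_step(model, loss_fn, optimizer, x, y):
    """One outer step: train on the worst-case perturbed inputs."""
    delta = pgd_attack(model, loss_fn, x, y)
    optimizer.zero_grad()
    loss = loss_fn(model(x + delta), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```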
- Theoretical Understandings of Product Embedding for E-commerce Machine Learning [18.204325860752768]
We take an e-commerce-oriented view of product embeddings and reveal a complete theoretical picture from both the representation learning and the learning theory perspectives.
We prove that product embeddings trained by the widely adopted skip-gram negative sampling algorithm constitute a sufficient dimension reduction with respect to a critical product relatedness measure.
The generalization performance in the downstream machine learning task is controlled by the alignment between the embeddings and the product relatedness measure.
arXiv Detail & Related papers (2021-02-24T02:29:15Z)
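For reference, the skip-gram negative sampling objective analyzed in the entry above can be written as a small PyTorch loss over a shared product-embedding table. The sketch below is only the training objective; the sufficient-dimension-reduction and generalization claims are proved in the paper, not demonstrated here, and the uniform negative sampling is a placeholder (in practice negatives are usually drawn from a unigram distribution).

```python
import torch
import torch.nn.functional as F

def sgns_loss(center_emb, context_emb, neg_emb):
    """Skip-gram negative sampling: pull co-occurring (center, context)
    product pairs together, push sampled negatives apart.

    center_emb:  [batch, d]
    context_emb: [batch, d]
    neg_emb:     [batch, k, d]  (k negatives per pair)
    """
    pos = F.logsigmoid((center_emb * context_emb).sum(-1))                        # [batch]
    neg = F.logsigmoid(-(neg_emb @ center_emb.unsqueeze(-1)).squeeze(-1)).sum(-1) # [batch]
    return -(pos + neg).mean()

# Toy usage with a shared product-embedding table.
n_items, d, k = 1000, 32, 5
table = torch.nn.Embedding(n_items, d)
center = table(torch.randint(n_items, (64,)))
context = table(torch.randint(n_items, (64,)))
negatives = table(torch.randint(n_items, (64, k)))
loss = sgns_loss(center, context, negatives)
loss.backward()  # gradients flow into the embedding table
```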
- Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers [54.417299589288184]
We investigate models for complementing the distributional knowledge of BERT with conceptual knowledge from ConceptNet and its corresponding Open Mind Common Sense (OMCS) corpus.
Our adapter-based models substantially outperform BERT on inference tasks that require the type of conceptual knowledge explicitly present in ConceptNet and OMCS.
arXiv Detail & Related papers (2020-05-24T15:49:57Z)
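A bottleneck adapter of the kind used for such knowledge injection is a small residual MLP inserted into an otherwise frozen transformer layer. The sketch below is a generic Houlsby-style module with illustrative dimensions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: only these parameters are trained, so the new
    (e.g. ConceptNet/OMCS-derived) knowledge lives in the adapter weights."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        nn.init.zeros_(self.up.weight)  # start as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden_states):
        # Residual connection keeps the frozen model's behavior as the default.
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

adapter = Adapter()
h = torch.randn(2, 16, 768)  # [batch, seq_len, hidden]
print(adapter(h).shape)      # torch.Size([2, 16, 768])
```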
This list is automatically generated from the titles and abstracts of the papers on this site.