An End-to-End Solution for Named Entity Recognition in eCommerce Search
- URL: http://arxiv.org/abs/2012.07553v1
- Date: Fri, 11 Dec 2020 04:58:13 GMT
- Title: An End-to-End Solution for Named Entity Recognition in eCommerce Search
- Authors: Xiang Cheng, Mitchell Bowden, Bhushan Ramesh Bhange, Priyanka Goyal,
Thomas Packer, Faizan Javed
- Abstract summary: Named entity recognition (NER) is a critical step in modern search query understanding.
Recent research shows promising results on shared benchmark NER tasks using deep learning methods.
This paper demonstrates an end-to-end solution to address these challenges.
- Score: 7.240345005177374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named entity recognition (NER) is a critical step in modern search query
understanding. In the domain of eCommerce, identifying the key entities, such
as brand and product type, can help a search engine retrieve relevant products
and therefore offer an engaging shopping experience. Recent research shows
promising results on shared benchmark NER tasks using deep learning methods,
but there are still unique challenges in the industry regarding domain
knowledge, training data, and model production. This paper demonstrates an
end-to-end solution to address these challenges. The core of our solution is a
novel model training framework "TripleLearn" which iteratively learns from
three separate training datasets, instead of one training set as is
traditionally done. Using this approach, the best model lifts the F1 score from
69.5 to 93.3 on the holdout test data. In our offline experiments, TripleLearn
improved the model performance compared to traditional training approaches
which use a single set of training data. Moreover, in the online A/B test, we
see significant improvements in user engagement and revenue conversion. The
model has been live on homedepot.com for more than 9 months, boosting search
conversions and revenue. Beyond our application, this TripleLearn framework, as
well as the end-to-end process, is model-independent and problem-independent,
so it can be generalized to more industrial applications, especially to the
eCommerce industry which has similar data foundations and problems.
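The abstract states only that TripleLearn iteratively learns from three separate training datasets rather than a single one; it does not specify the model, the iteration order, or the stopping rule. The following is a minimal hypothetical sketch of what such an iterative multi-dataset loop could look like, using an illustrative frequency-lexicon tagger as a stand-in model. The names (`triple_learn`, `train_round`, `predict`), the round-robin rotation over datasets, and the toy model are all assumptions for illustration, not the authors' actual algorithm.

```python
# Illustrative sketch of an iterative three-dataset training loop in the
# spirit of TripleLearn. The paper's abstract gives no implementation
# details; the lexicon model and rotation order below are assumptions.
from collections import Counter, defaultdict

def train_round(model, dataset):
    """Update per-token label counts from one labeled dataset."""
    for token, label in dataset:
        model[token][label] += 1
    return model

def predict(model, token):
    """Return the most frequent label seen for a token, or 'O' (outside)."""
    counts = model.get(token)
    return counts.most_common(1)[0][0] if counts else "O"

def triple_learn(datasets, rounds=3):
    """Iteratively refine one model by cycling over the three datasets."""
    model = defaultdict(Counter)
    for _ in range(rounds):
        for dataset in datasets:
            train_round(model, dataset)
    return model

# Toy usage with three hypothetical eCommerce NER datasets of
# (token, entity-label) pairs:
d1 = [("dewalt", "BRAND"), ("drill", "PRODUCT_TYPE")]
d2 = [("dewalt", "BRAND"), ("saw", "PRODUCT_TYPE")]
d3 = [("cordless", "O"), ("drill", "PRODUCT_TYPE")]
model = triple_learn([d1, d2, d3])
print(predict(model, "dewalt"))  # → BRAND
```

The point of the sketch is only the training structure: a single model state is refined by repeated passes over three distinct datasets, instead of one pass over a single merged set.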
Related papers
- ConDa: Fast Federated Unlearning with Contribution Dampening [46.074452659791575]
ConDa is a framework that performs efficient unlearning by tracking down the parameters which affect the global model for each client.
We perform experiments on multiple datasets and demonstrate that ConDa is effective at forgetting a client's data.
arXiv Detail & Related papers (2024-10-05T12:45:35Z) - Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual Learning aims to overcome the catastrophic forgetting of previously acquired knowledge when learning new knowledge.
This paper presents a comprehensive survey of the latest advancements in PTM-based CL.
arXiv Detail & Related papers (2024-01-29T18:27:52Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - QUADRo: Dataset and Models for QUestion-Answer Database Retrieval [97.84448420852854]
Given a database (DB) of question/answer (q/a) pairs, it is possible to answer a target question by scanning the DB for similar questions.
We build a large scale DB of 6.3M q/a pairs, using public questions, and design a new system based on neural IR and a q/a pair reranker.
We show that our DB-based approach is competitive with Web-based methods, i.e., a QA system built on top of the BING search engine.
arXiv Detail & Related papers (2023-03-30T00:42:07Z) - ZhichunRoad at Amazon KDD Cup 2022: MultiTask Pre-Training for
E-Commerce Product Search [4.220439000486713]
We propose a robust multilingual model to improve the quality of search results.
In the pre-training stage, we adopt masked language modeling (MLM), classification, and contrastive learning tasks.
In the fine-tuning stage, we use confident learning, the exponential moving average method (EMA), adversarial training (FGM), and the regularized dropout strategy (R-Drop).
arXiv Detail & Related papers (2023-01-31T07:31:34Z) - Commonsense Knowledge Salience Evaluation with a Benchmark Dataset in
E-commerce [42.726755541409545]
In e-commerce, the salience of commonsense knowledge (CSK) is beneficial for widespread applications such as product search and recommendation.
However, many existing CSK collections rank statements solely by confidence scores, and there is no information about which ones are salient from a human perspective.
In this work, we define the task of supervised salience evaluation, where given a CSK triple, the model is required to learn whether the triple is salient or not.
arXiv Detail & Related papers (2022-05-22T15:01:23Z) - Winning solutions and post-challenge analyses of the ChaLearn AutoDL
challenge 2019 [112.36155380260655]
This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series.
Results show that DL methods dominated, though popular Neural Architecture Search (NAS) was impractical.
A high-level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator".
arXiv Detail & Related papers (2022-01-11T06:21:18Z) - Ex-Model: Continual Learning from a Stream of Trained Models [12.27992745065497]
We argue that continual learning systems should exploit the availability of compressed information in the form of trained models.
We introduce and formalize a new paradigm named "Ex-Model Continual Learning" (ExML), where an agent learns from a sequence of previously trained models instead of raw data.
arXiv Detail & Related papers (2021-12-13T09:46:16Z) - What Stops Learning-based 3D Registration from Working in the Real
World? [53.68326201131434]
This work identifies the sources of 3D point cloud registration failures, analyzes the reasons behind them, and proposes solutions.
Ultimately, this translates to a best-practice 3D registration network (BPNet), constituting the first learning-based method able to handle previously-unseen objects in real-world data.
Our model generalizes to real data without any fine-tuning, reaching an accuracy of up to 67% on point clouds of unseen objects obtained with a commercial sensor.
arXiv Detail & Related papers (2021-11-19T19:24:27Z) - Imitate TheWorld: A Search Engine Simulation Platform [13.011052642314421]
We build a simulated search engine, AESim, that provides feedback on generated pages via a well-trained discriminator.
Unlike previous simulation platforms, which lose their connection with the real world, ours relies on real search data.
Our experiments also show AESim can better reflect the online performance of ranking models than classic ranking metrics.
arXiv Detail & Related papers (2021-07-16T03:55:33Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.