E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce
- URL: http://arxiv.org/abs/2009.02835v3
- Date: Fri, 17 Dec 2021 18:26:46 GMT
- Title: E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce
- Authors: Denghui Zhang, Zixuan Yuan, Yanchi Liu, Fuzhen Zhuang, Haifeng Chen, Hui Xiong
- Abstract summary: E-commerce tasks require accurate understanding of domain phrases, whereas such fine-grained phrase-level knowledge is not explicitly modeled by BERT's training objective.
To tackle the problem, we propose a unified pre-training framework, namely, E-BERT.
Specifically, to preserve phrase-level knowledge, we introduce Adaptive Hybrid Masking, which allows the model to adaptively switch from learning preliminary word knowledge to learning complex phrases.
To utilize product-level knowledge, we introduce Neighbor Product Reconstruction, which trains E-BERT to predict a product's associated neighbors with a denoising cross attention layer.
- Score: 63.333860695727424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models such as BERT have achieved great success in a
broad range of natural language processing tasks. However, BERT cannot well
support E-commerce related tasks due to the lack of two levels of domain
knowledge, i.e., phrase-level and product-level. On one hand, many E-commerce
tasks require an accurate understanding of domain phrases, whereas such
fine-grained phrase-level knowledge is not explicitly modeled by BERT's
training objective. On the other hand, product-level knowledge such as product
associations can enhance the language modeling of E-commerce, but such
associations are not factual knowledge, so using them indiscriminately may
introduce noise. To
tackle the problem, we propose a unified pre-training framework, namely,
E-BERT. Specifically, to preserve phrase-level knowledge, we introduce Adaptive
Hybrid Masking, which allows the model to adaptively switch from learning
preliminary word knowledge to learning complex phrases, based on the fitting
progress of two modes. To utilize product-level knowledge, we introduce
Neighbor Product Reconstruction, which trains E-BERT to predict a product's
associated neighbors with a denoising cross attention layer. Our investigation
reveals promising results on four downstream tasks, i.e., review-based question
answering, aspect extraction, aspect sentiment classification, and product
classification.
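The abstract describes the two pre-training components only at a high level, so the sketches below are illustrative reconstructions rather than the authors' code. First, a minimal sketch of the Adaptive Hybrid Masking idea: keep a running estimate of how well the model currently fits the word-masking and the phrase-masking objectives, and sample the next batch's masking mode accordingly, so training drifts from words toward phrases as the word-level objective saturates. The class name, the moving-average smoothing, and the exact switching rule are all assumptions.

```python
import random


class AdaptiveHybridMasker:
    """Illustrative mode switcher for Adaptive Hybrid Masking (assumed design).

    Tracks an exponential moving average of the training loss under each
    masking mode and samples the mode whose objective currently fits worse,
    so pre-training adaptively shifts from word masking to phrase masking.
    """

    def __init__(self, smoothing: float = 0.9):
        self.ema = {"word": 1.0, "phrase": 1.0}  # optimistic initial estimates
        self.smoothing = smoothing

    def update(self, mode: str, loss: float) -> None:
        # Record the observed batch loss for the mode that was just trained.
        self.ema[mode] = self.smoothing * self.ema[mode] + (1.0 - self.smoothing) * loss

    def choose_mode(self) -> str:
        # Sample phrase masking with probability proportional to its remaining
        # loss; as the word-level loss drops, phrase masking dominates.
        p_phrase = self.ema["phrase"] / (self.ema["word"] + self.ema["phrase"])
        return "phrase" if random.random() < p_phrase else "word"


masker = AdaptiveHybridMasker()
for step in range(3):  # stand-in for the pre-training loop
    mode = masker.choose_mode()
    batch_loss = 2.0 if mode == "phrase" else 1.0  # placeholder batch loss
    masker.update(mode, batch_loss)
```

Second, a hedged sketch of Neighbor Product Reconstruction: one query slot per associated neighbor attends over the anchor product's token states through a cross attention layer, and the model is trained to reconstruct each neighbor's embedding. The attention bottleneck is what would let the model down-weight noisy, non-factual associations. The dimensions, the MSE objective, and the learned query-slot design are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class NeighborProductReconstruction(nn.Module):
    """Illustrative reconstruction head (assumed design, not the paper's code)."""

    def __init__(self, hidden: int = 768, heads: int = 8):
        super().__init__()
        # Neighbor query slots attend over the anchor product's token states.
        self.cross_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.proj = nn.Linear(hidden, hidden)

    def forward(self, product_states, neighbor_queries, neighbor_targets):
        # product_states:   (B, T, H) BERT token states of the anchor product
        # neighbor_queries: (B, N, H) one learned query slot per neighbor
        # neighbor_targets: (B, N, H) embeddings of the associated products
        recon, attn = self.cross_attn(neighbor_queries, product_states, product_states)
        loss = nn.functional.mse_loss(self.proj(recon), neighbor_targets)
        return loss, attn  # attn shows which tokens support each neighbor


head = NeighborProductReconstruction()
B, T, N, H = 2, 16, 4, 768
loss, attn = head(torch.randn(B, T, H), torch.randn(B, N, H), torch.randn(B, N, H))
```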
Related papers
- Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help pre-trained language models utilize related latent knowledge without retrieving it from an external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z)
- LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows decoupling of language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z)
- DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning [18.838291575019504]
Pre-trained language models (PLMs) are shown to be lacking in knowledge when dealing with knowledge-driven tasks.
We propose DictBERT, a novel approach that enhances PLMs with dictionary knowledge.
We evaluate our approach on a variety of knowledge-driven and language understanding tasks, including NER, relation extraction, CommonsenseQA, OpenBookQA and GLUE.
arXiv Detail & Related papers (2022-08-01T06:43:19Z)
- Knowledge Graph Fusion for Language Model Fine-tuning [0.0]
We investigate the benefits of knowledge incorporation into the fine-tuning stages of BERT.
An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language.
The changes made to adapt K-BERT to English also extend to other word-based languages.
arXiv Detail & Related papers (2022-06-21T08:06:22Z)
- Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base [51.55027623439027]
We develop two solutions to help the model learn more knowledge from unstructured text in a fully self-supervised manner.
To the best of our knowledge, we are the first to explore fully self-supervised learning of knowledge in continual pre-training.
arXiv Detail & Related papers (2022-04-17T12:33:34Z)
- Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition [159.9312272042253]
Wav-BERT is a cooperative acoustic and linguistic representation learning method.
We unify a pre-trained acoustic model (wav2vec 2.0) and a language model (BERT) into an end-to-end trainable framework.
arXiv Detail & Related papers (2021-09-19T16:39:22Z)
- K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce [38.9878151656255]
K-PLUG is a knowledge-injected pre-trained language model based on the encoder-decoder transformer.
We propose five knowledge-aware self-supervised pre-training objectives to formulate the learning of domain-specific knowledge.
arXiv Detail & Related papers (2021-04-14T16:37:31Z)
- Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce [83.72476966339103]
Cross-lingual information retrieval is a new task in cross-border e-commerce.
We propose a novel cross-lingual matching network (CLMN) enhanced with context-dependent cross-lingual mapping.
Experimental results indicate that the proposed CLMN yields impressive results on this challenging task.
arXiv Detail & Related papers (2020-05-17T08:10:51Z)
- lamBERT: Language and Action Learning Using Multimodal BERT [0.1942428068361014]
This study proposes lamBERT, a model for language and action learning using multimodal BERT.
Experiments are conducted in a grid environment that requires language understanding for the agent to act properly.
The lamBERT model obtains higher rewards in multitask and transfer settings than other models.
arXiv Detail & Related papers (2020-04-15T13:54:55Z)
- Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue [51.513276162736844]
We propose a sequential latent variable model as the first approach to this problem.
The model named sequential knowledge transformer (SKT) can keep track of the prior and posterior distribution over knowledge.
arXiv Detail & Related papers (2020-02-18T11:59:59Z)