K-PLUG: Knowledge-injected Pre-trained Language Model for Natural
Language Understanding and Generation in E-Commerce
- URL: http://arxiv.org/abs/2104.06960v1
- Date: Wed, 14 Apr 2021 16:37:31 GMT
- Title: K-PLUG: Knowledge-injected Pre-trained Language Model for Natural
Language Understanding and Generation in E-Commerce
- Authors: Song Xu, Haoran Li, Peng Yuan, Yujia Wang, Youzheng Wu, Xiaodong He,
Ying Liu, Bowen Zhou
- Abstract summary: K-PLUG is a knowledge-injected pre-trained language model based on the encoder-decoder transformer.
We propose five knowledge-aware self-supervised pre-training objectives to formulate the learning of domain-specific knowledge.
- Score: 38.9878151656255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing pre-trained language models (PLMs) have demonstrated the
effectiveness of self-supervised learning for a broad range of natural language
processing (NLP) tasks. However, most of them are not explicitly aware of
domain-specific knowledge, which is essential for downstream tasks in many
domains, such as tasks in e-commerce scenarios. In this paper, we propose
K-PLUG, a knowledge-injected pre-trained language model based on the
encoder-decoder transformer that can be transferred to both natural language
understanding and generation tasks. We verify our method in a diverse range of
e-commerce scenarios that require domain-specific knowledge. Specifically, we
propose five knowledge-aware self-supervised pre-training objectives to
formulate the learning of domain-specific knowledge, including e-commerce
domain-specific knowledge bases, aspects of product entities, categories of
product entities, and unique selling propositions of product entities. K-PLUG
achieves new state-of-the-art results on a suite of domain-specific NLP tasks,
including product knowledge base completion, abstractive product summarization,
and multi-turn dialogue, significantly outperforming baselines across the board,
which demonstrates that the proposed method effectively learns a diverse set of
domain-specific knowledge for both language understanding and generation tasks.
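For intuition, a minimal sketch of what one such knowledge-aware objective could look like is given below: spans that carry domain knowledge (for example, product aspects or unique selling propositions) are masked on the encoder side and must be regenerated by the decoder. The toy tokenizer, span format, and example data are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not K-PLUG's actual code): mask knowledge-bearing spans
# in the encoder input and ask the decoder to reconstruct them, in the style of
# text-infilling pre-training for an encoder-decoder transformer.
import random

MASK = "<mask>"

def knowledge_aware_masking(tokens, knowledge_spans, mask_prob=0.5, seed=0):
    """Return (encoder_input, decoder_target) for one training example.

    tokens          : list[str], tokenized product description
    knowledge_spans : list of (start, end) token indices of knowledge phrases,
                      e.g. product aspects or unique selling propositions
    """
    rng = random.Random(seed)
    to_mask = sorted(s for s in knowledge_spans if rng.random() < mask_prob)

    encoder_input, decoder_target = [], []
    cursor = 0
    for start, end in to_mask:
        encoder_input.extend(tokens[cursor:start])   # keep surrounding context
        encoder_input.append(MASK)                   # one mask token per span
        decoder_target.extend(tokens[start:end])     # span to be regenerated
        decoder_target.append(MASK)                  # span separator
        cursor = end
    encoder_input.extend(tokens[cursor:])
    return encoder_input, decoder_target

if __name__ == "__main__":
    desc = "lightweight wireless earbuds with 30 hour battery life".split()
    # hypothetical knowledge spans: an aspect ("wireless earbuds") and a
    # selling proposition ("30 hour battery life")
    spans = [(1, 3), (4, 8)]
    enc, dec = knowledge_aware_masking(desc, spans, mask_prob=1.0)
    print("encoder input :", " ".join(enc))
    print("decoder target:", " ".join(dec))
```

In a full pre-training pipeline, pairs like these would be fed to the encoder-decoder transformer and optimized with a standard cross-entropy loss over the decoder target.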
Related papers
- Knowledge Tagging with Large Language Model based Multi-Agent System [17.53518487546791]
This paper investigates the use of a multi-agent system to address the limitations of previous algorithms.
We highlight the significant potential of an LLM-based multi-agent system in overcoming the challenges that previous methods have encountered.
arXiv Detail & Related papers (2024-09-12T21:39:01Z)
- EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data [67.8302955948861]
Large Language Models (LLMs) pre-trained on massive corpora have exhibited remarkable performance on various NLP tasks.
Applying these models to specific domains still poses significant challenges, such as a lack of domain knowledge.
We focus on domain-specific continual pre-training of LLMs, using the e-commerce domain as an exemplar.
arXiv Detail & Related papers (2023-12-25T11:31:47Z)
- Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose DOKE, a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications.
This paradigm relies on a domain knowledge extractor that works in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
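Since the paradigm is a pipeline around the LLM rather than a change to the model itself, a small sketch can make the three steps concrete. The knowledge snippets, the word-overlap selector, and the prompt template below are illustrative assumptions, not the DOKE implementation.

```python
# Illustrative sketch of a DOKE-style flow: 1) prepared domain knowledge,
# 2) per-sample knowledge selection, 3) knowledge expressed as prompt text.

# Step 1: prepared knowledge (in practice, extracted from a domain KB or logs).
KNOWLEDGE = [
    "Users who buy trail running shoes often also need moisture-wicking socks.",
    "Waterproof jackets are frequently bought before the rainy season.",
    "Wireless earbuds with long battery life sell well among commuters.",
]

def select_knowledge(sample, knowledge, top_k=2):
    """Step 2: pick the snippets that best match the sample
    (naive word-overlap scoring as a stand-in for a learned selector)."""
    sample_words = set(sample.lower().split())
    return sorted(
        knowledge,
        key=lambda k: len(sample_words & set(k.lower().split())),
        reverse=True,
    )[:top_k]

def build_prompt(sample, selected):
    """Step 3: express the selected knowledge in an LLM-readable prompt."""
    facts = "\n".join(f"- {fact}" for fact in selected)
    return (
        "Domain knowledge:\n"
        f"{facts}\n\n"
        f"User query: {sample}\n"
        "Recommend a product and explain why."
    )

if __name__ == "__main__":
    query = "I need shoes for trail running in wet weather"
    prompt = build_prompt(query, select_knowledge(query, KNOWLEDGE))
    print(prompt)  # this augmented prompt would then be sent to the chosen LLM
```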
arXiv Detail & Related papers (2023-11-16T07:09:38Z)
- UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z)
- Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue Systems [9.983102639594899]
Pre-trained language models (PLMs) have advanced the state-of-the-art across NLP applications.
However, they lack domain-specific knowledge that does not naturally occur in pre-training data.
Previous studies augmented PLMs with symbolic knowledge for different downstream NLP tasks.
arXiv Detail & Related papers (2022-12-15T20:15:05Z)
- Knowledge Based Multilingual Language Model [44.70205282863062]
We present a novel framework to pretrain knowledge-based multilingual language models (KMLMs).
We generate a large number of code-switched synthetic sentences and reasoning-based multilingual training data using the Wikidata knowledge graphs.
Based on the intra- and inter-sentence structures of the generated data, we design pretraining tasks to facilitate knowledge learning.
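As a rough illustration of the code-switching idea only (not the paper's generation pipeline), the sketch below renders a knowledge-graph triple as a sentence whose entity mentions are drawn from randomly chosen languages; the toy triples and the sentence template stand in for Wikidata labels and the paper's actual data construction.

```python
# Illustrative sketch: turn (subject, relation, object) triples with
# multilingual entity labels into code-switched synthetic sentences.
import random

TRIPLES = [
    ({"en": "Rome", "de": "Rom"}, "capital of", {"en": "Italy", "de": "Italien"}),
    ({"en": "Vienna", "de": "Wien"}, "capital of", {"en": "Austria", "de": "Österreich"}),
]

def code_switched_sentence(triple, languages=("en", "de"), seed=None):
    """Pick a (possibly different) language for each entity mention."""
    rng = random.Random(seed)
    subj, rel, obj = triple
    subj_lang, obj_lang = rng.choice(languages), rng.choice(languages)
    return f"{subj[subj_lang]} is the {rel} {obj[obj_lang]}."

if __name__ == "__main__":
    for i, triple in enumerate(TRIPLES):
        print(code_switched_sentence(triple, seed=i))
```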
arXiv Detail & Related papers (2021-11-22T02:56:04Z)
- K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering [8.772466918885224]
We propose K-AID, a systematic approach that includes a low-cost knowledge acquisition process for acquiring domain knowledge.
Instead of capturing entity knowledge like the majority of existing K-PLMs, our approach captures relational knowledge.
We conducted experiments on five text classification tasks and three text matching tasks from three domains, namely E-commerce, Government, and Film&TV, and performed online A/B tests in E-commerce.
arXiv Detail & Related papers (2021-09-22T07:19:08Z)
- CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
We propose the Contextualized Language and Knowledge Embedding (CoLAKE).
CoLAKE jointly learns contextualized representations for both language and knowledge with an extended objective.
We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
arXiv Detail & Related papers (2020-10-01T11:39:32Z)
- E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce [63.333860695727424]
E-commerce tasks require accurate understanding of domain phrases, whereas such fine-grained phrase-level knowledge is not explicitly modeled by BERT's training objective.
To tackle the problem, we propose a unified pre-training framework, namely, E-BERT.
Specifically, to preserve phrase-level knowledge, we introduce Adaptive Hybrid Masking, which allows the model to adaptively switch from learning preliminary word knowledge to learning complex phrases.
To utilize product-level knowledge, we introduce Neighbor Product Reconstruction, which trains E-BERT to predict a product's associated neighbors with a denoising cross-attention layer.
arXiv Detail & Related papers (2020-09-07T00:15:36Z)
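To make the Adaptive Hybrid Masking idea in the E-BERT summary above more concrete, here is a minimal sketch in which word-level masking is used while a training signal (a recent-loss proxy) is still high and phrase-level masking takes over once it drops. The phrase lexicon, the threshold, and the switching rule are illustrative assumptions rather than E-BERT's actual mechanism.

```python
# Illustrative sketch: switch from masking single words to masking whole
# domain phrases once the model handles word-level prediction reasonably
# well (proxied here by a falling recent loss).
import random

MASK = "[MASK]"
PHRASES = {("stainless", "steel"), ("memory", "foam")}  # toy phrase lexicon

def hybrid_mask(tokens, recent_loss, switch_threshold=2.0, seed=0):
    """Word-level masking while recent_loss is high; phrase-level afterwards."""
    rng = random.Random(seed)
    tokens = list(tokens)
    if recent_loss > switch_threshold:
        # early phase: hide one random word
        tokens[rng.randrange(len(tokens))] = MASK
    else:
        # later phase: hide every known two-word phrase
        i = 0
        while i < len(tokens) - 1:
            if (tokens[i], tokens[i + 1]) in PHRASES:
                tokens[i:i + 2] = [MASK, MASK]
            i += 1
    return tokens

if __name__ == "__main__":
    text = "memory foam pillow with stainless steel frame".split()
    print(hybrid_mask(text, recent_loss=3.1))  # word-level masking
    print(hybrid_mask(text, recent_loss=1.2))  # phrase-level masking
```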
This list is automatically generated from the titles and abstracts of the papers on this site.