MLRIP: Pre-training a military language representation model with
informative factual knowledge and professional knowledge base
- URL: http://arxiv.org/abs/2207.13929v1
- Date: Thu, 28 Jul 2022 07:39:30 GMT
- Title: MLRIP: Pre-training a military language representation model with
informative factual knowledge and professional knowledge base
- Authors: Hui Li, Xuekang Yang, Xin Zhao, Lin Yu, Jiping Zheng and Wei Sun
- Abstract summary: Current pre-training procedures usually inject external knowledge into models by using knowledge masking, knowledge fusion and knowledge replacement.
We propose MLRIP, which modifies the knowledge masking strategies proposed by ERNIE-Baidu and introduces a two-stage entity replacement strategy.
Extensive experiments with comprehensive analyses illustrate the superiority of MLRIP over BERT-based models in military knowledge-driven NLP tasks.
- Score: 11.016827497014821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incorporating prior knowledge into pre-trained language models has proven to
be effective for knowledge-driven NLP tasks, such as entity typing and relation
extraction. Current pre-training procedures usually inject external knowledge
into models by using knowledge masking, knowledge fusion and knowledge
replacement. However, factual information contained in the input sentences has
not been fully mined, and the external knowledge to be injected has not been
strictly checked. As a result, the context information cannot be fully
exploited, and either extra noise is introduced or the amount of knowledge
injected is limited. To address these issues, we propose MLRIP, which modifies
the knowledge masking strategies proposed by ERNIE-Baidu and introduces a
two-stage entity replacement strategy. Extensive experiments with comprehensive
analyses illustrate the superiority of MLRIP over BERT-based models in military
knowledge-driven NLP tasks.
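As a rough illustration of the knowledge masking idea the abstract refers to, the sketch below masks whole entity spans instead of individual tokens, which is the general idea behind entity-level masking in ERNIE-Baidu. It is not MLRIP's modified strategy, which the abstract does not specify. The function name mask_entity_spans, the mask_prob parameter, and the example sentence are all hypothetical, and the entity spans are assumed to be supplied by some upstream tagger.

```python
import random

MASK = "[MASK]"

def mask_entity_spans(tokens, entity_spans, mask_prob=0.15, rng=None):
    """Mask whole entity spans (hypothetical helper, not the MLRIP procedure).

    tokens       -- list of token strings
    entity_spans -- list of (start, end) index pairs, end exclusive
    mask_prob    -- probability of masking each span (assumed value)
    Returns the masked token list and per-position labels
    (the original token where masked, None elsewhere).
    """
    rng = rng or random.Random()
    masked = list(tokens)
    labels = [None] * len(tokens)
    for start, end in entity_spans:
        # Decide once per span, so an entity is masked as a unit.
        if rng.random() < mask_prob:
            for i in range(start, end):
                labels[i] = tokens[i]
                masked[i] = MASK
    return masked, labels

# Illustrative sentence with one two-token entity span.
tokens = ["the", "aircraft", "carrier", "left", "port", "on", "monday"]
spans = [(1, 3)]
masked, labels = mask_entity_spans(tokens, spans, mask_prob=1.0)
print(masked)
```

Deciding once per span, rather than once per token, means the model cannot recover a masked entity from its own remaining pieces and has to rely on the surrounding context.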
Related papers
- Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval [60.25608870901428]
Trustworthiness is a core research challenge for agentic AI systems built on Large Language Models (LLMs). We propose the task of fact-checking without retrieval, focusing on the verification of arbitrary natural language claims, independent of their source robustness.
arXiv Detail & Related papers (2026-03-05T18:42:51Z) - Ontology-to-tools compilation for executable semantic constraint enforcement in LLM agents [0.0]
We present a proof-of-principle mechanism for coupling large language models (LLMs) with formal domain knowledge semantics. Ontological specifications are compiled into executable tools that LLM-based agents must use to create and modify knowledge graph instances. We show how executable ontological semantics guide LLM interfaces and reduce manual schema and prompt engineering, establishing a general paradigm for embedding formal knowledge into generative systems.
arXiv Detail & Related papers (2026-02-03T12:03:26Z) - Generative Human-Object Interaction Detection via Differentiable Cognitive Steering of Multi-modal LLMs [85.69785384599827]
Human-object interaction (HOI) detection aims to localize human-object pairs and the interactions between them. Existing methods operate under a closed-world assumption, treating the task as a classification problem over a small, predefined verb set. We propose GRASP-HO, a novel Generative Reasoning And Steerable Perception framework that reformulates HOI detection from a closed-set classification task to an open-vocabulary generation problem.
arXiv Detail & Related papers (2025-12-19T14:41:50Z) - Integrating Domain Knowledge into Process Discovery Using Large Language Models [3.7448613209842967]
We propose an interactive framework that incorporates domain knowledge, expressed in natural language, into the process discovery pipeline. The framework coordinates interactions among the Large Language Models (LLMs), domain experts, and a set of backend services. Our empirical study includes a case study based on a real-life event log with the involvement of domain experts, who assessed the usability and effectiveness of the framework.
arXiv Detail & Related papers (2025-10-08T15:59:11Z) - From Semantics, Scene to Instance-awareness: Distilling Foundation Model for Open-vocabulary Situation Recognition [14.16399307533106]
Multimodal Large Language Models (MLLMs) exhibit strong zero-shot abilities but struggle with complex Grounded Situation Recognition (GSR). We exploit knowledge transfer from a teacher MLLM to a small GSR model to enhance its generalization and zero-shot abilities. We propose Multimodal Interactive Prompt Distillation (MIPD), a novel framework that distills enriched multimodal knowledge from the foundation model.
arXiv Detail & Related papers (2025-07-19T16:29:02Z) - Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work [0.456877715768796]
Knowledge Protocol Engineering (KPE) is a new paradigm focused on systematically translating human expert knowledge into a machine-executable Knowledge Protocol. We argue that a well-engineered Knowledge Protocol allows a generalist LLM to function as a specialist, capable of decomposing abstract queries and executing complex, multi-step tasks.
arXiv Detail & Related papers (2025-07-03T16:21:14Z) - Keep the General, Inject the Specific: Structured Dialogue Fine-Tuning for Knowledge Injection without Catastrophic Forgetting [24.67373225584835]
Large Vision Language Models have demonstrated impressive versatile capabilities through extensive multimodal pre-training. These models struggle with a fundamental dilemma: direct adaptation approaches that inject domain-specific knowledge often trigger catastrophic forgetting of foundational visual-linguistic abilities. We introduce Structured Dialogue Fine-Tuning (SDFT), an approach that effectively injects domain-specific knowledge while minimizing catastrophic forgetting.
arXiv Detail & Related papers (2025-04-27T18:04:02Z) - MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model [54.14155564592936]
We propose a Mixture of Rule Experts guided by a Large Language Model (MoRE-LLM). MoRE-LLM steers the discovery of local rule-based surrogates during training and their utilization for the classification task. The LLM is responsible for enhancing the domain knowledge alignment of the rules by correcting and contextualizing them.
arXiv Detail & Related papers (2025-03-26T11:09:21Z) - Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which combines multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z) - InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration [58.61492157691623]
Methods for integrating knowledge have been developed, which augment LLMs with domain-specific knowledge graphs through external modules.
Our research focuses on a novel problem: efficiently integrating unknown knowledge into LLMs without unnecessary overlap of known knowledge.
A risk of introducing new knowledge is the potential forgetting of existing knowledge.
arXiv Detail & Related papers (2024-02-18T03:36:26Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - Beyond Factuality: A Comprehensive Evaluation of Large Language Models
as Knowledge Generators [78.63553017938911]
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks.
However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge.
We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
arXiv Detail & Related papers (2023-10-11T08:22:37Z) - Improving Open Information Extraction with Large Language Models: A
Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z) - Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z) - UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language
Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z) - A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG) and rule knowledge.
The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - Knowledge Prompting in Pre-trained Language Model for Natural Language
Understanding [24.315130086787374]
We propose a knowledge-prompting-based PLM framework KP-PLM.
This framework can be flexibly combined with existing mainstream PLMs.
To further leverage the factual knowledge from these prompts, we propose two novel knowledge-aware self-supervised tasks.
arXiv Detail & Related papers (2022-10-16T13:36:57Z) - LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows decoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z) - DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for
Natural Language Understanding [19.478288026844893]
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models with relation triples injected from knowledge graphs to improve language understanding abilities.
Previous studies integrate models with knowledge encoders for representing knowledge retrieved from knowledge graphs.
We propose a novel KEPLM named DKPLM that Decomposes the Knowledge injection process of Pre-trained Language Models into pre-training, fine-tuning and inference stages.
arXiv Detail & Related papers (2021-12-02T08:19:42Z) - ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in the pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.