Why Do Neural Language Models Still Need Commonsense Knowledge to Handle Semantic Variations in Question Answering?
- URL: http://arxiv.org/abs/2209.00599v1
- Date: Thu, 1 Sep 2022 17:15:02 GMT
- Authors: Sunjae Kwon, Cheongwoong Kang, Jiyeon Han, Jaesik Choi
- Abstract summary: Masked neural language models (MNLMs) are made up of huge neural network structures and are trained to restore masked text.
This paper provides new insights and empirical analyses on commonsense knowledge included in pretrained MNLMs.
- Score: 22.536777694218593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many contextualized word representations are now learned by intricate neural network models, such as masked neural language models (MNLMs), which are made up of huge neural network structures and trained to restore masked text. Such representations demonstrate superhuman performance on some reading comprehension (RC) tasks, which extract a proper answer from the context given a question. However, identifying the detailed knowledge trained in MNLMs is
challenging owing to numerous and intermingled model parameters. This paper
provides new insights and empirical analyses on commonsense knowledge included
in pretrained MNLMs. First, we use a diagnostic test that evaluates whether
commonsense knowledge is properly trained in MNLMs. We observe that a large proportion of commonsense knowledge is not appropriately trained in MNLMs and that MNLMs often fail to understand the semantic meaning of relations accurately. In
addition, we find that the MNLM-based RC models are still vulnerable to
semantic variations that require commonsense knowledge. Finally, we discover
the fundamental reason why some knowledge is not trained. We further suggest that utilizing an external commonsense knowledge repository can be an effective solution. In controlled experiments, we demonstrate that the limitations of MNLM-based RC models can be overcome by enriching text with the required knowledge from an external commonsense knowledge repository.
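To make the two ideas in the abstract concrete, here is a minimal Python sketch, not the paper's evaluation code: it (1) probes a pretrained MNLM with cloze-style queries built from commonsense triples, and (2) illustrates the enrichment remedy by prepending a required fact to an RC context. The HuggingFace `transformers` pipelines are real, but the model choices, probe sentences, and the hard-coded fact are illustrative assumptions.

```python
# Minimal sketch of the two ideas in the abstract (illustrative, not the
# paper's code): (1) probe an MNLM with cloze-style commonsense queries;
# (2) enrich an RC context with a required fact before answering.
from transformers import pipeline

# (1) Diagnostic probing: turn triples such as (bird, CapableOf, fly) into
# cloze sentences and inspect the masked-token predictions.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for probe, expected in [("A bird can [MASK].", "fly"),
                        ("A knife is used for [MASK].", "cutting")]:
    top = [p["token_str"] for p in fill_mask(probe, top_k=5)]
    # If the expected filler is absent or ranked low, the triple is arguably
    # not well captured by the model's parameters.
    print(f"{probe:<30} expected={expected!r:<10} top5={top}")

# (2) Knowledge enrichment: prepend a fact the RC model would otherwise need
# commonsense for. The fact is hard-coded here; an external repository such
# as ConceptNet would supply it in practice.
qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")
context = "After dinner, Tom washed the dishes and put the leftovers away."
question = "Where did Tom most likely put the leftovers?"
fact = "Leftover food is typically stored in a refrigerator."
print(qa(question=question, context=fact + " " + context))
```

If the model answers correctly only once the fact is prepended, that mirrors the paper's point: the required commonsense is missing from the model's parameters but can be supplied from an external repository.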
Related papers
- Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs [55.317267269115845]
Chain-of-Knowledge (CoK) is a comprehensive framework for knowledge reasoning.
CoK includes methodologies for both dataset construction and model learning.
We conduct extensive experiments with KnowReason.
arXiv Detail & Related papers (2024-06-30T10:49:32Z)
- IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons [35.932259793728]
Large language models (LLMs) encode a vast reservoir of knowledge after being trained on massive corpora.
Recent studies disclose knowledge conflicts in LLM generation, wherein outdated or incorrect parametric knowledge contradicts new knowledge provided in the context.
We propose a novel framework, IRCAN, to capitalize on neurons that are crucial in processing contextual cues.
arXiv Detail & Related papers (2024-06-26T14:57:38Z)
- Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), the task of combining multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z)
- What's in an embedding? Would a rose by any embedding smell as sweet? [0.0]
Large Language Models (LLMs) are often criticized for lacking true "understanding" and the ability to "reason" with their knowledge.
We suggest that LLMs do develop a kind of empirical "understanding" that is "geometry"-like, which seems adequate for a range of applications in NLP.
To overcome these limitations, we suggest that LLMs should be integrated with an "algebraic" representation of knowledge that includes symbolic AI elements.
arXiv Detail & Related papers (2024-06-11T01:10:40Z)
- Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs [58.09253149867228]
This paper assesses the domain knowledge of LLMs through their understanding of the different mathematical skills required to solve problems.
Motivated by the use of LLMs as general scientific assistants, we propose NTKEval to assess changes in an LLM's probability distribution.
Our systematic analysis finds evidence of domain understanding during in-context learning.
Certain instruction-tuning leads to similar performance changes irrespective of training on different data, suggesting a lack of domain understanding across different skills.
arXiv Detail & Related papers (2024-05-24T12:04:54Z)
- LLMs' Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements [59.71218039095155]
The task of reading comprehension (RC) provides a primary means to assess language models' natural language understanding (NLU) capabilities.
If the context aligns with the models' internal knowledge, it is hard to discern whether the models' answers stem from context comprehension or from internal information.
To address this issue, we suggest using RC on imaginary data based on fictitious facts and entities.
arXiv Detail & Related papers (2024-04-09T13:08:56Z)
- Knowledge Solver: Teaching LLMs to Search for Domain Knowledge from Knowledge Graphs [19.0797968186656]
Large language models (LLMs) are versatile and can solve different tasks due to their emergent abilities and generalizability.
In some previous works, additional modules like graph neural networks (GNNs) are trained on knowledge retrieved from external knowledge bases.
arXiv Detail & Related papers (2023-09-06T15:55:01Z)
- Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from an external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z)
- Empowering Language Models with Knowledge Graph Reasoning for Question Answering [117.79170629640525]
We propose the knOwledge REasOning empowered Language Model (OREO-LM).
OREO-LM consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs.
We show significant performance gains, achieving state-of-the-art results in the closed-book setting.
arXiv Detail & Related papers (2022-11-15T18:26:26Z)
- Knowledge Authoring with Factual English [0.0]
Knowledge representation and reasoning (KRR) systems represent knowledge as collections of facts and rules.
One solution could be to extract knowledge from English text, and a number of works have attempted to do so.
Unfortunately, extraction of logical facts from unrestricted natural language is still too inaccurate to be used for reasoning.
Recent controlled natural language (CNL) based approaches, such as the Knowledge Authoring Logic Machine (KALM), have been shown to achieve very high accuracy compared to others.
arXiv Detail & Related papers (2022-08-05T10:49:41Z)
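Several of the papers above, like the remedy proposed in the main paper, retrieve facts from an external knowledge graph. As an illustration of that retrieval step alone, the following sketch queries the public ConceptNet 5 API; the endpoint and response keys are those of api.conceptnet.io, while the concept and relation are arbitrary examples rather than any specific paper's pipeline.

```python
# Minimal sketch: retrieving commonsense triples from ConceptNet's public API
# (https://api.conceptnet.io), one external repository such methods can draw
# on. Endpoint shape and keys follow the ConceptNet 5 API; the concept and
# relation below are illustrative examples.
import requests

def capable_of(concept: str, limit: int = 5) -> list[str]:
    """Return surface-text statements for (concept, CapableOf, ?) edges."""
    resp = requests.get(
        "https://api.conceptnet.io/query",
        params={"start": f"/c/en/{concept}", "rel": "/r/CapableOf",
                "limit": limit},
        timeout=10,
    )
    resp.raise_for_status()
    edges = resp.json().get("edges", [])
    # `surfaceText` is a human-readable rendering like "[[a bird]] can [[fly]]".
    return [e["surfaceText"] for e in edges if e.get("surfaceText")]

if __name__ == "__main__":
    for statement in capable_of("bird"):
        print(statement)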