Related papers: Plausible-Parrots @ MSP2023: Enhancing Semantic Plausibility Modeling using Entity and Event Knowledge

Plausible-Parrots @ MSP2023: Enhancing Semantic Plausibility Modeling using Entity and Event Knowledge

URL: http://arxiv.org/abs/2408.16937v1
Date: Thu, 29 Aug 2024 23:13:45 GMT
Title: Plausible-Parrots @ MSP2023: Enhancing Semantic Plausibility Modeling using Entity and Event Knowledge
Authors: Chong Shen, Chenyue Zhou,
Abstract summary: We enhance the large language model (LLM) with fine-grained entity types, event types and their definitions extracted from an external knowledge base. The experimental results show the effectiveness of the injected knowledge on modeling semantic plausibility of events.
Score: 1.6233244703352492
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this work, we investigate the effectiveness of injecting external knowledge to a large language model (LLM) to identify semantic plausibility of simple events. Specifically, we enhance the LLM with fine-grained entity types, event types and their definitions extracted from an external knowledge base. These knowledge are injected into our system via designed templates. We also augment the data to balance the label distribution and adapt the task setting to real world scenarios in which event mentions are expressed as natural language sentences. The experimental results show the effectiveness of the injected knowledge on modeling semantic plausibility of events. An error analysis further emphasizes the importance of identifying non-trivial entity and event types.

Related papers

The Imperfective Paradox in Large Language Models [19.058068907991277]
We investigate the Imperfective Paradox, where the past progressive aspect entails event realization for activities but not for accomplishments.<n>We introduce ImperfectiveNLI, a diagnostic dataset designed to probe this distinction across diverse semantic classes.<n>We uncover a pervasive Teleological Bias: models systematically hallucinate completion for goal-oriented events, often overriding explicit textual negation.
arXiv Detail & Related papers (2026-01-14T10:57:16Z)
Plausibility Vaccine: Injecting LLM Knowledge for Event Plausibility [0.0]
We train 12 task adapters to learn various physical properties and association measures. We perform adapter fusion to compose latent semantic knowledge from each task on top of pre-trained AlBERT embeddings.
arXiv Detail & Related papers (2025-03-16T21:55:17Z)
Enhancing Argument Structure Extraction with Efficient Leverage of Contextual Information [79.06082391992545]
We propose an Efficient Context-aware model (ECASE) that fully exploits contextual information. We introduce a sequence-attention module and distance-weighted similarity loss to aggregate contextual information and argumentative information. Our experiments on five datasets from various domains demonstrate that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-08T08:47:10Z)
Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial. We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments. The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z)
Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge [72.63368052592004]
We study LMs' abilities to make inferences based on injected facts (or propagate those facts) We find that existing methods for updating knowledge show little propagation of injected knowledge. Yet, prepending entity definitions in an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
EvEntS ReaLM: Event Reasoning of Entity States via Language Models [24.077262847151232]
Nominally, Large Language models (LLM) have been exposed to procedural knowledge about how objects interact, yet our benchmarking shows they fail to reason about the world. In particular, our results indicate that our prompting technique is especially useful for unseen attributes (out-of-domain) or when only limited data is available.
arXiv Detail & Related papers (2022-11-10T07:48:01Z)
Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP) What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining. How the model's world knowledge interacts with the factual information presented in the context remains under explored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
Semantic Pivoting Model for Effective Event Detection [19.205550116466604]
Event Detection aims to identify and classify mentions of event instances from unstructured articles. Existing techniques for event detection only use homogeneous one-hot vectors to represent the event type classes, ignoring the fact that the semantic meaning of the types is important to the task. We propose a Semantic Pivoting Model for Effective Event Detection (SPEED), which explicitly incorporates prior information during training and captures semantically meaningful correlations between input and events.
arXiv Detail & Related papers (2022-11-01T19:20:34Z)
Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions. Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformers Language Models (TLMs) were tested on a benchmark for the textitdynamic estimation of thematic fit Our results show that TLMs can reach performances that are comparable to those achieved by SDM. However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z)
A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English [17.993417004424078]
Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on. We evaluate three models (BERT, RoBERTa, and ALBERT) testing their grammatical and semantic knowledge by sentence-level probing, diagnostic cases, and masked prediction tasks.
arXiv Detail & Related papers (2020-11-02T13:25:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.