Adversarial Transformer Language Models for Contextual Commonsense
Inference
- URL: http://arxiv.org/abs/2302.05406v1
- Date: Fri, 10 Feb 2023 18:21:13 GMT
- Title: Adversarial Transformer Language Models for Contextual Commonsense
Inference
- Authors: Pedro Colon-Hernandez, Henry Lieberman, Yida Xin, Claire Yin, Cynthia
Breazeal, Peter Chin
- Abstract summary: Contextualized, or discourse-aware, commonsense inference is the task of generating coherent commonsense assertions.
Some problems with the task are a lack of controllability over the topics of the inferred facts and a lack of commonsense knowledge during training.
We develop techniques to address these problems.
- Score: 14.12019824666882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contextualized or discourse aware commonsense inference is the task of
generating coherent commonsense assertions (i.e., facts) from a given story,
and a particular sentence from that story. Some problems with the task are:
lack of controllability for topics of the inferred facts; lack of commonsense
knowledge during training; and, possibly, hallucinated or false facts. In this
work, we utilize a transformer model for this task and develop techniques to
address the aforementioned problems in the task. We control the inference by
introducing a new technique we call "hinting". Hinting is a kind of language
model prompting that utilizes both hard prompts (specific words) and soft
prompts (virtual learnable templates). This serves as a control signal to
advise the language model "what to talk about". Next, we establish a
methodology for performing joint inference with multiple commonsense knowledge
bases. Joint inference of commonsense requires care because commonsense is
imprecise and flexible in its level of generality; one must ensure that the
results "still make sense" for the context. To this end, we align the textual
version of assertions from three knowledge graphs (ConceptNet, ATOMIC2020, and
GLUCOSE) with a story and a target sentence. This combination allows us to
train a single model to perform joint inference with multiple knowledge graphs.
We show experimental results for the three knowledge graphs on joint inference.
Our final contribution is a GAN architecture that generates contextualized
commonsense assertions and scores their plausibility through a discriminator.
The result is an integrated system for contextual commonsense inference in
stories that can controllably generate plausible
commonsense assertions, and takes advantage of joint inference between multiple
commonsense knowledge bases.
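To make the "hinting" control signal concrete, here is a minimal sketch of prepending a prefix of soft-prompt vectors and hard hint tokens to the context, assuming a GPT-2-style model from the Hugging Face transformers library; the soft-prompt length, hint text, and random initialization are illustrative, not the paper's exact setup.
```python
# Minimal "hinting" sketch: [soft prompt; hard hint; context] as one prefix.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
embed = model.transformer.wte  # the model's token-embedding table

# Soft prompt: "virtual learnable templates" (trained in practice;
# randomly initialized here for illustration).
n_soft = 8
soft_prompt = torch.nn.Parameter(torch.randn(n_soft, model.config.n_embd) * 0.02)

def hinted_inputs(context: str, hint: str) -> torch.Tensor:
    """Concatenate [soft prompt; hard hint tokens; context tokens] embeddings."""
    hint_ids = tokenizer(hint, return_tensors="pt").input_ids
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    return torch.cat(
        [soft_prompt.unsqueeze(0), embed(hint_ids), embed(ctx_ids)], dim=1)

# The combined prefix advises the model "what to talk about".
inputs_embeds = hinted_inputs(
    context="Kai spilled coffee on his laptop.", hint="laptop")
logits = model(inputs_embeds=inputs_embeds).logits
print(tokenizer.decode([int(logits[0, -1].argmax())]))
```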
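For the joint-inference side, a sketch of how aligned training examples could be serialized so that one model trains across all three knowledge graphs; the field names and separator scheme below are assumptions, not the paper's exact format.
```python
# Hypothetical serialization of one aligned (story, sentence, assertion) triple.
def make_example(story: str, sentence: str, assertion: str, source_kg: str) -> dict:
    """Pair a verbalized KG assertion with its story and target sentence."""
    return {
        "input": f"story: {story} sentence: {sentence} kg: {source_kg}",
        "target": assertion,  # textual version of the knowledge-graph assertion
    }

example = make_example(
    story="Kai spilled coffee on his laptop. He rushed to dry it off.",
    sentence="He rushed to dry it off.",
    assertion="laptop is capable of being damaged by liquid",
    source_kg="ConceptNet",
)
print(example["input"], "->", example["target"])
```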
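And a schematic of the discriminator half of the GAN: a binary classifier that scores (context, assertion) pairs for plausibility. The two-layer head and embedding size are generic stand-ins, not the paper's architecture.
```python
import torch
import torch.nn as nn

class PlausibilityDiscriminator(nn.Module):
    """Scores a (context, assertion) embedding pair as plausible or not."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, ctx_emb: torch.Tensor, fact_emb: torch.Tensor) -> torch.Tensor:
        # Probability that the assertion is plausible in this context.
        return torch.sigmoid(self.head(torch.cat([ctx_emb, fact_emb], dim=-1)))

disc = PlausibilityDiscriminator()
ctx, facts = torch.randn(4, 768), torch.randn(4, 768)  # stand-in encoder outputs
scores = disc(ctx, facts)  # keep only assertions above a plausibility threshold
print(scores.squeeze(-1))
```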
Related papers
- Can Language Models Take A Hint? Prompting for Controllable Contextualized Commonsense Inference [12.941933077524919]
We introduce "hinting," a data augmentation technique that enhances contextualized commonsense inference.
"Hinting" employs a prefix prompting strategy using both hard and soft prompts to guide the inference process.
Our results show that "hinting" does not compromise the performance of contextual commonsense inference while offering improved controllability.
arXiv Detail & Related papers (2024-10-03T04:32:46Z)
- Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs [61.796960984541464]
We present COM2 (COMplex COMmonsense), a new dataset created by sampling logical queries over a commonsense knowledge graph.
We verbalize them, using handcrafted rules and large language models, into multiple-choice and text-generation questions.
Experiments show that language models trained on COM2 exhibit significant improvements in complex reasoning ability.
arXiv Detail & Related papers (2024-03-12T08:13:52Z)
- DiffuCOMET: Contextual Commonsense Knowledge Diffusion [29.23102821128395]
In this work, we develop a series of knowledge models, DiffuCOMET, that leverage diffusion to learn to reconstruct the implicit semantic connections between narrative contexts and relevant commonsense knowledge.
To evaluate DiffuCOMET, we introduce new metrics for commonsense inference that more closely measure knowledge diversity and contextual relevance.
Our results on two different benchmarks, ComFact and WebNLG+, show that knowledge generated by DiffuCOMET achieves a better trade-off between commonsense diversity, contextual relevance and alignment to known gold references.
arXiv Detail & Related papers (2024-02-26T20:35:34Z)
- MICO: A Multi-alternative Contrastive Learning Framework for Commonsense Knowledge Representation [52.238466443561705]
MICO is a multi-alternative contrastive learning framework on COmmonsense knowledge graphs.
It generates the commonsense knowledge representation by contextual interaction between entity nodes.
It can benefit downstream tasks by simply comparing the distance score between the representations (a generic distance-scoring sketch follows this list).
arXiv Detail & Related papers (2022-10-14T06:51:21Z)
- CIKQA: Learning Commonsense Inference with a Unified Knowledge-in-the-loop QA Paradigm [120.98789964518562]
We argue that, given the large scale of commonsense knowledge, it is infeasible to annotate a training set large enough for each task to cover all the commonsense needed for learning.
We focus on investigating models' commonsense inference capabilities from two perspectives.
We name the benchmark Commonsense Inference with Knowledge-in-the-loop Question Answering (CIKQA).
arXiv Detail & Related papers (2022-10-12T14:32:39Z)
- CIS2: A Simplified Commonsense Inference Evaluation for Story Prose [21.32351425259654]
We look at the domain of commonsense reasoning within story prose, which we call contextual commonsense inference (CCI).
We introduce contextual commonsense inference in sentence selection (CIS2), a simplified task that avoids conflation by eliminating language generation altogether.
arXiv Detail & Related papers (2022-02-16T06:14:37Z)
- GreaseLM: Graph REASoning Enhanced Language Models for Question Answering [159.9645181522436]
GreaseLM is a new model that fuses encoded representations from pretrained LMs and graph neural networks over multiple layers of modality interaction operations.
We show that GreaseLM can more reliably answer questions that require reasoning over both situational constraints and structured knowledge, even outperforming models 8x larger.
arXiv Detail & Related papers (2022-01-21T19:00:05Z)
- Paragraph-level Commonsense Transformers with Recurrent Memory [77.4133779538797]
We train a discourse-aware model, PARA-COMET, that incorporates paragraph-level information to generate coherent commonsense inferences from narratives.
Our results show that PARA-COMET outperforms the sentence-level baselines, particularly in generating inferences that are both coherent and novel.
arXiv Detail & Related papers (2020-10-04T05:24:12Z)
- Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge [62.46091695615262]
We aim to extract commonsense knowledge to improve machine reading comprehension.
We propose to represent relations implicitly by situating structured knowledge in a context.
We employ a teacher-student paradigm to inject multiple types of contextualized knowledge into a student machine reader (a generic distillation sketch follows this list).
arXiv Detail & Related papers (2020-09-12T17:20:01Z)
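As a hypothetical illustration of the distance-score comparison mentioned in the MICO entry above, candidate representations could be ranked by cosine similarity to a query; the encoder outputs are simulated with random tensors, and nothing here is MICO's exact code.
```python
import torch
import torch.nn.functional as F

def distance_scores(query_emb: torch.Tensor, candidate_embs: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between one query embedding and N candidates."""
    return F.cosine_similarity(query_emb.unsqueeze(0), candidate_embs, dim=-1)

query = torch.randn(768)          # e.g., an encoded (entity, relation) prompt
candidates = torch.randn(5, 768)  # e.g., encoded candidate tail entities
print(distance_scores(query, candidates).argmax())  # index of closest candidate
```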
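The teacher-student entry above suggests a standard distillation objective; a generic version, not that paper's exact loss, looks like this.
```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      T: float = 2.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student outputs."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

student, teacher = torch.randn(4, 10), torch.randn(4, 10)
print(distillation_loss(student, teacher))
```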