Can Language Models Take A Hint? Prompting for Controllable Contextualized Commonsense Inference
- URL: http://arxiv.org/abs/2410.02202v1
- Date: Thu, 3 Oct 2024 04:32:46 GMT
- Title: Can Language Models Take A Hint? Prompting for Controllable Contextualized Commonsense Inference
- Authors: Pedro Colon-Hernandez, Nanxi Liu, Chelsea Joe, Peter Chin, Claire Yin, Henry Lieberman, Yida Xin, Cynthia Breazeal
- Abstract summary: We introduce "hinting," a data augmentation technique that enhances contextualized commonsense inference.
"Hinting" employs a prefix prompting strategy using both hard and soft prompts to guide the inference process.
Our results show that "hinting" does not compromise the performance of contextual commonsense inference while offering improved controllability.
- Score: 12.941933077524919
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating commonsense assertions within a given story context remains a difficult task for modern language models. Previous research has addressed this problem by aligning commonsense inferences with stories and training language generation models accordingly. One of the challenges is determining which topic or entity in the story should be the focus of an inferred assertion. Prior approaches lack the ability to control specific aspects of the generated assertions. In this work, we introduce "hinting," a data augmentation technique that enhances contextualized commonsense inference. "Hinting" employs a prefix prompting strategy using both hard and soft prompts to guide the inference process. To demonstrate its effectiveness, we apply "hinting" to two contextual commonsense inference datasets: ParaCOMET and GLUCOSE, evaluating its impact on both general and context-specific inference. Furthermore, we evaluate "hinting" by incorporating synonyms and antonyms into the hints. Our results show that "hinting" does not compromise the performance of contextual commonsense inference while offering improved controllability.
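The hard-prompt side of "hinting" can be illustrated with a small sketch. The `[HINT]` template, the field names, and the ATOMIC-style relation `xReact` below are hypothetical placeholders, not the paper's actual format; the sketch only shows the idea of prepending a controllable prefix to the story context before inference:

```python
def add_hint(story, hint_entity=None, hint_relation=None):
    """Prepend a hard-prompt 'hint' to a story context.

    A minimal illustration of prefix prompting for contextualized
    commonsense inference: the hint names the entity and/or relation
    that the generated assertion should focus on. The [HINT] template
    is an assumed format, not the one used in the paper.
    """
    parts = []
    if hint_entity:
        parts.append(f"entity: {hint_entity}")
    if hint_relation:
        parts.append(f"relation: {hint_relation}")
    hint = "[HINT] " + "; ".join(parts) + " [/HINT] " if parts else ""
    return hint + story

# The augmented text would then be fed to the inference model as input.
augmented = add_hint(
    "Maya forgot her umbrella and got soaked on the way home.",
    hint_entity="Maya",
    hint_relation="xReact",
)
print(augmented)
```

A soft-prompt variant would replace the literal `[HINT] …` string with trainable embedding vectors prepended to the input token embeddings, but the control signal plays the same role.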
Related papers
- Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs [61.796960984541464]
We present COM2 (COMplex COMmonsense), a new dataset created by sampling logical queries.
We verbalize them using handcrafted rules and large language models into multiple-choice and text generation questions.
Experiments show that language models trained on COM2 exhibit significant improvements in complex reasoning ability.
arXiv Detail & Related papers (2024-03-12T08:13:52Z) - Enhancing Argument Structure Extraction with Efficient Leverage of Contextual Information [79.06082391992545]
We propose an Efficient Context-aware model (ECASE) that fully exploits contextual information.
We introduce a sequence-attention module and distance-weighted similarity loss to aggregate contextual information and argumentative information.
Our experiments on five datasets from various domains demonstrate that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-08T08:47:10Z) - Self-Consistent Narrative Prompts on Abductive Natural Language Inference [42.201304482932706]
Abduction has long been seen as crucial for narrative comprehension and reasoning about everyday situations.
We propose a prompt tuning model $\alpha$-PACE, which takes self-consistency and inter-sentential coherence into consideration.
arXiv Detail & Related papers (2023-09-15T10:48:10Z) - DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z) - Context-faithful Prompting for Large Language Models [51.194410884263135]
Large language models (LLMs) encode parametric knowledge about world facts.
Their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks.
We assess and enhance LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction with abstention.
arXiv Detail & Related papers (2023-03-20T17:54:58Z) - Adversarial Transformer Language Models for Contextual Commonsense Inference [14.12019824666882]
Contextualized, or discourse-aware, commonsense inference is the task of generating coherent commonsense assertions.
Two problems with the task are a lack of controllability over the topics of the inferred facts and a lack of commonsense knowledge during training.
We develop techniques to address the aforementioned problems in the task.
arXiv Detail & Related papers (2023-02-10T18:21:13Z) - Multiview Contextual Commonsense Inference: A New Dataset and Task [40.566530682082714]
CICEROv2 is a dataset consisting of 8,351 instances from 2,379 dialogues.
It contains multiple human-written answers for each contextual commonsense inference question.
We show that the inferences in CICEROv2 are more semantically diverse than other contextual commonsense inference datasets.
arXiv Detail & Related papers (2022-10-06T13:08:41Z) - Textual Explanations and Critiques in Recommendation Systems [8.406549970145846]
This dissertation focuses on two fundamental challenges in addressing this need.
The first involves explanation generation in a scalable and data-driven manner.
The second challenge is making explanations actionable; we refer to this as critiquing.
arXiv Detail & Related papers (2022-05-15T11:59:23Z) - CIS2: A Simplified Commonsense Inference Evaluation for Story Prose [21.32351425259654]
We look at the domain of commonsense reasoning within story prose, which we call contextual commonsense inference (CCI).
We introduce the task contextual commonsense inference in sentence selection (CIS$^2$), a simplified task that avoids conflation by eliminating language generation altogether.
arXiv Detail & Related papers (2022-02-16T06:14:37Z) - Probing Task-Oriented Dialogue Representation from Language Models [106.02947285212132]
This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.
We fine-tune a feed-forward layer as the classifier probe on top of a fixed pre-trained language model with annotated labels in a supervised way.
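The probing setup described above, a trainable feed-forward classifier on top of a fixed pre-trained model, can be sketched in miniature. The toy encoder, synthetic data, and hyperparameters below are assumptions for illustration; in the actual paper, the frozen features would be hidden states of a pre-trained language model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a fixed pre-trained model: a frozen random projection
# plus nonlinearity. Its weights are never updated during probing.
W_enc = rng.normal(size=(4, 8))

def frozen_encoder(x):
    return np.tanh(x @ W_enc)

# Synthetic inputs with supervised labels recoverable from dimension 0.
X = rng.normal(size=(64, 4))
y = (X[:, 0] > 0).astype(float)

# Trainable probe: a single feed-forward (logistic) layer trained
# with plain gradient descent on binary cross-entropy.
w = np.zeros(8)
b = 0.0
lr = 0.5
for _ in range(300):
    H = frozen_encoder(X)            # features from the frozen model
    p = 1.0 / (1.0 + np.exp(-(H @ w + b)))
    grad = p - y                     # d(BCE)/d(logits)
    w -= lr * H.T @ grad / len(y)    # only the probe parameters move
    b -= lr * grad.mean()

pred = 1.0 / (1.0 + np.exp(-(frozen_encoder(X) @ w + b))) > 0.5
acc = (pred == y.astype(bool)).mean()
print(f"training accuracy of the probe: {acc:.2f}")
```

If the probe classifies well while the encoder stays frozen, the information needed for the task was already present in the fixed representation, which is the logic behind this style of probing.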
arXiv Detail & Related papers (2020-10-26T21:34:39Z) - Paragraph-level Commonsense Transformers with Recurrent Memory [77.4133779538797]
We train a discourse-aware model that incorporates paragraph-level information to generate coherent commonsense inferences from narratives.
Our results show that PARA-COMET outperforms the sentence-level baselines, particularly in generating inferences that are both coherent and novel.
arXiv Detail & Related papers (2020-10-04T05:24:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.