MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning
- URL: http://arxiv.org/abs/2405.05189v2
- Date: Sun, 2 Jun 2024 18:47:44 GMT
- Title: MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning
- Authors: Inderjeet Nair, Lu Wang
- Abstract summary: We study the task of conducting structured reasoning as generating a reasoning graph from natural language input using large language models (LLMs).
Previous approaches have explored various prompting schemes, yet they suffer from error propagation due to their autoregressive nature and single-pass decoding.
We draw inspiration from self-consistency (SC), which involves sampling a diverse set of reasoning chains and taking the majority vote as the final answer.
- Score: 4.928535836750263
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We study the task of conducting structured reasoning as generating a reasoning graph from natural language input using large language models (LLMs). Previous approaches have explored various prompting schemes, yet they suffer from error propagation due to their autoregressive nature and single-pass decoding, which lack error-correction capability. Additionally, relying solely on a single sample may result in the omission of true nodes and edges. To counter this, we draw inspiration from self-consistency (SC), which involves sampling a diverse set of reasoning chains and taking the majority vote as the final answer. To tackle the substantial challenge of applying SC to generated graphs, we propose MIDGARD (MInimum Description length Guided Aggregation of Reasoning in Directed acyclic graph), which leverages a Minimum Description Length (MDL)-based formulation to identify consistent properties among the different graph samples generated by an LLM. This formulation helps reject properties that appear in only a few samples, which are likely to be erroneous, while enabling the inclusion of missing elements without compromising precision. Our method demonstrates superior performance over comparison methods across various structured reasoning tasks, including argument structure extraction, explanation graph generation, inferring dependency relations among actions for everyday tasks, and semantic graph generation from natural texts.
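As a concrete illustration, here is a minimal sketch of the aggregation idea (not the authors' implementation): if the aggregate graph pays a unit description-length cost per edge it keeps, and each sample pays a unit cost per edge it must add or remove to be reconstructed from the aggregate, then MDL-guided selection reduces to a frequency threshold over the sampled edges. Node canonicalization and acyclicity repair, which the full method must handle, are omitted here.

```python
from collections import Counter

def aggregate_graphs(samples, model_cost=1.0, error_cost=1.0):
    """Aggregate LLM-sampled reasoning graphs by minimizing a two-part
    description length: encode the aggregate graph, then encode each
    sample as a set of edge corrections against it.

    samples: list of sets of (source, target) edges.
    """
    n = len(samples)
    counts = Counter(edge for graph in samples for edge in graph)
    aggregate = set()
    for edge, k in counts.items():
        include_dl = model_cost + error_cost * (n - k)  # edge absent from n-k samples
        exclude_dl = error_cost * k                     # edge present in k samples
        if include_dl <= exclude_dl:                    # ties broken toward inclusion
            aggregate.add(edge)
    return aggregate

# Toy usage: three sampled step-dependency graphs for an everyday task.
g1 = {("boil water", "add pasta"), ("add pasta", "drain")}
g2 = {("boil water", "add pasta"), ("add pasta", "drain"), ("drain", "serve")}
g3 = {("boil water", "add pasta"), ("add salt", "boil water")}
print(aggregate_graphs([g1, g2, g3]))
# -> {('boil water', 'add pasta'), ('add pasta', 'drain')}
```

With equal costs this behaves like a majority vote over edges; raising model_cost trades recall for precision, which is the trade-off the MDL view makes explicit.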
Related papers
- Graph-DPEP: Decomposed Plug and Ensemble Play for Few-Shot Document Relation Extraction with Graph-of-Thoughts Reasoning [34.85741925091139]
The Graph-DPEP framework is grounded in the reasoning behind triplet explanation thoughts presented in natural language.
We develop "ensemble-play", reapplying generation on the entire type list by leveraging the reasoning thoughts embedded in a sub-graph.
arXiv Detail & Related papers (2024-11-05T07:12:36Z)
- Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration [0.0]
The paper proposes a threefold enhancement to the LiNGAM-SPP framework.
The need for parameter tuning is eliminated by using the pairwise likelihood ratio in lieu of kNN-based mutual information.
The incorporation of prior knowledge is then enabled by a node-skipping strategy implemented on the graph representation of all causal orderings.
arXiv Detail & Related papers (2024-04-18T05:59:28Z)
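As a toy illustration of the shortest-path view of causal-order search in the entry above: abstract the pairwise likelihood ratio into a hypothetical score matrix M, where M[i][j] > 0 is evidence that variable i precedes variable j, and run Dijkstra over subsets of already-ordered variables. The node-skipping and path-enumeration enhancements are omitted; this is a sketch of the formulation, not the paper's algorithm.

```python
import heapq
import itertools

def shortest_path_ordering(M):
    """Dijkstra over subsets of already-ordered variables.

    Appending j while some k with M[k][j] > 0 is still unordered
    incurs that evidence as a violation cost; the cheapest path from
    the empty set to the full set is the least-violating ordering.
    """
    d = len(M)
    start, goal = frozenset(), frozenset(range(d))
    best = {start: (0.0, [])}
    tick = itertools.count()                 # tiebreaker: frozensets aren't ordered
    heap = [(0.0, next(tick), start)]
    while heap:
        cost, _, s = heapq.heappop(heap)
        if s == goal:
            return best[s][1]
        if cost > best[s][0]:
            continue
        for j in goal - s:
            step = sum(max(0.0, M[k][j]) for k in goal - s if k != j)
            ns, nc = s | {j}, cost + step
            if ns not in best or nc < best[ns][0]:
                best[ns] = (nc, best[s][1] + [j])
                heapq.heappush(heap, (nc, next(tick), ns))

# Toy score matrix encoding evidence for the ordering 0 -> 1 -> 2.
M = [[0.0,  2.0,  1.0],
     [-2.0, 0.0,  3.0],
     [-1.0, -3.0, 0.0]]
print(shortest_path_ordering(M))  # -> [0, 1, 2]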
- Can LLMs perform structured graph reasoning? [4.676784872259775]
Pretrained Large Language Models (LLMs) have demonstrated various reasoning capabilities through language-based prompts alone.
We design various graph reasoning tasks as proxies for semi-structured tasks in this paper.
We benchmark 5 different instruct-finetuned LLMs (GPT-4, GPT-3.5, Claude-2, Llama-2, and PaLM-2) on the aforementioned tasks.
arXiv Detail & Related papers (2024-02-02T09:45:33Z)
- Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models [81.01397924280612]
Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations.
We introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting), an iterative bootstrapping approach for selecting exemplars and generating reasoning chains.
arXiv Detail & Related papers (2023-04-23T13:54:39Z)
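The bootstrapping loop of the Iter-CoT entry above can be schematized as follows. `llm` is a placeholder callable (hypothetical, not part of the paper), and the actual method's revision step and difficulty-aware exemplar selection are simplified away.

```python
def bootstrap_exemplars(llm, train_items, rounds=3, pool_size=8):
    """Iteratively select demonstration exemplars.

    llm: placeholder callable llm(question, demos) -> (chain, answer);
    train_items: list of (question, gold_answer) pairs.
    """
    exemplars = []  # (question, reasoning_chain) pairs used as demos
    for _ in range(rounds):
        verified = []
        for question, gold in train_items:
            chain, answer = llm(question, demos=exemplars)
            if answer == gold:  # the chain reached the right answer: keep it
                verified.append((question, chain))
        # Freshly verified chains become the next round's demonstrations.
        exemplars = verified[:pool_size] or exemplars
    return exemplars
```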
- OrthoReg: Improving Graph-regularized MLPs via Orthogonality Regularization [66.30021126251725]
Graph Neural Networks (GNNs) are currently dominant in modeling graph-structured data.
Graph-regularized MLPs (GR-MLPs) implicitly inject the graph structure information into model weights, while their performance can hardly match that of GNNs in most tasks.
We show that GR-MLPs suffer from dimensional collapse, a phenomenon in which a few largest eigenvalues dominate the embedding space.
We propose OrthoReg, a novel GR-MLP model to mitigate the dimensional collapse issue.
arXiv Detail & Related papers (2023-01-31T21:20:48Z)
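The flavor of orthogonality regularization in the entry above can be illustrated with a simple penalty on correlation between embedding dimensions (the exact OrthoReg objective may differ): it is zero when dimensions are uncorrelated with unit variance and large under rank collapse.

```python
import numpy as np

def orthogonality_penalty(Z):
    """|| corr(Z) - I ||_F^2 over embedding dimensions.

    Zero when dimensions are uncorrelated with unit variance; large
    when a few directions dominate (dimensional collapse).
    """
    Z = (Z - Z.mean(axis=0)) / (Z.std(axis=0) + 1e-8)
    C = (Z.T @ Z) / Z.shape[0]        # dimension-wise correlation matrix
    return float(((C - np.eye(Z.shape[1])) ** 2).sum())

rng = np.random.default_rng(0)
collapsed = rng.normal(size=(256, 1)) @ rng.normal(size=(1, 8))  # rank-1 embeddings
healthy = rng.normal(size=(256, 8))
print(orthogonality_penalty(collapsed))  # large (~56): fully collapsed
print(orthogonality_penalty(healthy))    # small: near-orthogonal dimensions
```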
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
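One common way to realize multi-granularity sparse attention, sketched below, combines a local sliding window with a few global positions (e.g., sentence boundaries). HETFORMER's actual pattern is heterogeneous over tokens, sentences, and entities, so treat this as a simplification rather than the model's attention scheme.

```python
import numpy as np

def sparse_attention_mask(seq_len, window=2, global_positions=()):
    """Boolean (query, key) mask: local sliding window plus a few
    positions that attend everywhere and are attended to by everyone."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True      # local window around each token
    for g in global_positions:
        mask[g, :] = True          # global position attends to all tokens
        mask[:, g] = True          # all tokens attend to the global position
    return mask

print(sparse_attention_mask(8, window=1, global_positions=(0,)).astype(int))
```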
- Learning compositional structures for semantic graph parsing [81.41592892863979]
We show how AM dependency parsing can be trained directly on a neural latent-variable model.
Our model picks up on several linguistic phenomena on its own and achieves comparable accuracy to supervised training.
arXiv Detail & Related papers (2021-06-08T14:20:07Z)
- Graph Ensemble Learning over Multiple Dependency Trees for Aspect-level Sentiment Classification [37.936820137442254]
We propose a simple yet effective graph ensemble technique, GraphMerge, to make use of the predictions from different parsers.
Instead of assigning one set of model parameters to each dependency tree, we first combine the dependency relations from different parses before applying GNNs over the resulting graph.
Our experiments on the SemEval 2014 Task 4 and ACL 14 Twitter datasets show that our GraphMerge model not only outperforms models with a single dependency tree, but also beats other ensemble models without adding model parameters.
arXiv Detail & Related papers (2021-03-12T22:27:23Z)
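A minimal sketch of the merge-then-GNN idea from the entry above: union the labeled edges from several parsers into one graph, then apply a single message-passing step. The one-hop mean aggregation is a stand-in for whatever GNN layer is actually used, and parser identity could additionally be kept as an edge feature.

```python
import numpy as np

def merge_parses(parses):
    """Union the labeled edges of several dependency parses.
    parses: list of edge lists [(head, dependent, relation), ...]."""
    return sorted({edge for parse in parses for edge in parse})

def one_hop_aggregate(num_tokens, edges, X):
    """One message-passing step over the merged graph: every token
    averages its own and its neighbors' feature vectors."""
    A = np.eye(num_tokens)
    for head, dep, _ in edges:
        A[head, dep] = A[dep, head] = 1.0
    return (A / A.sum(axis=1, keepdims=True)) @ X

parse_a = [(1, 0, "det"), (1, 2, "nsubj")]        # edges from parser A
parse_b = [(1, 0, "det"), (1, 3, "obj")]          # edges from parser B
edges = merge_parses([parse_a, parse_b])          # 3 unique edges after the merge
X = np.random.default_rng(0).normal(size=(4, 5))  # token features
print(one_hop_aggregate(4, edges, X).shape)       # -> (4, 5)
```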
- A Differentiable Relaxation of Graph Segmentation and Alignment for AMR Parsing [75.36126971685034]
We treat alignment and segmentation as latent variables in our model and induce them as part of end-to-end training.
Our method's performance also approaches that of a model relying on the segmentation rules of Lyu and Titov (2018), which were hand-crafted to handle individual AMR constructions.
arXiv Detail & Related papers (2020-10-23T21:22:50Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
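The auxiliary objective of the entry above can be sketched as aligning the model's input-gradient with the edit direction between an example and its counterfactual. Here the gradient of a linear scorer is just its weight vector; in practice it would come from autodiff. This is an illustration of the idea, not the paper's exact loss.

```python
import numpy as np

def gradient_supervision_loss(grad_x, x, x_cf):
    """1 - cosine(input-gradient, counterfactual edit direction):
    zero when the model is sensitive exactly along the edit that
    flips the label."""
    delta = x_cf - x
    cos = grad_x @ delta / (np.linalg.norm(grad_x) * np.linalg.norm(delta) + 1e-8)
    return 1.0 - cos

w = np.array([1.0, 0.0])               # linear scorer f(x) = w @ x, so grad_x f = w
x = np.array([0.0, 0.0])
x_cf_aligned = np.array([1.0, 0.0])    # edit along the feature the model uses
x_cf_orthogonal = np.array([0.0, 1.0])
print(gradient_supervision_loss(w, x, x_cf_aligned))     # ~0.0
print(gradient_supervision_loss(w, x, x_cf_orthogonal))  # ~1.0
```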