MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning
- URL: http://arxiv.org/abs/2405.05189v2
- Date: Sun, 2 Jun 2024 18:47:44 GMT
- Title: MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning
- Authors: Inderjeet Nair, Lu Wang
- Abstract summary: We study the task of conducting structured reasoning as generating a reasoning graph from natural language input using large language models (LLMs).
Previous approaches have explored various prompting schemes, yet they suffer from error propagation due to their autoregressive nature and single-pass decoding.
We draw inspiration from self-consistency (SC), which involves sampling a diverse set of reasoning chains and taking the majority vote as the final answer.
- Score: 4.928535836750263
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We study the task of conducting structured reasoning as generating a reasoning graph from natural language input using large language models (LLMs). Previous approaches have explored various prompting schemes, yet they suffer from error propagation due to their autoregressive nature and single-pass decoding, which lack error-correction capability. Additionally, relying solely on a single sample may result in the omission of true nodes and edges. To counter this, we draw inspiration from self-consistency (SC), which involves sampling a diverse set of reasoning chains and taking the majority vote as the final answer. To tackle the substantial challenge of applying SC to generated graphs, we propose MIDGARD (MInimum Description length Guided Aggregation of Reasoning in Directed acyclic graph), which leverages a Minimum Description Length (MDL)-based formulation to identify consistent properties among the different graph samples generated by an LLM. This formulation helps reject properties that appear in only a few samples, which are likely to be erroneous, while enabling the inclusion of missing elements without compromising precision. Our method demonstrates superior performance over comparison methods across various structured reasoning tasks, including argument structure extraction, explanation graph generation, inferring dependency relations among actions for everyday tasks, and semantic graph generation from natural texts.
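As a concrete illustration, here is a minimal sketch of the aggregation idea (not the authors' implementation): if the aggregate graph pays a unit description-length cost per edge it keeps, and each sample pays a unit cost per edge it must add or remove to be reconstructed from the aggregate, then MDL-guided selection reduces to a frequency threshold over the sampled edges. Node canonicalization and acyclicity repair, which the full method must handle, are omitted here.

```python
from collections import Counter

def aggregate_graphs(samples, model_cost=1.0, error_cost=1.0):
    """Aggregate LLM-sampled reasoning graphs by minimizing a two-part
    description length: encode the aggregate graph, then encode each
    sample as a set of edge corrections against it.

    samples: list of sets of (source, target) edges.
    """
    n = len(samples)
    counts = Counter(edge for graph in samples for edge in graph)
    aggregate = set()
    for edge, k in counts.items():
        include_dl = model_cost + error_cost * (n - k)  # edge absent from n-k samples
        exclude_dl = error_cost * k                     # edge present in k samples
        if include_dl <= exclude_dl:                    # ties broken toward inclusion
            aggregate.add(edge)
    return aggregate

# Toy usage: three sampled step-dependency graphs for an everyday task.
g1 = {("boil water", "add pasta"), ("add pasta", "drain")}
g2 = {("boil water", "add pasta"), ("add pasta", "drain"), ("drain", "serve")}
g3 = {("boil water", "add pasta"), ("add salt", "boil water")}
print(aggregate_graphs([g1, g2, g3]))
# -> {('boil water', 'add pasta'), ('add pasta', 'drain')}
```

With equal costs this behaves like a majority vote over edges; raising model_cost trades recall for precision, which is the trade-off the MDL view makes explicit.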
Related papers
- Graph-DPEP: Decomposed Plug and Ensemble Play for Few-Shot Document Relation Extraction with Graph-of-Thoughts Reasoning [34.85741925091139]
The Graph-DPEP framework is grounded in the reasoning behind triplet explanation thoughts presented in natural language.
We develop "ensemble-play", reapplying generation on the entire type list by leveraging the reasoning thoughts embedded in a sub-graph.
arXiv Detail & Related papers (2024-11-05T07:12:36Z)
- Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration [0.0]
The paper proposes a threefold enhancement to the LiNGAM-SPP framework.
The need for parameter tuning is eliminated by using the pairwise likelihood ratio in lieu of kNN-based mutual information.
The incorporation of prior knowledge is then enabled by a node-skipping strategy implemented on the graph representation of all causal orderings.
arXiv Detail & Related papers (2024-04-18T05:59:28Z)
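As a toy illustration of the shortest-path view of causal-order search in the entry above: abstract the pairwise likelihood ratio into a hypothetical score matrix M, where M[i][j] > 0 is evidence that variable i precedes variable j, and run Dijkstra over subsets of already-ordered variables. The node-skipping and path-enumeration enhancements are omitted; this is a sketch of the formulation, not the paper's algorithm.

```python
import heapq
import itertools

def shortest_path_ordering(M):
    """Dijkstra over subsets of already-ordered variables.

    Appending j while some k with M[k][j] > 0 is still unordered
    incurs that evidence as a violation cost; the cheapest path from
    the empty set to the full set is the least-violating ordering.
    """
    d = len(M)
    start, goal = frozenset(), frozenset(range(d))
    best = {start: (0.0, [])}
    tick = itertools.count()                 # tiebreaker: frozensets aren't ordered
    heap = [(0.0, next(tick), start)]
    while heap:
        cost, _, s = heapq.heappop(heap)
        if s == goal:
            return best[s][1]
        if cost > best[s][0]:
            continue
        for j in goal - s:
            step = sum(max(0.0, M[k][j]) for k in goal - s if k != j)
            ns, nc = s | {j}, cost + step
            if ns not in best or nc < best[ns][0]:
                best[ns] = (nc, best[s][1] + [j])
                heapq.heappush(heap, (nc, next(tick), ns))

# Toy score matrix encoding evidence for the ordering 0 -> 1 -> 2.
M = [[0.0,  2.0,  1.0],
     [-2.0, 0.0,  3.0],
     [-1.0, -3.0, 0.0]]
print(shortest_path_ordering(M))  # -> [0, 1, 2]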
- Can LLMs perform structured graph reasoning? [4.676784872259775]
Pretrained Large Language Models (LLMs) have demonstrated various reasoning capabilities through language-based prompts alone.
We design various graph reasoning tasks as proxies for semi-structured tasks in this paper.
We benchmark 5 different instruct-finetuned LLMs (GPT-4, GPT-3.5, Claude-2, Llama-2, and PaLM-2) on the aforementioned tasks.
arXiv Detail & Related papers (2024-02-02T09:45:33Z)
- Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models [81.01397924280612]
Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations.
We introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting), an iterative bootstrapping approach for selecting exemplars and generating reasoning chains.
arXiv Detail & Related papers (2023-04-23T13:54:39Z)
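The bootstrapping loop of the Iter-CoT entry above can be schematized as follows. `llm` is a placeholder callable (hypothetical, not part of the paper), and the actual method's revision step and difficulty-aware exemplar selection are simplified away.

```python
def bootstrap_exemplars(llm, train_items, rounds=3, pool_size=8):
    """Iteratively select demonstration exemplars.

    llm: placeholder callable llm(question, demos) -> (chain, answer);
    train_items: list of (question, gold_answer) pairs.
    """
    exemplars = []  # (question, reasoning_chain) pairs used as demos
    for _ in range(rounds):
        verified = []
        for question, gold in train_items:
            chain, answer = llm(question, demos=exemplars)
            if answer == gold:  # the chain reached the right answer: keep it
                verified.append((question, chain))
        # Freshly verified chains become the next round's demonstrations.
        exemplars = verified[:pool_size] or exemplars
    return exemplars
```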
- OrthoReg: Improving Graph-regularized MLPs via Orthogonality Regularization [66.30021126251725]
Graph Neural Networks (GNNs) are currently dominant in modeling graph-structured data.
Graph-regularized MLPs (GR-MLPs) implicitly inject the graph structure information into model weights, while their performance can hardly match that of GNNs in most tasks.
We show that GR-MLPs suffer from dimensional collapse, a phenomenon in which a few largest eigenvalues dominate the embedding space.
We propose OrthoReg, a novel GR-MLP model to mitigate the dimensional collapse issue.
arXiv Detail & Related papers (2023-01-31T21:20:48Z)
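The flavor of orthogonality regularization in the entry above can be illustrated with a simple penalty on correlation between embedding dimensions (the exact OrthoReg objective may differ): it is zero when dimensions are uncorrelated with unit variance and large under rank collapse.

```python
import numpy as np

def orthogonality_penalty(Z):
    """|| corr(Z) - I ||_F^2 over embedding dimensions.

    Zero when dimensions are uncorrelated with unit variance; large
    when a few directions dominate (dimensional collapse).
    """
    Z = (Z - Z.mean(axis=0)) / (Z.std(axis=0) + 1e-8)
    C = (Z.T @ Z) / Z.shape[0]        # dimension-wise correlation matrix
    return float(((C - np.eye(Z.shape[1])) ** 2).sum())

rng = np.random.default_rng(0)
collapsed = rng.normal(size=(256, 1)) @ rng.normal(size=(1, 8))  # rank-1 embeddings
healthy = rng.normal(size=(256, 8))
print(orthogonality_penalty(collapsed))  # large (~56): fully collapsed
print(orthogonality_penalty(healthy))    # small: near-orthogonal dimensions
```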
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
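One common way to realize multi-granularity sparse attention, sketched below, combines a local sliding window with a few global positions (e.g., sentence boundaries). HETFORMER's actual pattern is heterogeneous over tokens, sentences, and entities, so treat this as a simplification rather than the model's attention scheme.

```python
import numpy as np

def sparse_attention_mask(seq_len, window=2, global_positions=()):
    """Boolean (query, key) mask: local sliding window plus a few
    positions that attend everywhere and are attended to by everyone."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True      # local window around each token
    for g in global_positions:
        mask[g, :] = True          # global position attends to all tokens
        mask[:, g] = True          # all tokens attend to the global position
    return mask

print(sparse_attention_mask(8, window=1, global_positions=(0,)).astype(int))
```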
- Learning compositional structures for semantic graph parsing [81.41592892863979]
We show how AM dependency parsing can be trained directly on a neural latent-variable model.
Our model picks up on several linguistic phenomena on its own and achieves comparable accuracy to supervised training.
arXiv Detail & Related papers (2021-06-08T14:20:07Z)
- Graph Ensemble Learning over Multiple Dependency Trees for Aspect-level Sentiment Classification [37.936820137442254]
We propose a simple yet effective graph ensemble technique, GraphMerge, to make use of the predictions from different parsers.
Instead of assigning one set of model parameters to each dependency tree, we first combine the dependency relations from different parses before applying GNNs over the resulting graph.
Our experiments on the SemEval 2014 Task 4 and ACL 14 Twitter datasets show that our GraphMerge model not only outperforms models with a single dependency tree, but also beats other ensemble models without adding model parameters.
arXiv Detail & Related papers (2021-03-12T22:27:23Z)
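A minimal sketch of the merge-then-GNN idea from the entry above: union the labeled edges from several parsers into one graph, then apply a single message-passing step. The one-hop mean aggregation is a stand-in for whatever GNN layer is actually used, and parser identity could additionally be kept as an edge feature.

```python
import numpy as np

def merge_parses(parses):
    """Union the labeled edges of several dependency parses.
    parses: list of edge lists [(head, dependent, relation), ...]."""
    return sorted({edge for parse in parses for edge in parse})

def one_hop_aggregate(num_tokens, edges, X):
    """One message-passing step over the merged graph: every token
    averages its own and its neighbors' feature vectors."""
    A = np.eye(num_tokens)
    for head, dep, _ in edges:
        A[head, dep] = A[dep, head] = 1.0
    return (A / A.sum(axis=1, keepdims=True)) @ X

parse_a = [(1, 0, "det"), (1, 2, "nsubj")]        # edges from parser A
parse_b = [(1, 0, "det"), (1, 3, "obj")]          # edges from parser B
edges = merge_parses([parse_a, parse_b])          # 3 unique edges after the merge
X = np.random.default_rng(0).normal(size=(4, 5))  # token features
print(one_hop_aggregate(4, edges, X).shape)       # -> (4, 5)
```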
- A Differentiable Relaxation of Graph Segmentation and Alignment for AMR Parsing [75.36126971685034]
We treat alignment and segmentation as latent variables in our model and induce them as part of end-to-end training.
Our method's performance also approaches that of a model relying on the segmentation rules of Lyu and Titov (2018), which were hand-crafted to handle individual AMR constructions.
arXiv Detail & Related papers (2020-10-23T21:22:50Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
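The auxiliary objective of the entry above can be sketched as aligning the model's input-gradient with the edit direction between an example and its counterfactual. Here the gradient of a linear scorer is just its weight vector; in practice it would come from autodiff. This is an illustration of the idea, not the paper's exact loss.

```python
import numpy as np

def gradient_supervision_loss(grad_x, x, x_cf):
    """1 - cosine(input-gradient, counterfactual edit direction):
    zero when the model is sensitive exactly along the edit that
    flips the label."""
    delta = x_cf - x
    cos = grad_x @ delta / (np.linalg.norm(grad_x) * np.linalg.norm(delta) + 1e-8)
    return 1.0 - cos

w = np.array([1.0, 0.0])               # linear scorer f(x) = w @ x, so grad_x f = w
x = np.array([0.0, 0.0])
x_cf_aligned = np.array([1.0, 0.0])    # edit along the feature the model uses
x_cf_orthogonal = np.array([0.0, 1.0])
print(gradient_supervision_loss(w, x, x_cf_aligned))     # ~0.0
print(gradient_supervision_loss(w, x, x_cf_orthogonal))  # ~1.0
```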