Related papers: Self-supervised Analogical Learning using Language Models

URL: http://arxiv.org/abs/2502.00996v1
Date: Mon, 03 Feb 2025 02:31:26 GMT
Title: Self-supervised Analogical Learning using Language Models
Authors: Ben Zhou, Sarthak Jain, Yi Zhang, Qiang Ning, Shuai Wang, Yassine Benajiba, Dan Roth
Abstract summary: We propose SAL, a self-supervised analogical learning framework. SAL mimics the human analogy process and trains models to explicitly transfer high-quality symbolic solutions. We show that the resulting models outperform base language models on a wide range of reasoning benchmarks.
Score: 59.64260218737556
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models have been shown to suffer from reasoning inconsistency issues. That is, they fail more in situations unfamiliar to the training data, even though exact or very similar reasoning paths exist in more common cases that they can successfully solve. Such observations motivate us to propose methods that encourage models to understand the high-level and abstract reasoning processes during training instead of only the final answer. This way, models can transfer the exact solution to similar cases, regardless of their relevance to the pre-training data distribution. In this work, we propose SAL, a self-supervised analogical learning framework. SAL mimics the human analogy process and trains models to explicitly transfer high-quality symbolic solutions from cases that they know how to solve to other rare cases in which they tend to fail more. We show that the resulting models after SAL learning outperform base language models on a wide range of reasoning benchmarks, such as StrategyQA, GSM8K, and HotpotQA, by 2% to 20%. At the same time, we show that our model is more generalizable and controllable through analytical studies.

Related papers

Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks [24.55929874173401]
We show that a language model's reasoning capabilities can be improved by training on datasets of chain-of-thought traces from more capable models. Experiments show this approach can yield better performance on reasoning tasks than training on human-annotated datasets.
arXiv Detail & Related papers (2025-12-24T07:35:55Z)
Reasoning with Sampling: Your Base Model is Smarter Than You Think [52.639108524651846]
We propose a simple iterative sampling algorithm leveraging the base models' own likelihoods. We show that our algorithm offers substantial boosts in reasoning that nearly match and even outperform those from RL. Our method does not require training, curated datasets, or a verifier.
arXiv Detail & Related papers (2025-10-16T17:18:11Z)
Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning [42.089912289949154]
This paper presents Self-Error-Instruct (SEI), a framework that addresses model weaknesses and synthesizes more generalized targeted training data. Specifically, we explore a target model on two mathematical datasets, GSM8K and MATH, to pinpoint bad cases. Next, we sample a few bad cases during each generation for each identified error type and input them into the instructor model, which synthesizes additional training data.
arXiv Detail & Related papers (2025-05-28T17:02:47Z)
Implicit Bias-Like Patterns in Reasoning Models [0.0]
Implicit bias refers to automatic or spontaneous mental processes that shape perceptions, judgments, and behaviors. We present a method called the Reasoning Model Implicit Association Test (RM-IAT) for studying implicit bias-like patterns in reasoning models.
arXiv Detail & Related papers (2025-03-14T16:40:02Z)
What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy. By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z)
Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in Large Language Models [7.779982757267302]
We investigate the generality of analogy-making abilities previously claimed for large language models (LLMs). We show that while the performance of humans remains high for all the problems, the GPT models' performance declines sharply on the counterfactual set.
arXiv Detail & Related papers (2024-02-14T05:52:23Z)
Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other. We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains. In this paper, we introduce how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models. This creates a barrier to fusing knowledge across individual models to yield a better single model. We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
Understanding Text Classification Data and Models Using Aggregated Input Salience [2.105564340986074]
In some cases, an input salience method, which highlights the most important parts of the input, may reveal problematic reasoning. In this paper we aim to address these issues and go from understanding single examples to understanding entire datasets and models. Using this methodology we address multiple distinct but common model developer needs by showing how problematic data and model behavior can be identified.
arXiv Detail & Related papers (2022-11-10T11:00:57Z)
Uncertainty Estimation for Language Reward Models [5.33024001730262]
Language models can learn a range of capabilities from unsupervised training on text corpora. It is often easier for humans to choose between options than to provide labeled data, and prior work has achieved state-of-the-art performance by training a reward model from such preference comparisons. We seek to address these problems via uncertainty estimation, which can improve sample efficiency and robustness using active learning and risk-averse reinforcement learning.
arXiv Detail & Related papers (2022-03-14T20:13:21Z)
Identifying and Mitigating Spurious Correlations for Improving Robustness in NLP Models [19.21465581259624]
Many problems can be attributed to models exploiting spurious correlations, or shortcuts between the training data and the task labels. In this paper, we aim to automatically identify such spurious correlations in NLP models at scale. We show that our proposed method can effectively and efficiently identify a scalable set of "shortcuts", and mitigating these leads to more robust models in multiple applications.
arXiv Detail & Related papers (2021-10-14T21:40:03Z)
Exploring Strategies for Generalizable Commonsense Reasoning with Pre-trained Models [62.28551903638434]
We measure the impact of three different adaptation methods on the generalization and accuracy of models. Experiments with two models show that fine-tuning performs best, by learning both the content and the structure of the task, but suffers from overfitting and limited generalization to novel answers. We observe that alternative adaptation methods like prefix-tuning have comparable accuracy, but generalize better to unseen answers and are more robust to adversarial splits.
arXiv Detail & Related papers (2021-09-07T03:13:06Z)
Paired Examples as Indirect Supervision in Latent Decision Models [109.76417071249945]
We introduce a way to leverage paired examples that provide stronger cues for learning latent decisions. We apply our method to improve compositional question answering using neural module networks on the DROP dataset.
arXiv Detail & Related papers (2021-04-05T03:58:30Z)
CausaLM: Causal Model Explanation Through Counterfactual Language Models [33.29636213961804]
CausaLM is a framework for producing causal model explanations using counterfactual language representation models. We show that language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest. A byproduct of our method is a language representation model that is unaffected by the tested concept.
arXiv Detail & Related papers (2020-05-27T15:06:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.