Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks
- URL: http://arxiv.org/abs/2409.06173v3
- Date: Thu, 17 Oct 2024 17:02:02 GMT
- Title: Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks
- Authors: Georgios Chochlakis, Niyantha Maruthu Pandiyan, Kristina Lerman, Shrikanth Narayanan,
- Abstract summary: In-Context Learning (ICL) in Large Language Models (LLM) has emerged as the dominant technique for performing natural language tasks.
We show that ICL relies mostly on the retrieval of task priors and less so on "learning" to perform tasks.
We find that, surprisingly, Chain-of-Thought (CoT) indeed suffers from the same posterior collapse as ICL for larger language models.
- Score: 25.562937159039038
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-Context Learning (ICL) in Large Language Models (LLM) has emerged as the dominant technique for performing natural language tasks, as it does not require updating the model parameters with gradient-based methods. ICL promises to "adapt" the LLM to perform the present task at a competitive or state-of-the-art level at a fraction of the computational cost. ICL can be augmented by incorporating the reasoning process to arrive at the final label explicitly in the prompt, a technique called Chain-of-Thought (CoT) prompting. However, recent work has found that ICL relies mostly on the retrieval of task priors and less so on "learning" to perform tasks, especially for complex subjective domains like emotion and morality, where priors ossify posterior predictions. In this work, we examine whether "enabling" reasoning also creates the same behavior in LLMs, wherein the format of CoT retrieves reasoning priors that remain relatively unchanged despite the evidence in the prompt. We find that, surprisingly, CoT indeed suffers from the same posterior collapse as ICL for larger language models. Code is avalaible at https://github.com/gchochla/cot-priors.
Related papers
- Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors [74.04775677110179]
In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs)
In this work, we examine whether this is the result of the aggregation used in corresponding datasets, where trying to combine low-agreement, disparate annotations might lead to annotation artifacts that create detrimental noise in the prompt.
Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead.
arXiv Detail & Related papers (2024-10-17T17:16:00Z) - The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition [74.04775677110179]
In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLM)
We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions.
Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain.
arXiv Detail & Related papers (2024-03-25T19:07:32Z) - Can Separators Improve Chain-of-Thought Prompting? [10.398343318429367]
Chain-of-thought (CoT) prompting is a simple and effective method for improving the reasoning capabilities of Large Language Models (LLMs)
Inspired by human cognition, we introduce COT-SEP, a method that strategically employs separators at the end of each exemplar in CoT prompting.
arXiv Detail & Related papers (2024-02-16T12:46:16Z) - Chain-of-Thought Reasoning Without Prompting [40.92854235219315]
CoT reasoning paths can be elicited from pre-trained language models by simply altering the textitdecoding process.
The presence of a CoT in the decoding path correlates with a higher confidence in the model's decoded answer.
arXiv Detail & Related papers (2024-02-15T18:55:41Z) - LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning [61.7853049843921]
Chain-of-thought (CoT) prompting is a popular in-context learning approach for large language models (LLMs)
This paper introduces a new approach named Latent Reasoning Skills (LaRS) that employs unsupervised learning to create a latent space representation of rationales.
arXiv Detail & Related papers (2023-12-07T20:36:10Z) - CLadder: Assessing Causal Reasoning in Language Models [82.8719238178569]
We investigate whether large language models (LLMs) can coherently reason about causality.
We propose a new NLP task, causal inference in natural language, inspired by the "causal inference engine" postulated by Judea Pearl et al.
arXiv Detail & Related papers (2023-12-07T15:12:12Z) - LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) are equipped to deal with larger context lengths.
LLMs can consistently outperform the SotA when the target text is large.
Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z) - Exploring Continual Learning for Code Generation Models [80.78036093054855]
Continual Learning (CL) is an important aspect that remains underexplored in the code domain.
We introduce a benchmark called CodeTask-CL that covers a wide range of tasks, including code generation, translation, summarization, and refinement.
We find that effective methods like Prompt Pooling (PP) suffer from catastrophic forgetting due to the unstable training of the prompt selection mechanism.
arXiv Detail & Related papers (2023-07-05T16:58:39Z) - Explaining Emergent In-Context Learning as Kernel Regression [61.57151500616111]
Large language models (LLMs) have initiated a paradigm shift in transfer learning.
In this paper, we investigate the reason why a transformer-based language model can accomplish in-context learning after pre-training.
We find that during ICL, the attention and hidden features in LLMs match the behaviors of a kernel regression.
arXiv Detail & Related papers (2023-05-22T06:45:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.