Shortcut Learning in In-Context Learning: A Survey
- URL: http://arxiv.org/abs/2411.02018v1
- Date: Mon, 04 Nov 2024 12:13:04 GMT
- Title: Shortcut Learning in In-Context Learning: A Survey
- Authors: Rui Song, Yingji Li, Fausto Giunchiglia, Hao Xu
- Abstract summary: Shortcut learning refers to the phenomenon where models employ simple, non-robust decision rules in practical tasks.
This paper provides a novel perspective to review relevant research on shortcut learning in In-Context Learning (ICL).
- Abstract: Shortcut learning refers to the phenomenon where models employ simple, non-robust decision rules in practical tasks, which hinders their generalization and robustness. With the rapid development of large language models (LLMs) in recent years, an increasing number of studies have shown the impact of shortcut learning on LLMs. This paper provides a novel perspective to review relevant research on shortcut learning in In-Context Learning (ICL). It conducts a detailed exploration of the types of shortcuts in ICL tasks, their causes, available benchmarks, and strategies for mitigating shortcuts. Based on corresponding observations, it summarizes the unresolved issues in existing research and attempts to outline the future research landscape of shortcut learning.
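The abstract's core notion of a "simple, non-robust decision rule" can be made concrete with a toy sketch (not from the paper; the demonstrations, labels, and trigger word are all hypothetical). In these in-context demonstrations, every positive example happens to contain the word "great", so a model could adopt the shortcut "contains 'great' → positive" instead of genuine sentiment understanding:

```python
# Hypothetical in-context demonstrations with a spurious correlation:
# the surface feature "great" co-occurs with every positive label.
demonstrations = [
    ("The plot was great and moving", "positive"),
    ("A great cast elevates the film", "positive"),
    ("Dull pacing and flat characters", "negative"),
    ("The dialogue felt wooden", "negative"),
]

def shortcut_predict(text: str) -> str:
    """Non-robust shortcut rule: keys on a surface feature, not meaning."""
    return "positive" if "great" in text.lower() else "negative"

# The shortcut fits the demonstrations perfectly...
in_distribution_acc = sum(
    shortcut_predict(x) == y for x, y in demonstrations
) / len(demonstrations)
print(in_distribution_acc)  # 1.0

# ...but fails as soon as the spurious correlation breaks.
print(shortcut_predict("Not great at all, I hated it"))  # "positive" (wrong)
```

This is the failure mode the surveyed work studies: high accuracy when the demonstration distribution preserves the correlation, and brittle predictions when it does not.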
Related papers
- Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models
This study addresses the overlooked impact of subtler, more complex shortcuts that compromise model reliability beyond oversimplified shortcuts.
We introduce a comprehensive benchmark that categorizes shortcuts into occurrence, style, and concept.
Our research systematically investigates models' resilience and susceptibility to sophisticated shortcuts.
arXiv Detail & Related papers (2024-09-26T01:17:42Z)
- C-ICL: Contrastive In-context Learning for Information Extraction
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Learning Shortcuts: On the Misleading Promise of NLU in Language Models
Large language models (LLMs) have enabled significant performance gains in the field of natural language processing.
Recent studies have found that LLMs often resort to shortcuts when performing tasks, creating an illusion of enhanced performance while lacking generalizability in their decision rules.
arXiv Detail & Related papers (2024-01-17T21:55:15Z)
- Large Language Models Can be Lazy Learners: Analyze Shortcuts in In-Context Learning
Large language models (LLMs) have recently shown great potential for in-context learning.
This paper investigates the reliance of LLMs on shortcuts or spurious correlations within prompts.
We uncover a surprising finding that larger models are more likely to utilize shortcuts in prompts during inference.
arXiv Detail & Related papers (2023-05-26T20:56:30Z)
- A Survey on In-context Learning
In-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP).
We first present a formal definition of ICL and clarify its relationship to related studies.
We then organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis.
arXiv Detail & Related papers (2022-12-31T15:57:09Z)
- A Survey on Measuring and Mitigating Reasoning Shortcuts in Machine Reading Comprehension
We focus on the field of machine reading comprehension (MRC), an important task for showcasing high-level language understanding.
We highlight two concerns for shortcut mitigation in MRC: (1) the lack of public challenge sets, a necessary component for effective and reusable evaluation, and (2) the lack of certain mitigation techniques that are prominent in other areas.
arXiv Detail & Related papers (2022-09-05T08:19:26Z)
- Shortcut Learning of Large Language Models in Natural Language Understanding
Large language models (LLMs) have achieved state-of-the-art performance on a series of natural language understanding tasks.
They might rely on dataset bias and artifacts as shortcuts for prediction, which has significantly affected their generalizability and adversarial robustness.
arXiv Detail & Related papers (2022-08-25T03:51:39Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks?
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
- Why Machine Reading Comprehension Models Learn Shortcuts?
We argue that a larger proportion of shortcut questions in the training data makes models rely excessively on shortcut tricks.
A thorough empirical analysis shows that MRC models tend to learn shortcut questions earlier than challenging questions.
arXiv Detail & Related papers (2021-06-02T08:43:12Z)
- Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey
Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback.
We present a framework for curriculum learning (CL) in RL, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals.
arXiv Detail & Related papers (2020-03-10T20:41:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.