MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations
- URL: http://arxiv.org/abs/2310.11634v1
- Date: Wed, 18 Oct 2023 00:02:38 GMT
- Title: MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations
- Authors: Arkil Patel, Satwik Bhattamishra, Siva Reddy, Dzmitry Bahdanau
- Abstract summary: Humans possess a remarkable ability to assign novel interpretations to linguistic expressions.
Large Language Models (LLMs) have a knowledge cutoff and are costly to finetune repeatedly.
We systematically analyse the ability of LLMs to acquire novel interpretations using in-context learning.
- Score: 37.13707912132472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans possess a remarkable ability to assign novel interpretations to
linguistic expressions, enabling them to learn new words and understand
community-specific connotations. However, Large Language Models (LLMs) have a
knowledge cutoff and are costly to finetune repeatedly. Therefore, it is
crucial for LLMs to learn novel interpretations in-context. In this paper, we
systematically analyse the ability of LLMs to acquire novel interpretations
using in-context learning. To facilitate our study, we introduce MAGNIFICo, an
evaluation suite implemented within a text-to-SQL semantic parsing framework
that incorporates diverse tokens and prompt settings to simulate real-world
complexity. Experimental results on MAGNIFICo demonstrate that LLMs exhibit a
surprisingly robust capacity for comprehending novel interpretations from
natural language descriptions as well as from discussions within long
conversations. Nevertheless, our findings also highlight the need for further
improvements, particularly when interpreting unfamiliar words or when composing
multiple novel interpretations simultaneously in the same example.
Additionally, our analysis uncovers the semantic predispositions in LLMs and
reveals the impact of recency bias for information presented in long contexts.
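To make the evaluation setting concrete, the sketch below illustrates how a novel interpretation could be introduced in-context in a text-to-SQL task. This is an illustrative reconstruction rather than the actual MAGNIFICo format: the invented word "lumax", its assigned meaning, the toy schema, and the `build_prompt` helper are all hypothetical.

```python
# Minimal sketch (not the actual MAGNIFICo format): teaching an LLM a
# novel interpretation in-context for text-to-SQL. The word "lumax",
# its meaning, the schema, and this helper are hypothetical examples.

NOVEL_INTERPRETATION = (
    "In this domain, 'lumax' refers to the row with the maximum value "
    "of a given column."
)

SCHEMA = "CREATE TABLE employees (id INT, name TEXT, salary INT);"


def build_prompt(description: str, schema: str, question: str) -> str:
    """Assemble a prompt that introduces the novel word via a natural
    language description, then asks for SQL for a question using it."""
    return (
        f"{description}\n\n"
        f"Database schema:\n{schema}\n\n"
        f"Question: {question}\n"
        "SQL:"
    )


prompt = build_prompt(
    NOVEL_INTERPRETATION,
    SCHEMA,
    "Show the name of the lumax employee by salary.",
)
print(prompt)

# A model that has acquired the interpretation in-context should
# produce something like:
#   SELECT name FROM employees ORDER BY salary DESC LIMIT 1;
```

A model that has acquired the interpretation should map the unfamiliar word onto the described SQL pattern; per the abstract, MAGNIFICo varies both the surface forms that carry the interpretation and how it is presented, from direct descriptions to discussions within long conversations.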
Related papers
- Investigating Expert-in-the-Loop LLM Discourse Patterns for Ancient Intertextual Analysis [0.0]
The study demonstrates that large language models can detect direct quotations, allusions, and echoes between texts.
The model struggles with long query passages and the inclusion of false intertextual dependencies.
The expert-in-the-loop methodology presented offers a scalable approach for intertextual research.
arXiv Detail & Related papers (2024-09-03T13:23:11Z)
- When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models [59.84769254832941]
We propose a FaLlacy Understanding Benchmark (FLUB) containing cunning texts that are easy for humans to understand but difficult for models to grasp.
Specifically, the cunning texts that FLUB focuses on mainly consist of tricky, humorous, and misleading texts collected from real internet environments.
Based on FLUB, we investigate the performance of multiple representative and advanced LLMs.
arXiv Detail & Related papers (2024-02-16T22:12:53Z)
- Can Large Language Models Understand Context? [17.196362853457412]
This paper introduces a context understanding benchmark by adapting existing datasets to suit the evaluation of generative models.
Experimental results indicate that pre-trained dense models struggle with understanding more nuanced contextual features when compared to state-of-the-art fine-tuned models.
As LLM compression holds growing significance in both research and real-world applications, we assess the context understanding of quantized models under in-context-learning settings.
arXiv Detail & Related papers (2024-02-01T18:55:29Z)
- Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be communicated to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
- Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
- IERL: Interpretable Ensemble Representation Learning -- Combining CrowdSourced Knowledge and Distributed Semantic Representations [11.008412414253662]
Large Language Models (LLMs) encode meanings of words in the form of distributed semantics.
Recent studies have shown that LLMs tend to generate unintended, inconsistent, or wrong texts as outputs.
We propose a novel ensemble learning method, Interpretable Ensemble Representation Learning (IERL), that systematically combines LLM and crowdsourced knowledge representations.
arXiv Detail & Related papers (2023-06-24T05:02:34Z)
- Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks.
We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
- Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language processing and machine learning communities over recent years.
Despite numerous successful applications, the underlying mechanism of such in-context capabilities remains unclear.
In this work, we hypothesize that the learned semantics of language tokens do most of the heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z)