Related papers: Meta In-Context Learning Makes Large Language Models Better Zero and Few-Shot Relation Extractors

URL: http://arxiv.org/abs/2404.17807v1
Date: Sat, 27 Apr 2024 07:06:39 GMT
Title: Meta In-Context Learning Makes Large Language Models Better Zero and Few-Shot Relation Extractors
Authors: Guozheng Li, Peng Wang, Jiajun Liu, Yikai Guo, Ke Ji, Ziyu Shang, Zijie Xu,
Abstract summary: Micre (Meta In-Context learning of LLMs for Relation Extraction) is a new meta-training framework for zero and few-shot relation extraction. We show that Micre can transfer relation semantic knowledge via relation label names during inference on target RE datasets.
Score: 9.881102419679673
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Relation extraction (RE) is an important task that aims to identify the relationships between entities in texts. While large language models (LLMs) have revealed remarkable in-context learning (ICL) capability for general zero and few-shot learning, recent studies indicate that current LLMs still struggle with zero and few-shot RE. Previous studies are mainly dedicated to designing prompt formats and selecting good examples for improving ICL-based RE. Although both factors are vital for ICL, if one could fundamentally boost the ICL capability of LLMs in RE, the zero and few-shot RE performance via ICL would be significantly improved. To this end, we introduce Micre (Meta In-Context learning of LLMs for Relation Extraction), a new meta-training framework for zero and few-shot RE where an LLM is tuned to do ICL on a diverse collection of RE datasets (i.e., learning to learn in context for RE). Through meta-training, the model learns a new RE task in context more effectively by conditioning on a few training examples, with no parameter updates or task-specific templates at inference time, enabling better zero and few-shot task generalization. We experiment with Micre on various LLMs with different model scales and 12 public RE datasets, and then evaluate it on unseen RE benchmarks under zero and few-shot settings. Micre delivers comparable or superior performance compared to a range of baselines including supervised fine-tuning and typical in-context learning methods. We find that the gains are particularly significant for larger model scales, and that using a diverse set of meta-training RE datasets is key to improvements. Empirically, we show that Micre can transfer relation semantic knowledge via relation label names during inference on target RE datasets.

Related papers

CoT Referring: Improving Referring Expression Tasks with Grounded Reasoning [67.18702329644526]
CoT Referring enhances model reasoning across modalities through a structured, chain-of-thought training data structure. We restructure the training data to enforce a new output form, providing new annotations for existing datasets. We also integrate detection and segmentation capabilities into a unified MLLM framework, training it with a novel adaptive weighted loss to optimize performance.
arXiv Detail & Related papers (2025-10-03T08:50:21Z)
Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning [76.50690734636477]
We introduce Rank-R1, a novel LLM-based reranker that performs reasoning over both the user query and candidate documents before performing the ranking task. Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries.
arXiv Detail & Related papers (2025-03-08T03:14:26Z)
More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives [51.497338578427915]
Large language models (LLMs) excel at few-shot in-context learning (ICL) without requiring parameter updates. DrICL is a novel optimization method that enhances model performance through Differentiated and Reweighting objectives. We develop the Many-Shot ICL Benchmark (ICL-50), a large-scale benchmark of 50 tasks that cover shot numbers from 1 to 350 within sequences of up to 8,000 tokens.
arXiv Detail & Related papers (2025-01-07T14:57:08Z)
ICLERB: In-Context Learning Embedding and Reranker Benchmark [45.40331863265474]
In-Context Learning (ICL) enables Large Language Models to perform new tasks by conditioning on prompts with relevant information. Traditional retrieval methods focus on semantic relevance, treating retrieval as a search problem. We propose reframing retrieval for ICL as a recommendation problem, aiming to select documents that maximize utility in ICL tasks.
arXiv Detail & Related papers (2024-11-28T06:28:45Z)
Unleashing the Power of Large Language Models in Zero-shot Relation Extraction via Self-Prompting [21.04933334040135]
We introduce the Self-Prompting framework, a novel method designed to fully harness the embedded RE knowledge within Large Language Models. Our framework employs a three-stage diversity approach to prompt LLMs, generating multiple synthetic samples that encapsulate specific relations from scratch. Experimental evaluations on benchmark datasets show our approach outperforms existing LLM-based zero-shot RE methods.
arXiv Detail & Related papers (2024-10-02T01:12:54Z)
Empowering Few-Shot Relation Extraction with The Integration of Traditional RE Methods and Large Language Models [48.846159555253834]
Few-Shot Relation Extraction (FSRE) appeals to more researchers in Natural Language Processing (NLP). The recent emergence of Large Language Models (LLMs) has prompted numerous researchers to explore FSRE through In-Context Learning (ICL).
arXiv Detail & Related papers (2024-07-12T03:31:11Z)
Relation Extraction with Fine-Tuned Large Language Models in Retrieval Augmented Generation Frameworks [0.0]
Relation Extraction (RE) is crucial for converting unstructured data into structured formats like Knowledge Graphs (KGs). Recent studies leveraging pre-trained language models (PLMs) have shown significant success in this area. This work explores the performance of fine-tuned LLMs and their integration into a Retrieval Augmented Generation (RAG)-based RE approach.
arXiv Detail & Related papers (2024-06-20T21:27:57Z)
How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation [7.151108031568037]
This paper constructs low-resource relation extraction datasets in 10 low-resource languages (LRLs) in three regions (Central Asia, Southeast Asia and Middle East). The corpora are constructed by translating the original publicly available English RE datasets (NYT10, FewRel and CrossRE) using effective multilingual machine translation. We then use language perplexity (PPL) to filter out low-quality data from the translated datasets.
arXiv Detail & Related papers (2024-06-17T03:02:04Z)
RaFe: Ranking Feedback Improves Query Rewriting for RAG [83.24385658573198]
We propose a framework for training query rewriting models free of annotations. By leveraging a publicly available reranker, RaFe provides feedback that aligns well with the rewriting objectives.
arXiv Detail & Related papers (2024-05-23T11:00:19Z)
Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction [11.535892987373947]
Relation extraction (RE) aims to identify relations between entities mentioned in texts. Large language models (LLMs) have demonstrated impressive in-context learning abilities in various tasks. LLMs suffer from poor performance compared to most supervised fine-tuned RE methods.
arXiv Detail & Related papers (2024-04-27T07:12:52Z)
Many-Shot In-Context Learning [58.395589302800566]
Large language models (LLMs) excel at few-shot in-context learning (ICL). We observe significant performance gains across a wide variety of generative and discriminative tasks. Unlike few-shot learning, many-shot learning is effective at overriding pretraining biases.
arXiv Detail & Related papers (2024-04-17T02:49:26Z)
Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation [128.01050030936028]
We propose an information refinement training method named InFO-RAG. InFO-RAG is low-cost and general across various tasks. It improves the performance of LLaMA2 by an average of 9.39% relative points.
arXiv Detail & Related papers (2024-02-28T08:24:38Z)
Revisiting Large Language Models as Zero-shot Relation Extractors [8.953462875381888]
Relation extraction (RE) consistently involves a certain degree of labeled or unlabeled data, even under the zero-shot setting. Recent studies have shown that large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt. This work explores LLMs as zero-shot relation extractors.
arXiv Detail & Related papers (2023-10-08T06:17:39Z)
ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation [43.270424225285105]
We focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks. We propose Retrieval-enhanced Large Language models (ReLLa) for recommendation tasks in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-08-22T02:25:04Z)
Explaining Emergent In-Context Learning as Kernel Regression [61.57151500616111]
Large language models (LLMs) have initiated a paradigm shift in transfer learning. In this paper, we investigate the reason why a transformer-based language model can accomplish in-context learning after pre-training. We find that during ICL, the attention and hidden features in LLMs match the behaviors of a kernel regression.
arXiv Detail & Related papers (2023-05-22T06:45:02Z)
Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections. InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
Learning from Context or Names? An Empirical Study on Neural Relation Extraction [112.06614505580501]
We study the effect of two main information sources in text: textual context and entity mentions (names). We propose an entity-masked contrastive pre-training framework for relation extraction (RE). Our framework can improve the effectiveness and robustness of neural models in different RE scenarios.
arXiv Detail & Related papers (2020-10-05T11:21:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.