Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge Distillation
- URL: http://arxiv.org/abs/2404.03196v1
- Date: Thu, 4 Apr 2024 04:49:46 GMT
- Title: Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge Distillation
- Authors: Abhijnan Nath, Shadi Manafi, Avyakta Chelle, Nikhil Krishnaswamy
- Abstract summary: Event Coreference Resolution (ECR) is the task of connecting event clusters that refer to the same underlying real-life event.
In this work, we investigate using abductive free-text rationales (FTRs) generated by modern autoregressive LLMs.
We implement novel rationale-oriented event clustering and knowledge distillation methods for event coreference scoring.
- Score: 6.102274021710727
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In NLP, Event Coreference Resolution (ECR) is the task of connecting event clusters that refer to the same underlying real-life event, usually via neural systems. In this work, we investigate using abductive free-text rationales (FTRs) generated by modern autoregressive LLMs as distant supervision of smaller student models for cross-document coreference (CDCR) of events. We implement novel rationale-oriented event clustering and knowledge distillation methods for event coreference scoring that leverage enriched information from the FTRs for improved CDCR without additional annotation or expensive document clustering. Our model using coreference-specific knowledge distillation achieves SOTA B^3 F1 on the ECB+ and GVC corpora and we establish a new baseline on the AIDA Phase 1 corpus. Our code can be found at https://github.com/csu-signal/llama_cdcr
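As a rough illustration of the coreference-specific knowledge distillation described in the abstract, the sketch below trains a small student cross-encoder to score event-mention pairs, mixing a loss on gold coreference labels with a loss toward soft scores from a teacher that has seen the LLM-generated free-text rationales. This is a minimal sketch under assumed choices, not the released implementation: the roberta-base backbone, the </s>-separated pair format, the loss weight alpha, and the temperature are all assumptions.

```python
# Minimal sketch of coreference-specific knowledge distillation for pairwise
# event coreference scoring. NOT the authors' implementation: backbone, pair
# format, alpha, and temperature below are illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

STUDENT = "roberta-base"  # assumed small student backbone
tok = AutoTokenizer.from_pretrained(STUDENT)
student = AutoModelForSequenceClassification.from_pretrained(STUDENT, num_labels=1)

def pair_text(m1, m2, rationale=None):
    """Join two event-mention sentences (optionally plus a free-text rationale)."""
    text = f"{m1['sentence']} </s> {m2['sentence']}"
    if rationale is not None:
        text = f"{text} </s> {rationale}"
    return text

def distillation_step(pairs, gold_labels, teacher_logits, optimizer,
                      alpha=0.5, temperature=2.0):
    """One step: BCE on gold coreference labels + soft loss toward the teacher.

    `teacher_logits` are pair scores from a larger model that read the
    rationale-enriched inputs; the student sees only the plain mention pairs.
    """
    enc = tok([pair_text(a, b) for a, b in pairs],
              padding=True, truncation=True, return_tensors="pt")
    logits = student(**enc).logits.squeeze(-1)

    hard_loss = F.binary_cross_entropy_with_logits(logits, gold_labels.float())
    # Temperature-softened probabilities for the distillation term.
    p_student = torch.sigmoid(logits / temperature)
    p_teacher = torch.sigmoid(teacher_logits.detach() / temperature)
    soft_loss = F.binary_cross_entropy(p_student, p_teacher)

    loss = alpha * hard_loss + (1.0 - alpha) * soft_loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Pairs scored this way could then be grouped into event clusters, e.g. by connected components or agglomerative clustering over the predicted coreference probabilities.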
Related papers
- Linear Cross-document Event Coreference Resolution with X-AMR [6.225801514919498]
Event Coreference Resolution (ECR) is expensive both for automated systems and for manual annotation.
We propose a graphical representation of events, X-AMR, anchored around individual mentions.
We then linearize the ECR with a novel multi-hop coreference algorithm over the event graphs.
arXiv Detail & Related papers (2024-03-25T02:49:06Z) - Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction [43.50683283748675]
Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in information systems that aims to simultaneously extract entities with semantic relations from a document.
Existing methods heavily rely on a substantial amount of fully labeled data.
Recent advanced Large Language Models (LLMs), such as ChatGPT and LLaMA, exhibit impressive long-text generation capabilities.
arXiv Detail & Related papers (2024-01-24T17:04:28Z) - CorefPrompt: Prompt-based Event Coreference Resolution by Measuring
Event Type and Argument Compatibilities [16.888201607072318]
Event coreference resolution (ECR) aims to group event mentions referring to the same real-world event into clusters.
We propose a prompt-based approach, CorefPrompt, to transform ECR into a cloze-style (masked language model) task.
This allows for simultaneous event modeling and coreference discrimination within a single template, with a fully shared context.
arXiv Detail & Related papers (2023-10-23T02:47:27Z) - Filling in the Gaps: Efficient Event Coreference Resolution using Graph
Autoencoder Networks [0.0]
We introduce a novel and efficient method for Event Coreference Resolution (ECR) applied to a lower-resourced language domain.
By framing ECR as a graph reconstruction task, we are able to combine deep semantic embeddings with structural coreference chain knowledge.
Our method significantly outperforms classical mention-pair methods on a large Dutch event coreference corpus.
arXiv Detail & Related papers (2023-10-18T13:44:58Z) - An AMR-based Link Prediction Approach for Document-level Event Argument
Extraction [51.77733454436013]
Recent works have introduced Abstract Meaning Representation (AMR) for Document-level Event Argument Extraction (Doc-level EAE).
This work reformulates EAE as a link prediction problem on AMR graphs.
We propose a novel graph structure, Tailored AMR Graph (TAG), which compresses less informative subgraphs and edge types, integrates span information, and highlights surrounding events in the same document.
arXiv Detail & Related papers (2023-05-30T16:07:48Z) - Going beyond research datasets: Novel intent discovery in the industry
setting [60.90117614762879]
This paper proposes methods to improve the intent discovery pipeline deployed in a large e-commerce platform.
We show the benefit of pre-training language models on in-domain data: both self-supervised and with weak supervision.
We also devise the best method to utilize the conversational structure (i.e., question and answer) of real-life datasets during fine-tuning for clustering tasks, which we call Conv.
arXiv Detail & Related papers (2023-05-09T14:21:29Z) - The CLEAR Benchmark: Continual LEArning on Real-World Imagery [77.98377088698984]
Continual learning (CL) is widely regarded as a crucial challenge for lifelong AI.
We introduce CLEAR, the first continual image classification benchmark dataset with a natural temporal evolution of visual concepts.
We find that a simple unsupervised pre-training step can already boost state-of-the-art CL algorithms.
arXiv Detail & Related papers (2022-01-17T09:09:09Z) - Unsupervised Representation Learning via Neural Activation Coding [66.65837512531729]
We present neural activation coding (NAC) as a novel approach for learning deep representations from unlabeled data for downstream applications.
We show that NAC learns both continuous and discrete representations of data, which we respectively evaluate on two downstream tasks.
arXiv Detail & Related papers (2021-12-07T21:59:45Z) - Focus on what matters: Applying Discourse Coherence Theory to Cross Document Coreference [22.497877069528087]
Event and entity coreference resolution across documents vastly increases the number of candidate mentions, making it intractable to do the full $n^2$ pairwise comparisons.
Existing approaches simplify by considering coreference only within document clusters, but this fails to handle inter-cluster coreference.
We draw on an insight from discourse coherence theory: potential coreferences are constrained by the reader's discourse focus (see the candidate-pruning sketch after this list).
Our approach achieves state-of-the-art results for both events and entities on the ECB+, Gun Violence, Football Coreference, and Cross-Domain Cross-Document Coreference corpora.
arXiv Detail & Related papers (2021-10-11T15:41:47Z) - Generalizing Cross-Document Event Coreference Resolution Across Multiple Corpora [63.429307282665704]
Cross-document event coreference resolution (CDCR) is an NLP task in which mentions of events need to be identified and clustered throughout a collection of documents.
CDCR aims to benefit downstream multi-document applications, but improvements from applying CDCR have not been shown yet.
We make the observation that every CDCR system to date was developed, trained, and tested only on a single respective corpus.
arXiv Detail & Related papers (2020-11-24T17:45:03Z) - Searching Central Difference Convolutional Networks for Face Anti-Spoofing [68.77468465774267]
Face anti-spoofing (FAS) plays a vital role in face recognition systems.
Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed networks.
Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC).
arXiv Detail & Related papers (2020-03-09T12:48:37Z)
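For the discourse-coherence entry above, a hedged illustration of the candidate-pruning idea (not that paper's actual algorithm): rather than scoring all $n^2$ cross-document mention pairs, each mention is paired only with a small set of nearest neighbors in embedding space, keeping the number of comparisons roughly linear in the number of mentions. The encoder name and the value of k below are assumptions.

```python
# Hedged sketch of focus/similarity-based candidate pruning for cross-document
# coreference. Not the cited paper's method; encoder and k are assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

def focused_candidate_pairs(mention_sentences, k=10):
    """Return (i, j) pairs restricted to each mention's k nearest neighbors."""
    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice
    emb = model.encode(mention_sentences, normalize_embeddings=True)
    sims = emb @ emb.T                       # cosine similarities (unit vectors)
    np.fill_diagonal(sims, -np.inf)          # never pair a mention with itself
    pairs = set()
    for i in range(len(mention_sentences)):
        for j in np.argsort(-sims[i])[:k]:   # k most similar mentions only
            pairs.add((min(i, int(j)), max(i, int(j))))
    return sorted(pairs)                     # about n*k pairs instead of n^2
```

Only these focused pairs would then be passed to a more expensive pairwise coreference scorer.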
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.