Evaluation of Unsupervised Entity and Event Salience Estimation
- URL: http://arxiv.org/abs/2104.06924v1
- Date: Wed, 14 Apr 2021 15:23:08 GMT
- Title: Evaluation of Unsupervised Entity and Event Salience Estimation
- Authors: Jiaying Lu, Jinho D. Choi
- Abstract summary: Salience Estimation aims to predict term importance in documents.
Previous studies typically generate pseudo-ground truth for evaluation.
In this work, we propose a lightweight yet practical evaluation protocol for entity and event salience estimation.
- Score: 17.74208462902158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Salience Estimation aims to predict term importance in documents. Due to few
existing human-annotated datasets and the subjective notion of salience,
previous studies typically generate pseudo-ground truth for evaluation.
However, our investigation reveals that the evaluation protocol proposed by
prior work is difficult to replicate, which has led to few follow-up studies.
Moreover, the evaluation process itself is problematic: the entity linking
tool used for entity matching is very noisy, and ignoring event arguments
during event evaluation artificially inflates performance. In this work, we
propose a lightweight yet practical evaluation protocol for entity and event
salience estimation that incorporates a more reliable syntactic dependency parser.
Furthermore, we conduct a comprehensive analysis of popular entity and event
definition standards and present our own definitions for the Salience
Estimation task to reduce noise during pseudo-ground truth generation.
We also construct dependency-based heterogeneous graphs to
capture the interactions of entities and events. The empirical results show
that both baseline methods and the novel GNN method utilizing the heterogeneous
graph consistently outperform the previous SOTA model in all proposed metrics.
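As a rough illustration of the evaluation protocol described above, the following sketch uses spaCy's dependency parser to extract entity and event candidates and to derive pseudo-ground truth by matching body-text terms against the abstract. The pipeline name, the matching heuristics, and the abstract-membership labeling rule are illustrative assumptions, not the authors' exact implementation.

```python
# A minimal sketch of the dependency-based evaluation idea, assuming
# spaCy's "en_core_web_sm" pipeline. The heuristics (noun-chunk heads as
# entities, verbal predicates plus core arguments as events, abstract
# membership as the pseudo-salience label) are illustrative assumptions.
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_candidates(doc):
    """Extract entity and event candidates from a dependency parse."""
    entities = {chunk.root.lemma_.lower() for chunk in doc.noun_chunks}
    events = set()
    for token in doc:
        if token.pos_ == "VERB":
            # Keep the predicate together with its core arguments, since
            # ignoring event arguments is one of the evaluation pitfalls
            # the abstract points out.
            args = tuple(sorted(
                child.lemma_.lower() for child in token.children
                if child.dep_ in {"nsubj", "nsubjpass", "dobj", "obj"}
            ))
            events.add((token.lemma_.lower(), args))
    return entities, events

def pseudo_salience_labels(body_text, abstract_text):
    """Label a body term as salient iff it also occurs in the abstract."""
    body_ents, body_events = extract_candidates(nlp(body_text))
    abs_ents, abs_events = extract_candidates(nlp(abstract_text))
    ent_labels = {e: int(e in abs_ents) for e in body_ents}
    # An event matches only when both its predicate and arguments agree.
    evt_labels = {ev: int(ev in abs_events) for ev in body_events}
    return ent_labels, evt_labels

body = "The committee approved the budget. The budget funds new schools."
abstract = "The committee approved the budget."
print(pseudo_salience_labels(body, abstract))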
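The dependency-based heterogeneous graph can likewise be sketched over the candidates produced above: entity and event nodes, joined by an edge whenever an entity fills an argument slot of an event. The networkx schema below is an assumption for illustration; the paper's GNN would consume a graph of this general shape.

```python
# A rough sketch of a dependency-based heterogeneous graph: entity and
# event nodes, with an "argument" edge whenever an entity fills an
# argument slot of an event. The schema is an illustrative assumption,
# not the authors' exact graph construction.
import networkx as nx

def build_hetero_graph(entities, events):
    g = nx.Graph()
    for ent in entities:
        g.add_node(("entity", ent), node_type="entity")
    for pred, args in events:
        event_node = ("event", pred, args)
        g.add_node(event_node, node_type="event")
        for arg in args:
            if ("entity", arg) in g:
                g.add_edge(("entity", arg), event_node, edge_type="argument")
    return g

entities = {"committee", "budget", "school"}
events = {("approve", ("budget", "committee")), ("fund", ("budget", "school"))}
g = build_hetero_graph(entities, events)
print(g.number_of_nodes(), g.number_of_edges())  # 5 nodes, 4 edges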
Related papers
- Rethinking Affect Analysis: A Protocol for Ensuring Fairness and Consistency [24.737468736951374]
We propose a unified protocol for database partitioning that ensures fairness and comparability.
We provide detailed demographic annotations (in terms of race, gender and age), evaluation metrics, and a common framework for expression recognition.
We also rerun the methods with the new protocol and introduce new leaderboards to encourage future research in affect recognition with fairer comparison.
arXiv Detail & Related papers (2024-08-04T23:21:46Z) - VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs.
Existing benchmarks are often limited in scope, focusing mainly on object hallucinations.
We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z) - AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation [57.8363998797433]
We propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs).
Our approach parses factually consistent summaries into AMR graphs and injects controlled factual inconsistencies to create negative examples, allowing for coherent factually inconsistent summaries to be generated with high error-type coverage.
arXiv Detail & Related papers (2023-11-16T02:56:29Z) - SocREval: Large Language Models with the Socratic Method for Reference-Free Reasoning Evaluation [78.23119125463964]
We develop SocREval, a novel approach for prompt design in reference-free reasoning evaluation.
SocREval significantly improves GPT-4's performance, surpassing existing reference-free and reference-based reasoning evaluation metrics.
arXiv Detail & Related papers (2023-09-29T18:25:46Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Evaluating Causal Inference Methods [0.4588028371034407]
We introduce a deep generative model-based framework, Credence, to validate causal inference methods.
arXiv Detail & Related papers (2022-02-09T00:21:22Z) - Stateful Offline Contextual Policy Evaluation and Learning [88.9134799076718]
We study off-policy evaluation and learning from sequential data.
We formalize the relevant causal structure of problems such as dynamic personalized pricing.
We show improved out-of-sample policy performance in this class of relevant problems.
arXiv Detail & Related papers (2021-10-19T16:15:56Z) - SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z) - A Critical Assessment of State-of-the-Art in Entity Alignment [1.7725414095035827]
We investigate two state-of-the-art (SotA) methods for the task of Entity Alignment in Knowledge Graphs.
We first carefully examine the benchmarking process and identify several shortcomings, which make the results reported in the original works not always comparable.
arXiv Detail & Related papers (2020-10-30T15:09:19Z) - Aligning Intraobserver Agreement by Transitivity [1.0152838128195467]
We propose a novel method for measuring within-annotator consistency, or Intraobserver Agreement (IA).
The proposed approach is based on transitivity, a measure that has been thoroughly studied in the context of rational decision-making.
arXiv Detail & Related papers (2020-09-29T09:55:04Z)