Related papers: SMATCH++: Standardized and Extended Evaluation of Semantic Graphs

URL: http://arxiv.org/abs/2305.06993v1
Date: Thu, 11 May 2023 17:29:47 GMT
Title: SMATCH++: Standardized and Extended Evaluation of Semantic Graphs
Authors: Juri Opitz
Abstract summary: The Smatch metric is a popular method for evaluating graph distances. We show how to fully conform to annotation guidelines that allow structurally deviating but valid graphs. For improved scoring, we propose standardized and extended metric calculation of fine-grained sub-graph meaning aspects.
Score: 4.987581730476023
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Smatch metric is a popular method for evaluating graph distances, as is necessary, for instance, to assess the performance of semantic graph parsing systems. However, we observe some issues in the metric that jeopardize meaningful evaluation. E.g., opaque pre-processing choices can affect results, and current graph-alignment solvers do not provide us with upper-bounds. Without upper-bounds, however, fair evaluation is not guaranteed. Furthermore, adaptions of Smatch for extended tasks (e.g., fine-grained semantic similarity) are spread out, and lack a unifying framework. For better inspection, we divide the metric into three modules: pre-processing, alignment, and scoring. Examining each module, we specify its goals and diagnose potential issues, for which we discuss and test mitigation strategies. For pre-processing, we show how to fully conform to annotation guidelines that allow structurally deviating but valid graphs. For safer and enhanced alignment, we show the feasibility of optimal alignment in a standard evaluation setup, and develop a lossless graph compression method that shrinks the search space and significantly increases efficiency. For improved scoring, we propose standardized and extended metric calculation of fine-grained sub-graph meaning aspects. Our code is available at https://github.com/flipz357/smatchpp
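The three modules named in the abstract (pre-processing, alignment, scoring) can be illustrated with a minimal sketch. This is not the smatchpp API: the function names, the triple format `(source, relation, target)`, and the convention that variables are the nodes carrying an `:instance` triple are all assumptions made for illustration. Exhaustive search over variable mappings stands in for an optimal alignment solver and is only practical for very small graphs.

```python
from itertools import permutations

def preprocess(triples, prefix):
    """Pre-processing (assumed convention): prefix variable names so the two
    graphs can never share a variable name by accident."""
    variables = {s for s, r, _ in triples if r == ":instance"}

    def rename(node):
        return prefix + node if node in variables else node

    out = set()
    for s, r, t in triples:
        # The target of an :instance triple is a concept label, never a variable.
        out.add((rename(s), r, t if r == ":instance" else rename(t)))
    return out, sorted(prefix + v for v in variables)

def best_alignment(a, vars_a, b, vars_b):
    """Alignment: try every injective variable mapping and keep the one that
    maximizes the number of matching triples (exhaustive, tiny graphs only)."""
    if len(vars_a) <= len(vars_b):
        candidates = (dict(zip(vars_a, p)) for p in permutations(vars_b, len(vars_a)))
    else:
        candidates = (dict(zip(p, vars_b)) for p in permutations(vars_a, len(vars_b)))
    best, best_map = -1, {}
    for mapping in candidates:
        mapped = {(mapping.get(s, s), r, mapping.get(t, t)) for s, r, t in a}
        matches = len(mapped & b)
        if matches > best:
            best, best_map = matches, mapping
    return best, best_map

def score(matches, size_a, size_b):
    """Scoring: precision, recall and F1 over matched triples."""
    precision = matches / size_a if size_a else 0.0
    recall = matches / size_b if size_b else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def smatch_sketch(candidate, reference):
    a, vars_a = preprocess(candidate, "a:")
    b, vars_b = preprocess(reference, "b:")
    matches, _ = best_alignment(a, vars_a, b, vars_b)
    return score(matches, len(a), len(b))

# Hypothetical graphs for "the boy wants to go"; the candidate lacks one triple.
reference = [("w", ":instance", "want-01"), ("b", ":instance", "boy"),
             ("g", ":instance", "go-01"), ("w", ":arg0", "b"),
             ("w", ":arg1", "g"), ("g", ":arg0", "b")]
candidate = [("x", ":instance", "want-01"), ("y", ":instance", "boy"),
             ("z", ":instance", "go-01"), ("x", ":arg0", "y"),
             ("x", ":arg1", "z")]
precision, recall, f1 = smatch_sketch(candidate, reference)
```

In this toy case all five candidate triples match under the best mapping, so precision is 1.0, recall is 5/6, and F1 is 10/11. The search space grows factorially with the number of variables, which is why practical tools rely on dedicated alignment solvers rather than enumeration.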

Related papers

Graph Anomaly Detection with Noisy Labels by Reinforcement Learning [13.135788402192215]
We propose a novel framework REGAD, i.e., REinforced Graph Anomaly Detector. Specifically, we aim to maximize the performance improvement (AUC) of a base detector by cutting noisy edges approximated through the nodes with high-confidence labels.
arXiv Detail & Related papers (2024-07-08T13:41:21Z)
Cobra Effect in Reference-Free Image Captioning Metrics [58.438648377314436]
A proliferation of reference-free methods, leveraging visual-language pre-trained models (VLMs), has emerged. In this paper, we study if there are any deficiencies in reference-free metrics. We employ GPT-4V as an evaluative tool to assess generated sentences and the result reveals that our approach achieves state-of-the-art (SOTA) performance.
arXiv Detail & Related papers (2024-02-18T12:36:23Z)
Invariant Graph Transformer [0.0]
In the graph machine learning context, graph rationalization can enhance model performance. A key technique named "intervention" is applied to ensure the discriminative power of the extracted rationale subgraphs. In this paper, we propose well-tailored intervention strategies on graph data.
arXiv Detail & Related papers (2023-12-13T02:56:26Z)
Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union [113.20223082664681]
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics. These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing. Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
Toward Falsifying Causal Graphs Using a Permutation-Based Test [11.826804773695033]
Existing metrics provide an absolute number of inconsistencies between the graph and the observed data. We propose a novel consistency metric by constructing a baseline through node permutations. By comparing the number of inconsistencies with those on the baseline, we derive an interpretable metric.
arXiv Detail & Related papers (2023-05-16T16:02:18Z)
Large-scale Point Cloud Registration Based on Graph Matching Optimization [30.92028761652611]
We propose a Graph Matching Optimization based Network. The proposed method has been evaluated on the 3DMatch/3DLoMatch benchmarks and the KITTI benchmark.
arXiv Detail & Related papers (2023-02-12T03:29:35Z)
Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching [68.35685422301613]
We propose a novel non-parametric subgraph matching framework, dubbed MatchExplainer, to explore explanatory subgraphs. It couples the target graph with other counterpart instances and identifies the most crucial joint substructure by minimizing the node corresponding-based distance. Experiments on synthetic and real-world datasets show the effectiveness of our MatchExplainer by outperforming all state-of-the-art parametric baselines with significant margins.
arXiv Detail & Related papers (2023-01-07T05:14:45Z)
Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-k Curve (AUTKC). AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-K ranking with respect to the conditional probability. We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z)
FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations [114.94628499698096]
We propose FactGraph, a method that decomposes the document and the summary into structured meaning representations (MRs). MRs describe core semantic concepts and their relations, aggregating the main content in both document and summary in a canonical form, and reducing data sparsity. Experiments on different benchmarks for evaluating factuality show that FactGraph outperforms previous approaches by up to 15%.
arXiv Detail & Related papers (2022-04-13T16:45:33Z)
SIGMA: Semantic-complete Graph Matching for Domain Adaptive Object Detection [26.0630601028093]
Domain Adaptive Object Detection (DAOD) leverages a labeled domain to learn an object detector generalizing to a novel domain free of annotations. Recent advances align class-conditional distributions by narrowing down cross-domain prototypes (class centers). We propose a novel SemantIc-complete Graph MAtching (SIGMA) framework for DAOD, which completes mismatched semantics and reformulates the adaptation with graph matching.
arXiv Detail & Related papers (2022-03-12T10:14:17Z)
Deep Probabilistic Graph Matching [72.6690550634166]
We propose a deep learning-based graph matching framework that works for the original QAP without compromising on the matching constraints. The proposed method is evaluated on three widely used benchmarks (Pascal VOC, Willow Object and SPair-71k), and it outperforms all previous state-of-the-art methods on all benchmarks.
arXiv Detail & Related papers (2022-01-05T13:37:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.