Computational Semantics and Evaluation Benchmark for Interrogative
Sentences via Combinatory Categorial Grammar
- URL: http://arxiv.org/abs/2312.14737v1
- Date: Fri, 22 Dec 2023 14:46:02 GMT
- Title: Computational Semantics and Evaluation Benchmark for Interrogative
Sentences via Combinatory Categorial Grammar
- Authors: Hayate Funakura, Koji Mineshima
- Abstract summary: We present a compositional semantics for various types of polar questions and wh-questions within the framework of Combinatory Categorial Grammar (CCG).
We introduce a question-answering dataset QSEM specifically designed to evaluate the semantics of interrogative sentences.
- Score: 8.666172545138275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a compositional semantics for various types of polar questions and
wh-questions within the framework of Combinatory Categorial Grammar (CCG). To
assess the explanatory power of our proposed analysis, we introduce a
question-answering dataset QSEM specifically designed to evaluate the semantics
of interrogative sentences. We implement our analysis using existing CCG
parsers and conduct evaluations using the dataset. Through the evaluation, we
have obtained annotated data with CCG trees and semantic representations for
about half of the samples included in QSEM. Furthermore, we discuss the
discrepancy between the theoretical capacity of CCG and the capabilities of
existing CCG parsers.
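To make the idea of CCG-based compositional semantics concrete, here is a toy sketch (not the paper's implementation; the lexicon, model, and question operator are hypothetical) showing how forward and backward application combine lambda-term meanings into a representation of a polar question:

```python
# Toy illustration of CCG-style composition: lexical items carry lambda-term
# meanings, and the application combinators combine them. This is a hedged
# sketch, not the authors' actual semantics.

def forward_apply(fn, arg):
    # X/Y  Y  =>  X : apply the functor's meaning to the argument's meaning
    return fn(arg)

def backward_apply(arg, fn):
    # Y  X\Y  =>  X
    return fn(arg)

# Hypothetical mini-model and lexicon for "Did John walk?"
model = {"walk": {"john"}}

john = "john"
walk = lambda x: x in model["walk"]   # meaning of the VP: NP -> truth value
did  = lambda p: ("?", p)             # polar-question operator wrapping a proposition

# Derivation: did (walk john)
prop = backward_apply(john, walk)     # John walks -> True in this model
question = forward_apply(did, prop)   # ("?", True): the question over that proposition
```

A real CCG pipeline would derive the tree with a parser and read the semantics off the derivation; the point here is only that each combinatory rule corresponds to one function application over meanings.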
Related papers
- Automated Speaking Assessment of Conversation Tests with Novel Graph-based Modeling on Spoken Response Coherence [11.217656140423207]
ASAC aims to evaluate the overall speaking proficiency of an L2 speaker in a setting where an interlocutor interacts with one or more candidates.
We propose a hierarchical graph model that aptly incorporates both broad inter-response interactions and nuanced semantic information.
Extensive experimental results on the NICT-JLE benchmark dataset suggest that our proposed modeling approach can yield considerable improvements in prediction accuracy.
arXiv Detail & Related papers (2024-09-11T07:24:07Z)
- QUDSELECT: Selective Decoding for Questions Under Discussion Parsing [90.92351108691014]
Question Under Discussion (QUD) is a discourse framework that uses implicit questions to reveal discourse relationships between sentences.
We introduce QUDSELECT, a joint-training framework that selectively decodes the QUD dependency structures considering the QUD criteria.
Our method outperforms the state-of-the-art baseline models by 9% in human evaluation and 4% in automatic evaluation.
arXiv Detail & Related papers (2024-08-02T06:46:08Z)
- Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs [35.089203283068635]
We introduce a set of fine-grained categories for measuring the attribution, and develop a Complex Attributed Question Answering (CAQA) benchmark.
Our analysis reveals that existing evaluators perform poorly under fine-grained attribution settings and exhibit weaknesses in complex citation-statement reasoning.
arXiv Detail & Related papers (2024-01-26T04:11:07Z)
- Bayesian Networks for Named Entity Prediction in Programming Community Question Answering [0.0]
We propose a new approach for natural language processing using Bayesian networks to predict and analyze the context.
We compare the Bayesian networks with different score metrics, such as the BIC, BDeu, K2 and Chow-Liu trees.
In addition, we examine the visualization of directed acyclic graphs to analyze semantic relationships.
arXiv Detail & Related papers (2023-02-26T07:26:36Z)
- Towards Interpretable Summary Evaluation via Allocation of Contextual Embeddings to Reference Text Topics [1.5749416770494706]
The multifaceted interpretable summary evaluation method (MISEM) is based on allocation of a summary's contextual token embeddings to semantic topics identified in the reference text.
MISEM achieves a promising .404 Pearson correlation with human judgment on the TAC'08 dataset.
arXiv Detail & Related papers (2022-10-25T17:09:08Z)
- Discourse Analysis via Questions and Answers: Parsing Dependency Structures of Questions Under Discussion [57.43781399856913]
This work adopts the linguistic framework of Questions Under Discussion (QUD) for discourse analysis.
We characterize relationships between sentences as free-form questions, in contrast to exhaustive fine-grained questions.
We develop a first-of-its-kind QUD parser that derives a dependency structure of questions over full documents.
arXiv Detail & Related papers (2022-10-12T03:53:12Z)
- DisCoDisCo at the DISRPT2021 Shared Task: A System for Discourse Segmentation, Classification, and Connective Detection [4.371388370559826]
Our system, called DisCoDisCo, enhances contextualized word embeddings with hand-crafted features.
Results on relation classification suggest strong performance on the new 2021 benchmark.
A partial evaluation of multiple pre-trained Transformer-based language models indicates that models pre-trained on the Next Sentence Prediction task are optimal for relation classification.
arXiv Detail & Related papers (2021-09-20T18:11:05Z)
- CoPHE: A Count-Preserving Hierarchical Evaluation Metric in Large-Scale Multi-Label Text Classification [70.554573538777]
We argue for hierarchical evaluation of the predictions of neural LMTC models.
We describe a structural issue in the representation of the structured label space in prior art.
We propose a set of metrics for hierarchical evaluation using the depth-based representation.
arXiv Detail & Related papers (2021-09-10T13:09:12Z)
- Coherent Hierarchical Multi-Label Classification Networks [56.41950277906307]
C-HMCNN(h) is a novel approach for HMC problems, which exploits hierarchy information in order to produce predictions coherent with the constraint and improve performance.
We conduct an extensive experimental analysis showing the superior performance of C-HMCNN(h) when compared to state-of-the-art models.
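The coherence constraint described above can be illustrated with a small sketch. This is not C-HMCNN(h) itself (which builds the constraint into the network and loss); it shows one common post-hoc way to make hierarchical predictions coherent, by lifting each ancestor's score to at least the maximum over its descendants so that predicting a child always implies predicting its parent:

```python
# Hedged sketch: enforce hierarchy coherence on raw multi-label scores.
# The class names and hierarchy below are hypothetical toy data.

def make_coherent(scores, children):
    """scores: class -> raw score in [0, 1]; children: class -> list of subclasses."""
    coherent = {}

    def resolve(c):
        if c in coherent:
            return coherent[c]
        sub = [resolve(k) for k in children.get(c, [])]
        # A parent's score is at least the best score among its descendants.
        coherent[c] = max([scores[c]] + sub)
        return coherent[c]

    for c in scores:
        resolve(c)
    return coherent

# Toy hierarchy: "animal" is the parent of "dog" and "cat".
raw = {"animal": 0.2, "dog": 0.9, "cat": 0.1}
fixed = make_coherent(raw, {"animal": ["dog", "cat"]})
# fixed["animal"] is lifted to 0.9, so thresholding at 0.5 can no longer
# predict "dog" without also predicting "animal".
```

The appeal of building such a constraint into training, as C-HMCNN(h) does, is that the model learns scores that are coherent by construction rather than being patched after the fact.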
arXiv Detail & Related papers (2020-10-20T09:37:02Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn ⟨sentiment, aspect⟩ joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary [65.37544133256499]
We propose a metric to evaluate the content quality of a summary using question-answering (QA).
We demonstrate the experimental benefits of QA-based metrics through an analysis of our proposed metric, QAEval.
arXiv Detail & Related papers (2020-10-01T15:33:09Z)
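The general shape of a QA-based summary metric like the one above can be sketched as follows. This is an illustration of the idea rather than QAEval itself: the QA model is a stub, and the question-answer pair is invented toy data; a real system would generate questions from the reference and answer them with a trained reading-comprehension model.

```python
# Hedged sketch of a QA-based content metric: score a summary by the
# fraction of reference-derived questions answerable from it.

def qa_score(summary, qa_pairs, qa_model):
    """qa_pairs: list of (question, gold answer); qa_model(context, q) -> answer."""
    hits = [qa_model(summary, q) == gold for q, gold in qa_pairs]
    return sum(hits) / len(hits)

# Stub stand-in for a QA model: returns whichever known answer string
# occurs in the context (purely for illustration).
VOCAB = {"Paris", "London"}

def stub_qa(context, question):
    for tok in VOCAB:
        if tok in context:
            return tok
    return None

pairs = [("What city is the capital of France?", "Paris")]
score = qa_score("The capital of France is Paris.", pairs, stub_qa)  # 1.0
```

A summary that omits the queried content would yield a lower score, which is what lets such a metric reflect content coverage rather than surface overlap.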
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.