Quirk or Palmer: A Comparative Study of Modal Verb Frameworks with
Annotated Datasets
- URL: http://arxiv.org/abs/2212.10152v1
- Date: Tue, 20 Dec 2022 10:44:18 GMT
- Title: Quirk or Palmer: A Comparative Study of Modal Verb Frameworks with
Annotated Datasets
- Authors: Risako Owan, Maria Gini, Dongyeop Kang
- Abstract summary: Linguists have yet to agree on a single framework for the categorization of modal verb senses.
This work presents the Moverb dataset, which consists of 27,240 annotations of modal verb senses over 4,540 utterances.
- Score: 8.09076910034882
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modal verbs, such as "can", "may", and "must", are commonly used in daily
communication to convey the speaker's perspective related to the likelihood
and/or mode of the proposition. They can differ greatly in meaning depending on
how they're used and the context of a sentence (e.g. "They 'must' help each
other out." vs. "They 'must' have helped each other out.") Despite their
practical importance in natural language understanding, linguists have yet to
agree on a single, prominent framework for the categorization of modal verb
senses. This lack of agreement stems from the high degree of flexibility and
polysemy of modal verbs, making it more difficult for researchers to
incorporate insights from this family of words into their work. This work
presents the Moverb dataset, which consists of 27,240 annotations of modal verb
senses over 4,540 utterances containing one or more sentences from social
conversations. Each utterance is annotated by three annotators using two
different theoretical frameworks (i.e., Quirk and Palmer) of modal verb senses.
We observe that both frameworks have similar inter-annotator agreements,
despite having different numbers of sense types (8 for Quirk and 3 for Palmer).
With the RoBERTa-based classifiers fine-tuned on Moverb, we achieve F1 scores
of 82.2 and 78.3 on Quirk and Palmer, respectively, showing that modal verb
sense disambiguation is not a trivial task. Our dataset will be publicly
available with our final version.
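The annotation total quoted in the abstract follows from the stated setup: every utterance is labeled by three annotators under each of the two frameworks. The short Python sketch below checks that arithmetic. The variable and function names are illustrative assumptions, not taken from the paper or its code.

```python
# Figures stated in the abstract.
NUM_UTTERANCES = 4_540
ANNOTATORS_PER_UTTERANCE = 3
# Framework name -> number of sense types reported in the abstract.
SENSE_TYPES = {"Quirk": 8, "Palmer": 3}


def total_annotations(utterances: int, annotators: int, frameworks: int) -> int:
    """Each utterance receives one label per annotator per framework."""
    return utterances * annotators * frameworks


total = total_annotations(
    NUM_UTTERANCES, ANNOTATORS_PER_UTTERANCE, len(SENSE_TYPES)
)
print(total)  # 27240, matching the 27,240 annotations in the abstract
```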
Related papers
- Contextualized word senses: from attention to compositionality [0.10878040851637999]
We propose a transparent, interpretable, and linguistically motivated strategy for encoding the contextual sense of words.
Particular attention is given to dependency relations and semantic notions such as selection preferences and paradigmatic classes.
arXiv Detail & Related papers (2023-12-01T16:04:00Z)
- Distributed Marker Representation for Ambiguous Discourse Markers and Entangled Relations [50.31129784616845]
We learn a Distributed Marker Representation (DMR) by utilizing the unlimited discourse marker data with a latent discourse sense.
Our method also offers a valuable tool to understand complex ambiguity and entanglement among discourse markers and manually defined discourse relations.
arXiv Detail & Related papers (2023-06-19T00:49:51Z)
- DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- Not wacky vs. definitely wacky: A study of scalar adverbs in pretrained language models [0.0]
Modern pretrained language models, such as BERT, RoBERTa and GPT-3, hold the promise of performing better on logical tasks than classic static word embeddings.
We investigate the extent to which BERT, RoBERTa, GPT-2 and GPT-3 exhibit general, human-like knowledge of these common words.
We find that despite capturing some aspects of logical meaning, the models fall far short of human performance.
arXiv Detail & Related papers (2023-05-25T18:56:26Z)
- We're Afraid Language Models Aren't Modeling Ambiguity [136.8068419824318]
Managing ambiguity is a key part of human language understanding.
We characterize ambiguity in a sentence by its effect on entailment relations with another sentence.
We show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity.
arXiv Detail & Related papers (2023-04-27T17:57:58Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm to further discover the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- RAW-C: Relatedness of Ambiguous Words--in Context (A New Lexical Resource for English) [2.792030485253753]
We evaluate how well contextualized embeddings accommodate the continuous, dynamic nature of word meaning.
We show that cosine distance systematically underestimates how similar humans find uses of the same sense of a word to be.
We propose a synthesis between psycholinguistic theories of the mental lexicon and computational models of lexical semantics.
arXiv Detail & Related papers (2021-05-27T16:07:13Z)
- "I'd rather just go to bed": Understanding Indirect Answers [61.234722570671686]
We revisit a pragmatic inference problem in dialog: understanding indirect responses to questions.
We create and release the first large-scale English language corpus 'Circa' with 34,268 (polar question, indirect answer) pairs.
We present BERT-based neural models to predict such categories for a question-answer pair.
arXiv Detail & Related papers (2020-10-07T14:41:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.