The DURel Annotation Tool: Human and Computational Measurement of
Semantic Proximity, Sense Clusters and Semantic Change
- URL: http://arxiv.org/abs/2311.12664v2
- Date: Mon, 5 Feb 2024 12:50:23 GMT
- Title: The DURel Annotation Tool: Human and Computational Measurement of
Semantic Proximity, Sense Clusters and Semantic Change
- Authors: Dominik Schlechtweg, Shafqat Mumtaz Virk, Pauline Sander, Emma
Sköldberg, Lukas Theuer Linke, Tuo Zhang, Nina Tahmasebi, Jonas Kuhn,
Sabine Schulte im Walde
- Abstract summary: The DURel tool implements the annotation of semantic proximity between uses of words into an online, open source interface.
The tool supports standardized human annotation as well as computational annotation, building on recent advances with Word-in-Context models.
- Score: 13.80701224074806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the DURel tool that implements the annotation of semantic
proximity between uses of words into an online, open source interface. The tool
supports standardized human annotation as well as computational annotation,
building on recent advances with Word-in-Context models. Annotator judgments
are clustered with automatic graph clustering techniques and visualized for
analysis. This makes it possible to measure word senses with simple and intuitive
micro-task judgments between use pairs, requiring minimal preparation effort.
The tool offers additional functionalities to compare the agreement between
annotators to guarantee the inter-subjectivity of the obtained judgments and to
calculate summary statistics giving insights into sense frequency
distributions, semantic variation or changes of senses over time.
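The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the tool's actual implementation: it assumes judgments on a 1 to 4 proximity scale, a hypothetical threshold of 2.5 separating "close" from "distant" pairs, and it swaps in plain connected components for whatever graph clustering technique the tool applies. The function name `cluster_uses` and the use identifiers are invented for illustration.

```python
from collections import defaultdict
from statistics import median


def cluster_uses(judgments, threshold=2.5):
    """Group word uses into clusters from pairwise proximity judgments.

    judgments: iterable of (use_a, use_b, score) tuples; the 1-4 scale and
    the 2.5 threshold are assumptions made for this sketch.
    """
    # Collect all scores per unordered use pair (several annotators may
    # have judged the same pair).
    scores = defaultdict(list)
    uses = set()
    for a, b, score in judgments:
        uses.update((a, b))
        scores[frozenset((a, b))].append(score)

    # Keep an edge only when the median judgment exceeds the threshold.
    neighbours = defaultdict(set)
    for pair, values in scores.items():
        if len(pair) == 2 and median(values) > threshold:
            a, b = tuple(pair)
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Depth-first search over the remaining edges; each connected
    # component is treated as one sense cluster.
    clusters, seen = [], set()
    for start in sorted(uses):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical judgments: u1-u3 are judged close to each other, as are u4-u5.
judgments = [
    ("u1", "u2", 4), ("u1", "u2", 3),
    ("u2", "u3", 4),
    ("u3", "u4", 1),
    ("u4", "u5", 4),
    ("u1", "u4", 2),
]
clusters = cluster_uses(judgments)
```

Cluster sizes computed this way give a rough sense frequency distribution; comparing them across time periods is one way to quantify the changes of senses that the abstract mentions.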
Related papers
- Graph-based Clustering for Detecting Semantic Change Across Time and
Languages [10.058655884092094]
We propose a graph-based clustering approach to capture nuanced changes in both high- and low-frequency word senses across time and languages.
Our approach substantially surpasses previous approaches in the SemEval 2020 binary classification task across four languages.
arXiv Detail & Related papers (2024-02-01T21:27:19Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics
Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- Unsupervised Semantic Variation Prediction using the Distribution of
Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications.
We argue that mean representations alone cannot accurately capture such semantic variations.
We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
- Measuring the Interpretability of Unsupervised Representations via
Quantized Reverse Probing [97.70862116338554]
We investigate the problem of measuring interpretability of self-supervised representations.
We formulate the latter as estimating the mutual information between the representation and a space of manually labelled concepts.
We use our method to evaluate a large number of self-supervised representations, ranking them by interpretability.
arXiv Detail & Related papers (2022-09-07T16:18:50Z)
- Visual Comparison of Language Model Adaptation [55.92129223662381]
Adapters are lightweight alternatives for model adaptation.
In this paper, we discuss several design alternatives for interactive, comparative visual explanation methods.
We show that, for instance, an adapter trained on the language debiasing task according to context-0 embeddings introduces a new type of bias.
arXiv Detail & Related papers (2022-08-17T09:25:28Z)
- FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality
Assessment [93.09267863425492]
We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable.
We construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures.
arXiv Detail & Related papers (2022-04-07T17:59:32Z)
- Analysis of Joint Speech-Text Embeddings for Semantic Matching [3.6423306784901235]
We study a joint speech-text embedding space trained for semantic matching by minimizing the distance between paired utterance and transcription inputs.
We extend our method to incorporate automatic speech recognition through both pretraining and multitask scenarios.
arXiv Detail & Related papers (2022-04-04T04:50:32Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- Improving Text Generation Evaluation with Batch Centering and Tempered
Word Mover Distance [24.49032191669509]
We present two techniques for improving encoding representations for similarity metrics.
We show results over various BERT-backbone learned metrics and achieve state-of-the-art correlation with human ratings on several benchmarks.
arXiv Detail & Related papers (2020-10-13T03:46:25Z)
- Constructing interval variables via faceted Rasch measurement and
multitask deep learning: a hate speech application [63.10266319378212]
We propose a method for measuring complex variables on a continuous, interval spectrum by combining supervised deep learning with the Constructing Measures approach to faceted Rasch item response theory (IRT).
We demonstrate this new method on a dataset of 50,000 social media comments sourced from YouTube, Twitter, and Reddit and labeled by 11,000 U.S.-based Amazon Mechanical Turk workers.
arXiv Detail & Related papers (2020-09-22T02:15:05Z)
- Analysing Lexical Semantic Change with Contextualised Word
Representations [7.071298726856781]
We propose a novel method that exploits the BERT neural language model to obtain representations of word usages.
We create a new evaluation dataset and show that the model representations and the detected semantic shifts are positively correlated with human judgements.
arXiv Detail & Related papers (2020-04-29T12:18:14Z)