LSCDiscovery: A shared task on semantic change discovery and detection
in Spanish
- URL: http://arxiv.org/abs/2205.06691v1
- Date: Fri, 13 May 2022 14:52:18 GMT
- Title: LSCDiscovery: A shared task on semantic change discovery and detection
in Spanish
- Authors: Frank D. Zamora-Reina, Felipe Bravo-Marquez, Dominik Schlechtweg
- Abstract summary: We present the first shared task on semantic change discovery and detection in Spanish.
We create the first dataset of Spanish words manually annotated for semantic change using the DURel framework.
We describe the systems developed by the competing teams, highlighting the techniques that were particularly useful, and discuss the limits of these approaches.
- Score: 12.85253662018234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present the first shared task on semantic change discovery and detection
in Spanish and create the first dataset of Spanish words manually annotated for
semantic change using the DURel framework (Schlechtweg et al., 2018). The task
is divided into two phases: 1) Graded Change Discovery, and 2) Binary Change
Detection. In addition to introducing a new language, the main novelty with
respect to previous tasks consists in predicting and evaluating changes for
all vocabulary words in the corpus. Six teams participated in phase 1 and seven
teams in phase 2 of the shared task, and the best system obtained a Spearman
rank correlation of 0.735 for phase 1 and an F1 score of 0.716 for phase 2. We
describe the systems developed by the competing teams, highlighting the
techniques that were particularly useful, and discuss the limits of these
approaches.
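The two metrics named in the abstract can be illustrated with a minimal sketch: Spearman rank correlation for the graded phase and F1 for the binary phase. The scores and labels below are invented toy values, not task data, and the helper names (`average_ranks`, `spearman`, `f1_score`) are hypothetical; this is not the organizers' evaluation script.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(gold, pred):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    rg, rp = average_ranks(gold), average_ranks(pred)
    n = len(rg)
    mg, mp = sum(rg) / n, sum(rp) / n
    cov = sum((a - mg) * (b - mp) for a, b in zip(rg, rp))
    sg = sqrt(sum((a - mg) ** 2 for a in rg))
    sp = sqrt(sum((b - mp) ** 2 for b in rp))
    return cov / (sg * sp)


def f1_score(gold, pred):
    """F1 for the positive class (1 = changed, 0 = unchanged)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data for four hypothetical target words (illustrative values only).
gold_graded = [0.1, 0.8, 0.3, 0.6]   # annotated degree of change
pred_graded = [0.2, 0.7, 0.5, 0.4]   # system scores
gold_binary = [0, 1, 0, 1]           # annotated changed / unchanged
pred_binary = [0, 1, 1, 1]           # system decisions

print(f"Spearman (graded phase): {spearman(gold_graded, pred_graded):.3f}")
print(f"F1 (binary phase): {f1_score(gold_binary, pred_binary):.3f}")
```

Spearman correlation depends only on how the system orders the words, so a system's scores need not be on the same scale as the annotations; F1 instead rewards correct binary decisions on the changed class.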
Related papers
- Bag of Tricks for Effective Language Model Pretraining and Downstream
Adaptation: A Case Study on GLUE [93.98660272309974]
This report briefly describes our submission Vega v1 on the General Language Understanding Evaluation leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3-billion-parameter model sets a new state of the art on 4/9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z)
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual
Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restore task to bridge the gap between the pretrain and finetune stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
- Handshakes AI Research at CASE 2021 Task 1: Exploring different
approaches for multilingual tasks [0.22940141855172036]
The aim of the CASE 2021 Shared Task 1 was to detect and classify socio-political and crisis event information in a multilingual setting.
Our submission contained entries in all of the subtasks, and the scores obtained validated our research findings.
arXiv Detail & Related papers (2021-10-29T07:58:49Z)
- MCL@IITK at SemEval-2021 Task 2: Multilingual and Cross-lingual
Word-in-Context Disambiguation using Augmented Data, Signals, and
Transformers [1.869621561196521]
We present our approach for solving the SemEval 2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC).
The goal is to detect whether a given word common to both sentences evokes the same meaning.
We submit systems for both settings: Multilingual and Cross-Lingual.
arXiv Detail & Related papers (2021-04-04T08:49:28Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for
Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language
Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR).
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z)
- Cross-Lingual Transfer Learning for Complex Word Identification [0.3437656066916039]
Complex Word Identification (CWI) is a task centered on detecting hard-to-understand words in texts.
Our approach uses zero-shot, one-shot, and few-shot learning techniques, alongside state-of-the-art solutions for Natural Language Processing (NLP) tasks.
Our aim is to provide evidence that the proposed models can learn the characteristics of complex words in a multilingual environment.
arXiv Detail & Related papers (2020-10-02T17:09:47Z)
- NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching
language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets written in a blend of Spanish and English.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
- UPB at SemEval-2020 Task 9: Identifying Sentiment in Code-Mixed Social
Media Texts using Transformers and Multi-Task Learning [1.7196613099537055]
We describe the systems developed by our team for SemEval-2020 Task 9.
We aim to cover two well-known code-mixed languages: Hindi-English and Spanish-English.
Our approach achieves promising performance on the Hindi-English task, with an average F1-score of 0.6850.
For the Spanish-English task, we obtained an average F1-score of 0.7064, ranking our team 17th out of 29 participants.
arXiv Detail & Related papers (2020-09-06T17:19:18Z)
- SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection [10.606357227329822]
Evaluation is currently the most pressing problem in Lexical Semantic Change detection.
No gold standards are available to the community, which hinders progress.
We present the results of the first shared task that addresses this gap.
arXiv Detail & Related papers (2020-07-22T14:37:42Z)