UZH_CLyp at SemEval-2023 Task 9: Head-First Fine-Tuning and ChatGPT Data
Generation for Cross-Lingual Learning in Tweet Intimacy Prediction
- URL: http://arxiv.org/abs/2303.01194v2
- Date: Mon, 24 Apr 2023 12:19:56 GMT
- Title: UZH_CLyp at SemEval-2023 Task 9: Head-First Fine-Tuning and ChatGPT Data
Generation for Cross-Lingual Learning in Tweet Intimacy Prediction
- Authors: Andrianos Michail, Stefanos Konstantinou, Simon Clematide
- Abstract summary: This paper describes the submission of UZH_CLyp for the SemEval 2023 Task 9 "Multilingual Tweet Intimacy Analysis".
We achieved second-best results in all 10 languages according to the official Pearson's correlation regression evaluation measure.
- Score: 3.1798318618973362
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes the submission of UZH_CLyp for the SemEval 2023 Task 9
"Multilingual Tweet Intimacy Analysis". We achieved second-best results in all
10 languages according to the official Pearson's correlation regression
evaluation measure. Our cross-lingual transfer learning approach explores the
benefits of using a Head-First Fine-Tuning method (HeFiT) that first updates
only the regression head parameters and then also updates the pre-trained
transformer encoder parameters at a reduced learning rate. Additionally, we
study the impact of using a small set of automatically generated examples (in
our case, from ChatGPT) for low-resource settings where no human-labeled data
is available. Our study shows that HeFiT stabilizes training and consistently
improves results for pre-trained models that lack domain adaptation to tweets.
Our study also shows a noticeable performance increase in cross-lingual
learning when synthetic data is used, confirming the usefulness of current text
generation systems to improve zero-shot baseline results. Finally, we examine
how possible inconsistencies in the annotated data contribute to cross-lingual
interference issues.
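To make the two-stage procedure concrete, the following is a minimal PyTorch/Hugging Face sketch of Head-First Fine-Tuning as described above: the regression head is trained first with the encoder frozen, then the encoder is unfrozen and training continues at a reduced learning rate. The encoder name, learning rates, epoch counts, and MSE loss are illustrative assumptions, not the authors' exact configuration.

```python
import torch
from torch import nn
from transformers import AutoModel

class IntimacyRegressor(nn.Module):
    """Transformer encoder with a single linear regression head."""
    def __init__(self, model_name="xlm-roberta-base"):  # encoder choice is an assumption
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.head = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = hidden.last_hidden_state[:, 0]   # first-token representation
        return self.head(cls).squeeze(-1)      # predicted intimacy score per tweet

def run_epochs(model, loader, optimizer, epochs):
    loss_fn = nn.MSELoss()  # illustrative regression loss
    model.train()
    for _ in range(epochs):
        for input_ids, attention_mask, labels in loader:
            optimizer.zero_grad()
            preds = model(input_ids, attention_mask)
            loss_fn(preds, labels).backward()
            optimizer.step()

def head_first_fine_tune(model, loader, head_lr=1e-3, encoder_lr=1e-5):
    # Phase 1: freeze the encoder and update only the regression head parameters.
    for p in model.encoder.parameters():
        p.requires_grad = False
    run_epochs(model, loader,
               torch.optim.AdamW(model.head.parameters(), lr=head_lr), epochs=1)
    # Phase 2: unfreeze the encoder and continue training all parameters
    # at a reduced learning rate.
    for p in model.encoder.parameters():
        p.requires_grad = True
    run_epochs(model, loader,
               torch.optim.AdamW(model.parameters(), lr=encoder_lr), epochs=3)
```

Predictions from such a model would then be scored against the gold labels with Pearson's correlation (e.g., scipy.stats.pearsonr), the official evaluation measure mentioned in the abstract.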
Related papers
- From Text to Treatment Effects: A Meta-Learning Approach to Handling Text-Based Confounding [7.5348062792]
This paper examines the performance of meta-learners when confounding variables are expressed in text.
We show that learners using pre-trained text representations of confounders achieve improved CATE estimates.
Due to the entangled nature of the text embeddings, these models do not fully match the performance of meta-learners with perfect confounder knowledge.
arXiv Detail & Related papers (2024-09-23T19:46:19Z)
- Improving Sampling Methods for Fine-tuning SentenceBERT in Text Streams [49.3179290313959]
This study explores the efficacy of seven text sampling methods designed to selectively fine-tune language models.
We precisely assess the impact of these methods on fine-tuning the SBERT model using four different loss functions.
Our findings indicate that Softmax loss and Batch All Triplets loss are particularly effective for text stream classification.
arXiv Detail & Related papers (2024-03-18T23:41:52Z)
- Influence Scores at Scale for Efficient Language Data Sampling [3.072340427031969]
"influence scores" are used to identify important subsets of data.
In this paper, we explore the applicability of influence scores in language classification tasks.
arXiv Detail & Related papers (2023-11-27T20:19:22Z)
- Strategies for improving low resource speech to text translation relying on pre-trained ASR models [59.90106959717875]
This paper presents techniques and findings for improving the performance of low-resource speech-to-text translation (ST).
We conducted experiments on both simulated and real low-resource setups, on the language pairs English-Portuguese and Tamasheq-French, respectively.
arXiv Detail & Related papers (2023-05-31T21:58:07Z)
- WADER at SemEval-2023 Task 9: A Weak-labelling framework for Data augmentation in tExt Regression Tasks [4.102007186133394]
In this paper, we propose a novel weak-labeling strategy for data augmentation in text regression tasks called WADER.
We benchmark the performance of state-of-the-art pre-trained multilingual language models using WADER and analyze the use of sampling techniques to mitigate bias in data.
arXiv Detail & Related papers (2023-03-05T19:45:42Z)
- Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
- A study on the efficacy of model pre-training in developing neural text-to-speech system [55.947807261757056]
This study aims to understand better why and how model pre-training can positively contribute to TTS system performance.
It is found that the TTS system could achieve comparable performance when the pre-training data is reduced to 1/8 of its original size.
arXiv Detail & Related papers (2021-10-08T02:09:28Z)
- How much pretraining data do language models need to learn syntax? [12.668478784932878]
Transformers-based pretrained language models achieve outstanding results in many well-known NLU benchmarks.
We study the impact of pretraining data size on the knowledge of the models using RoBERTa.
arXiv Detail & Related papers (2021-09-07T15:51:39Z)
- On the Language Coverage Bias for Neural Machine Translation [81.81456880770762]
Language coverage bias is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
By carefully designing experiments, we provide comprehensive analyses of the language coverage bias in the training data.
We propose two simple and effective approaches to alleviate the language coverage bias problem.
arXiv Detail & Related papers (2021-06-07T01:55:34Z)
- Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models to downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
arXiv Detail & Related papers (2020-04-29T14:07:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.