Team ÚFAL at CMCL 2022 Shared Task: Figuring out the correct recipe for predicting Eye-Tracking features using Pretrained Language Models
- URL: http://arxiv.org/abs/2204.04998v1
- Date: Mon, 11 Apr 2022 10:43:34 GMT
- Title: Team ÚFAL at CMCL 2022 Shared Task: Figuring out the correct recipe for predicting Eye-Tracking features using Pretrained Language Models
- Authors: Sunit Bhattacharya, Rishu Kumar and Ondrej Bojar
- Abstract summary: We describe our systems for the CMCL 2022 shared task on predicting eye-tracking information.
Our submissions achieved an average MAE of 5.72 and ranked 5th in the shared task.
- Score: 9.087729124428467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Eye-Tracking data is a very useful source of information to study cognition
and especially language comprehension in humans. In this paper, we describe our
systems for the CMCL 2022 shared task on predicting eye-tracking information.
We describe our experiments with pretrained models like BERT and XLM and the
different ways in which we used those representations to predict four
eye-tracking features. Along with analysing the effect of using two different
kinds of pretrained multilingual language models and different ways of pooling
the token-level representations, we also explore how contextual information
affects the performance of the systems. Finally, we explore whether factors
such as augmenting the input with linguistic information affect the predictions. Our submissions
achieved an average MAE of 5.72 and ranked 5th in the shared task. The average
MAE was further reduced to 5.25 in the post-task evaluation.
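  To make the recipe concrete, below is a minimal sketch of the general setup the abstract describes (not the authors' released code): encode words with a pretrained multilingual model, pool the sub-word pieces of each word into a single vector, and regress four eye-tracking features from it. The model name, the mean-pooling choice, and the untrained regression head are illustrative assumptions.

```python
# Minimal sketch of the general recipe, NOT the authors' released code.
# Assumptions: mBERT as the encoder (XLM-R could be swapped in), mean pooling
# over sub-word pieces, and an untrained linear regression head.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"  # illustrative choice
NUM_EYE_FEATURES = 4                         # the four eye-tracking targets

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
regressor = torch.nn.Linear(encoder.config.hidden_size, NUM_EYE_FEATURES)

def predict_word_features(words):
    """Return one (NUM_EYE_FEATURES,)-dim prediction per input word."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state[0]  # (num_subwords, hidden)
    word_ids = enc.word_ids()                         # sub-word -> word index
    preds = []
    for w in range(len(words)):
        pieces = [i for i, wid in enumerate(word_ids) if wid == w]
        word_vec = hidden[pieces].mean(dim=0)         # mean-pool the pieces
        preds.append(regressor(word_vec))
    return torch.stack(preds)                         # (num_words, NUM_EYE_FEATURES)

print(predict_word_features(["Eye-tracking", "data", "is", "useful", "."]).shape)
```

  In the actual systems, the regression head would be trained on the shared-task data (e.g. with an L1/MAE objective matching the reported metric), and the variants mentioned in the abstract swap the encoder (BERT vs. XLM), the pooling strategy, the amount of surrounding context, and optional linguistic features.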
Related papers
- Multilingual Diversity Improves Vision-Language Representations [66.41030381363244]
  Pre-training on this dataset outperforms using English-only or English-dominated datasets on ImageNet.
  On a geographically diverse task like GeoDE, we also observe improvements across all regions, with the biggest gain coming from Africa.
  arXiv Detail & Related papers (2024-05-27T08:08:51Z)
- Linguistic Knowledge Can Enhance Encoder-Decoder Models (If You Let It) [2.6150740794754155]
  We investigate whether fine-tuning a T5 model on an intermediate task that predicts structural linguistic properties of sentences modifies its performance in the target task of predicting sentence-level complexity.
  Results obtained for both languages and in cross-lingual configurations show that linguistically motivated intermediate fine-tuning generally has a positive impact on target task performance, especially when applied to smaller models and in scenarios with limited data availability.
  arXiv Detail & Related papers (2024-02-27T15:34:15Z)
- What Makes Pre-Trained Visual Representations Successful for Robust Manipulation? [57.92924256181857]
  We find that visual representations designed for manipulation and control tasks do not necessarily generalize under subtle changes in lighting and scene texture.
  We find that emergent segmentation ability is a strong predictor of out-of-distribution generalization among ViT models.
  arXiv Detail & Related papers (2023-11-03T18:09:08Z)
- Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models [57.08925810659545]
  We conduct a comparative analysis of the visual representations in existing vision-and-language models and vision-only models.
  Our empirical observations suggest that vision-and-language models are better at label prediction tasks.
  We hope our study sheds light on the role of language in visual learning, and serves as an empirical guide for various pretrained models.
  arXiv Detail & Related papers (2022-12-01T05:00:18Z)
- CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment [146.3128011522151]
  We propose an Omni Crossmodal Learning method equipped with a Video Proxy mechanism built on top of CLIP, namely CLIP-ViP.
  Our approach improves the performance of CLIP on video-text retrieval by a large margin.
  Our model also achieves SOTA results on a variety of datasets, including MSR-VTT, DiDeMo, LSMDC, and ActivityNet.
  arXiv Detail & Related papers (2022-09-14T05:47:02Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
  Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
  We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
  arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- Zero and Few-shot Learning for Author Profiling [4.208594148115529]
  Author profiling classifies author characteristics by analyzing how language is shared among people.
  We explore different zero and few-shot models based on entailment and evaluate our systems on several profiling tasks in Spanish and English.
  arXiv Detail & Related papers (2022-04-22T07:22:37Z)
- UMass PCL at SemEval-2022 Task 4: Pre-trained Language Model Ensembles for Detecting Patronizing and Condescending Language [0.0]
  Patronizing and condescending language (PCL) is everywhere, but its use by the media towards vulnerable communities is rarely the focus of study.
  In this paper, we describe our system for detecting such language, which was submitted to SemEval 2022 Task 4: Patronizing and Condescending Language Detection.
  arXiv Detail & Related papers (2022-04-18T13:22:10Z)
- Zero Shot Crosslingual Eye-Tracking Data Prediction using Multilingual Transformer Models [0.0]
  We describe our submission to the CMCL 2022 shared task on predicting human reading patterns for a multilingual dataset.
  Our model uses text representations from transformers and some hand-engineered features with a regression layer on top to predict statistical measures of mean and standard deviation.
  We train an end-to-end model to extract meaningful information from different languages and test our model on two separate datasets.
  arXiv Detail & Related papers (2022-03-30T17:11:48Z)
- TorontoCL at CMCL 2021 Shared Task: RoBERTa with Multi-Stage Fine-Tuning for Eye-Tracking Prediction [25.99947358445936]
  We describe our submission to the CMCL 2021 shared task on predicting human reading patterns.
  Our model uses RoBERTa with a regression layer to predict 5 eye-tracking features.
  Our final submission achieves a MAE score of 3.929, ranking 3rd place out of 13 teams that participated in this task.
  arXiv Detail & Related papers (2021-04-15T05:29:13Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
  Multi-task learning (MTL) techniques have shown promising results with respect to performance, computation, and/or memory footprint.
  We provide a well-rounded view of state-of-the-art deep learning approaches for MTL in computer vision.
  arXiv Detail & Related papers (2020-04-28T09:15:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.