Temporal Effects on Pre-trained Models for Language Processing Tasks
- URL: http://arxiv.org/abs/2111.12790v1
- Date: Wed, 24 Nov 2021 20:44:12 GMT
- Title: Temporal Effects on Pre-trained Models for Language Processing Tasks
- Authors: Oshin Agarwal and Ani Nenkova
- Abstract summary: We present a set of experiments with systems powered by large neural pretrained representations for English to demonstrate that temporal model deterioration is not as big a concern.
It is however the case that temporal domain adaptation is beneficial, with better performance for a given time period possible when the system is trained on temporally more recent data.
- Score: 9.819970078135343
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keeping the performance of language technologies optimal as time passes is of
great practical interest. Here we survey prior work concerned with the effect
of time on system performance, establishing more nuanced terminology for
discussing the topic and proper experimental design to support solid
conclusions about the observed phenomena. We present a set of experiments with
systems powered by large neural pretrained representations for English to
demonstrate that temporal model deterioration is not as big a concern,
with some models in fact improving when tested on data drawn from a later time
period. It is, however, the case that temporal domain adaptation is
beneficial, with better performance for a given time period possible when the
system is trained on temporally more recent data. Our experiments reveal that
the distinction between temporal model deterioration and temporal domain
adaptation becomes salient for systems built upon pretrained representations.
Finally we examine the efficacy of two approaches for temporal domain
adaptation without human annotations on new data, with self-labeling proving to
be superior to continual pre-training. Notably, for named entity recognition,
self-labeling leads to better temporal adaptation than human annotation.
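As a concrete illustration of the self-labeling approach discussed above, the sketch below shows the generic recipe: a model trained on human-annotated data from an earlier period pseudo-labels unannotated text from a later period, and a new model is then trained on those pseudo-labels. This is a minimal stand-in, not the authors' exact setup; the scikit-learn pipeline and the toy examples are placeholders for the large pretrained neural representations and tasks (e.g., named entity recognition) used in the paper.

```python
# Minimal sketch of self-labeling for temporal domain adaptation.
# A TF-IDF + logistic-regression classifier stands in for the pretrained
# neural models used in the paper; the procedure itself is model-agnostic:
#   (1) train on human-annotated data from an earlier period,
#   (2) pseudo-label unannotated text from a later period,
#   (3) retrain on the old data plus the pseudo-labeled recent data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical data: labeled texts from an earlier period and
# unlabeled texts drawn from a later period.
old_texts = ["stocks rallied after the 2015 earnings report",
             "the senator campaigned across the state in 2014"]
old_labels = ["business", "politics"]
new_texts = ["crypto tokens surged in 2021",
             "the new minister resigned amid protests in 2021"]

# Step 1: train the source-period model on human annotations.
source_model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
source_model.fit(old_texts, old_labels)

# Step 2: self-label the temporally more recent, unannotated data.
pseudo_labels = source_model.predict(new_texts)

# Step 3: retrain on old annotations plus pseudo-labeled recent data,
# adapting to the newer time period without new human annotation.
adapted_model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
adapted_model.fit(list(old_texts) + list(new_texts),
                  list(old_labels) + list(pseudo_labels))
```

The same loop applies when the classifier is a fine-tuned pretrained transformer; only steps 1 and 3 change from pipeline fitting to fine-tuning on the respective label sets.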
Related papers
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z) - Probing the Robustness of Time-series Forecasting Models with
CounterfacTS [1.823020744088554]
We present and publicly release CounterfacTS, a tool to probe the robustness of deep learning models in time-series forecasting tasks.
CounterfacTS has a user-friendly interface that allows the user to visualize, compare and quantify time series data and their forecasts.
arXiv Detail & Related papers (2024-03-06T07:34:47Z) - Revisiting Dynamic Evaluation: Online Adaptation for Large Language
Models [88.47454470043552]
We consider the problem of online fine-tuning of the parameters of a language model at test time, also known as dynamic evaluation.
Online adaptation turns parameters into temporally changing states and provides a form of context-length extension with memory in weights.
arXiv Detail & Related papers (2024-03-03T14:03:48Z) - Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Unlike earlier settings with instantaneous input-output relationships, diffusion models operate over a sequence of timesteps.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on the timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z) - Instructed Diffuser with Temporal Condition Guidance for Offline
Reinforcement Learning [71.24316734338501]
We propose an effective temporally-conditional diffusion model, coined the Temporally-Composable Diffuser (TCD).
TCD extracts temporal information from interaction sequences and explicitly guides generation with temporal conditions.
Our method reaches or matches the best performance compared with prior SOTA baselines.
arXiv Detail & Related papers (2023-06-08T02:12:26Z) - Meta-Auxiliary Learning for Adaptive Human Pose Prediction [26.877194503491072]
Predicting high-fidelity future human poses is decisive for intelligent robots to interact with humans.
Deep end-to-end learning approaches, which typically train a generic pre-trained model on external datasets and then directly apply it to all test samples, remain non-optimal.
We propose a novel test-time adaptation framework that leverages two self-supervised auxiliary tasks to help the primary forecasting network adapt to the test sequence.
arXiv Detail & Related papers (2023-04-13T11:17:09Z) - Time Series Contrastive Learning with Information-Aware Augmentations [57.45139904366001]
A key component of contrastive learning is to select appropriate augmentations imposing some priors to construct feasible positive samples.
How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question.
We propose a new contrastive learning approach with information-aware augmentations, InfoTS, that adaptively selects optimal augmentations for time series representation learning.
arXiv Detail & Related papers (2023-03-21T15:02:50Z) - Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z) - Time Waits for No One! Analysis and Challenges of Temporal Misalignment [42.106972477571226]
We establish a suite of eight diverse tasks across different domains to quantify the effects of temporal misalignment.
We find stronger effects of temporal misalignment on task performance than have been previously reported.
Our findings motivate continued research to improve temporal robustness of NLP models.
arXiv Detail & Related papers (2021-11-14T18:29:19Z) - Opinions are Made to be Changed: Temporally Adaptive Stance
Classification [9.061088449712859]
We introduce two novel large-scale, longitudinal stance datasets.
We evaluate the performance persistence of stance classifiers over time and demonstrate how it decays as the temporal gap between training and testing data increases.
We propose and compare several approaches to embedding adaptation and find that the Incremental Temporal Alignment (ITA) model leads to the best results in reducing performance drop over time.
arXiv Detail & Related papers (2021-08-27T19:47:31Z) - An Enhanced Adversarial Network with Combined Latent Features for
Spatio-Temporal Facial Affect Estimation in the Wild [1.3007851628964147]
This paper proposes a novel model that efficiently extracts both spatial and temporal features of the data by means of its enhanced temporal modelling based on latent features.
Our proposed model consists of three major networks, coined Generator, Discriminator, and Combiner, which are trained in an adversarial setting combined with curriculum learning to enable our adaptive attention modules.
arXiv Detail & Related papers (2021-02-18T04:10:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.