Examining and Adapting Time for Multilingual Classification via Mixture of Temporal Experts
- URL: http://arxiv.org/abs/2502.08825v1
- Date: Wed, 12 Feb 2025 22:30:18 GMT
- Title: Examining and Adapting Time for Multilingual Classification via Mixture of Temporal Experts
- Authors: Weisi Liu, Guangzeng Han, Xiaolei Huang,
- Abstract summary: We develop a framework to generalize classifiers over time on multiple languages.<n>Our analysis shows classification performance varies over time across different languages.<n>Our study provides analytic insights and addresses the need for time-aware models.
- Score: 4.796752450839119
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Time is implicitly embedded in classification process: classifiers are usually built on existing data while to be applied on future data whose distributions (e.g., label and token) may change. However, existing state-of-the-art classification models merely consider the temporal variations and primarily focus on English corpora, which leaves temporal studies less explored, let alone under multilingual settings. In this study, we fill the gap by treating time as domains (e.g., 2024 vs. 2025), examining temporal effects, and developing a domain adaptation framework to generalize classifiers over time on multiple languages. Our framework proposes Mixture of Temporal Experts (MoTE) to leverage both semantic and data distributional shifts to learn and adapt temporal trends into classification models. Our analysis shows classification performance varies over time across different languages, and we experimentally demonstrate that MoTE can enhance classifier generalizability over temporal data shifts. Our study provides analytic insights and addresses the need for time-aware models that perform robustly in multilingual scenarios.
Related papers
- TimesBERT: A BERT-Style Foundation Model for Time Series Understanding [72.64824086839631]
GPT-style models have been positioned as foundation models for time series forecasting.
BERT-style architecture has not been fully unlocked for time series understanding.
We design TimesBERT to learn generic representations of time series.
Our model is pre-trained on 260 billion time points across diverse domains.
arXiv Detail & Related papers (2025-02-28T17:14:44Z) - Large Language Models For Text Classification: Case Study And Comprehensive Review [0.3428444467046467]
We evaluate the performance of different Large Language Models (LLMs) in comparison with state-of-the-art deep-learning and machine-learning models.<n>Our work reveals significant variations in model responses based on the prompting strategies.
arXiv Detail & Related papers (2025-01-14T22:02:38Z) - VSFormer: Value and Shape-Aware Transformer with Prior-Enhanced Self-Attention for Multivariate Time Series Classification [47.92529531621406]
We propose a novel method, VSFormer, that incorporates both discriminative patterns (shape) and numerical information (value)<n>In addition, we extract class-specific prior information derived from supervised information to enrich the positional encoding.<n>Extensive experiments on all 30 UEA archived datasets demonstrate the superior performance of our method compared to SOTA models.
arXiv Detail & Related papers (2024-12-21T07:31:22Z) - LLMTemporalComparator: A Tool for Analysing Differences in Temporal Adaptations of Large Language Models [17.021220773165016]
This study addresses the challenges of analyzing temporal discrepancies in large language models (LLMs) trained on data from different time periods.
We propose a novel system that compares in a systematic way the outputs of two LLM versions based on user-defined queries.
arXiv Detail & Related papers (2024-10-05T15:17:07Z) - A Systematic Analysis on the Temporal Generalization of Language Models in Social Media [12.035331011654078]
This paper focuses on temporal shifts in social media and, in particular, Twitter.
We propose a unified evaluation scheme to assess the performance of language models (LMs) under temporal shift.
arXiv Detail & Related papers (2024-05-15T05:41:06Z) - How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study [59.13867562744973]
This work systematically assesses LMs' capabilities for out-of-distribution (OOD) scenarios.
We find that the efficacy of such learning paradigms varies with the type of OOD.
Specifically, while ICL excels for domain shifts, prompt-based fine-tuning surpasses for topic shifts.
arXiv Detail & Related papers (2023-09-15T11:15:47Z) - Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating
Generalization Capacity of Language Models [18.874880342410876]
We present Jamp, a Japanese benchmark focused on temporal inference.
Our dataset includes a range of temporal inference patterns, which enables us to conduct fine-grained analysis.
We evaluate the generalization capacities of monolingual/multilingual LMs by splitting our dataset based on tense fragments.
arXiv Detail & Related papers (2023-06-19T07:00:14Z) - IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and
Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z) - On Cross-Lingual Retrieval with Multilingual Text Encoders [51.60862829942932]
We study the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
We benchmark their performance in unsupervised ad-hoc sentence- and document-level CLIR experiments.
We evaluate multilingual encoders fine-tuned in a supervised fashion (i.e., we learn to rank) on English relevance data in a series of zero-shot language and domain transfer CLIR experiments.
arXiv Detail & Related papers (2021-12-21T08:10:27Z) - Linguistic Typology Features from Text: Inferring the Sparse Features of
World Atlas of Language Structures [73.06435180872293]
We construct a recurrent neural network predictor based on byte embeddings and convolutional layers.
We show that some features from various linguistic types can be predicted reliably.
arXiv Detail & Related papers (2020-04-30T21:00:53Z) - Bridging Linguistic Typology and Multilingual Machine Translation with
Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.