Add Noise, Tasks, or Layers? MaiNLP at the VarDial 2025 Shared Task on Norwegian Dialectal Slot and Intent Detection
- URL: http://arxiv.org/abs/2501.03870v1
- Date: Tue, 07 Jan 2025 15:36:35 GMT
- Title: Add Noise, Tasks, or Layers? MaiNLP at the VarDial 2025 Shared Task on Norwegian Dialectal Slot and Intent Detection
- Authors: Verena Blaschke, Felicia Körner, Barbara Plank
- Abstract summary: Slot and intent detection is a classic natural language understanding task.
Many approaches for low-resource scenarios have not yet been applied to dialectal SID data.
We participate in the VarDial 2025 shared task on slot and intent detection in Norwegian varieties.
- Score: 22.89563355840371
- Abstract: Slot and intent detection (SID) is a classic natural language understanding task. Despite this, research has only more recently begun focusing on SID for dialectal and colloquial varieties. Many approaches for low-resource scenarios have not yet been applied to dialectal SID data, or compared to each other on the same datasets. We participate in the VarDial 2025 shared task on slot and intent detection in Norwegian varieties, and compare multiple set-ups: varying the training data (English, Norwegian, or dialectal Norwegian), injecting character-level noise, training on auxiliary tasks, and applying Layer Swapping, a technique in which layers of models fine-tuned on different datasets are assembled into a model. We find noise injection to be beneficial while the effects of auxiliary tasks are mixed. Though some experimentation was required to successfully assemble a model from layers, it worked surprisingly well; a combination of models trained on English and small amounts of dialectal data produced the most robust slot predictions. Our best models achieve 97.6% intent accuracy and 85.6% slot F1 in the shared task.
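To make the less familiar ingredients above concrete, the following is a minimal PyTorch sketch of character-level noise injection and of Layer Swapping; the noise operations, layer indices, and toy encoder are illustrative assumptions rather than the paper's actual configuration.

```python
import random
import torch
import torch.nn as nn

def inject_char_noise(text: str, rate: float = 0.1) -> str:
    """Randomly delete, duplicate, or swap adjacent characters. The set of
    operations and the rate are illustrative, not the paper's exact recipe."""
    chars, out, i = list(text), [], 0
    while i < len(chars):
        if random.random() < rate:
            op = random.choice(["delete", "duplicate", "swap"])
            if op == "duplicate":
                out.append(chars[i])
                out.append(chars[i])
            elif op == "swap" and i + 1 < len(chars):
                out.append(chars[i + 1])
                out.append(chars[i])
                i += 1
            # "delete" (or "swap" at the last position): emit nothing
        else:
            out.append(chars[i])
        i += 1
    return "".join(out)

class TinyEncoder(nn.Module):
    """Stand-in for a transformer encoder; only the 'layers.<i>.*'
    parameter naming matters for the swapping logic below."""
    def __init__(self, depth: int = 4, dim: int = 8):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

def layer_swap(model_a: nn.Module, model_b: nn.Module, take_from_b: set) -> dict:
    """Keep model_a's weights except for the layer indices in take_from_b,
    which come from model_b; both models must share an architecture."""
    merged = dict(model_a.state_dict())
    for name, tensor in model_b.state_dict().items():
        parts = name.split(".")
        if parts[0] == "layers" and int(parts[1]) in take_from_b:
            merged[name] = tensor
    return merged

enc_english = TinyEncoder()   # stands in for the English-fine-tuned model
enc_dialect = TinyEncoder()   # stands in for the dialect-fine-tuned model
assembled = TinyEncoder()
assembled.load_state_dict(layer_swap(enc_english, enc_dialect, {2, 3}))
print(inject_char_noise("kor mykje er klokka"))
```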
Related papers
- Improving Dialectal Slot and Intent Detection with Auxiliary Tasks: A Multi-Dialectal Bavarian Case Study [22.89563355840371]
We explore zero-shot transfer learning for slot and intent detection (SID).
We focus on multiple Bavarian dialects, for which we release a new dataset for the Munich dialect.
We evaluate models trained on auxiliary tasks in Bavarian, and compare joint multi-task learning with intermediate-task training.
We find that the included auxiliary tasks have a more positive effect on slot filling than on intent classification, and that intermediate-task training yields more consistent performance gains.
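As a rough illustration of the two training regimes compared here, the sketch below contrasts joint multi-task learning (one step on a weighted sum of losses) with intermediate-task training (an auxiliary phase first, then the main task); all dimensions, data, and the 0.5 loss weight are made up.

```python
import torch
import torch.nn as nn

# Toy shared encoder with a main head (slot filling) and an auxiliary head
# (e.g. POS tagging).
encoder, main_head, aux_head = nn.Linear(16, 32), nn.Linear(32, 5), nn.Linear(32, 7)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam([*encoder.parameters(), *main_head.parameters(),
                        *aux_head.parameters()])
x = torch.randn(4, 16)
y_main, y_aux = torch.randint(0, 5, (4,)), torch.randint(0, 7, (4,))

# Joint multi-task learning: one step on a weighted sum of both losses.
h = encoder(x)
(loss_fn(main_head(h), y_main) + 0.5 * loss_fn(aux_head(h), y_aux)).backward()
opt.step(); opt.zero_grad()

# Intermediate-task training: fit the auxiliary task first, then continue
# fine-tuning the same encoder on the main task in a separate phase.
loss_fn(aux_head(encoder(x)), y_aux).backward()    # phase 1: auxiliary only
opt.step(); opt.zero_grad()
loss_fn(main_head(encoder(x)), y_main).backward()  # phase 2: main task only
opt.step()
```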
arXiv Detail & Related papers (2025-01-07T15:21:07Z)
- HiTZ at VarDial 2025 NorSID: Overcoming Data Scarcity with Language Transfer and Automatic Data Annotation [5.989003825349711]
We present our submission for the NorSID Shared Task, consisting of three tasks: Intent Detection, Slot Filling and Dialect Identification.
For Intent Detection and Slot Filling, we fine-tuned a multitask model in a cross-lingual setting to leverage the xSID dataset, which is available in 17 languages.
In the case of Dialect Identification, our final submission is a model fine-tuned on the provided development set, which obtained the highest scores.
arXiv Detail & Related papers (2024-12-13T12:31:06Z)
- Robustification of Multilingual Language Models to Real-world Noise with Robust Contrastive Pretraining [14.087882550564169]
We assess the robustness of neural models on noisy data and find that existing improvements are limited to the English language.
To benchmark the performance of pretrained multilingual models, we construct noisy datasets covering five languages and four NLP tasks.
We propose Robust Contrastive Pretraining (RCP) to boost the zero-shot cross-lingual robustness of multilingual pretrained models.
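A generic contrastive objective of this flavor can be sketched as follows: an InfoNCE-style loss over clean/noisy sentence pairs under assumed embeddings, not necessarily RCP's exact formulation or hyperparameters.

```python
import torch
import torch.nn.functional as F

def contrastive_robustness_loss(clean_emb, noisy_emb, temperature=0.1):
    """Pull each sentence embedding toward its noised counterpart and away
    from the other sentences in the batch."""
    clean = F.normalize(clean_emb, dim=-1)
    noisy = F.normalize(noisy_emb, dim=-1)
    logits = clean @ noisy.t() / temperature   # pairwise cosine similarities
    targets = torch.arange(clean.size(0))      # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: a batch of 8 sentence embeddings and their noised versions.
loss = contrastive_robustness_loss(torch.randn(8, 128), torch.randn(8, 128))
```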
arXiv Detail & Related papers (2022-10-10T15:40:43Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) in average performance by a large margin in both few-shot and full-shot settings.
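The reformulation can be illustrated with a simple linearization of (aspect term, category, polarity) triples into a target string; the template below is an assumption, not the paper's actual output format.

```python
def absa_to_generation(sentence, triples):
    """Linearize (aspect term, category, polarity) triples into a flat
    target string that a generative LM can be trained to emit."""
    target = " ; ".join(f"aspect: {term} | category: {cat} | polarity: {pol}"
                        for term, cat, pol in triples)
    return {"source": sentence, "target": target}

example = absa_to_generation(
    "The pasta was great but the service was slow.",
    [("pasta", "food", "positive"), ("service", "service", "negative")],
)
# target: 'aspect: pasta | category: food | polarity: positive ;
#          aspect: service | category: service | polarity: negative'
```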
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- Sequence-level self-learning with multiple hypotheses [53.04725240411895]
We develop new self-learning techniques with an attention-based sequence-to-sequence (seq2seq) model for automatic speech recognition (ASR).
In contrast to conventional unsupervised learning approaches, we adopt the multi-task learning (MTL) framework.
Our experimental results show that our method reduces the WER on British speech data from 14.55% to 10.36% compared to the baseline model trained on US English data only.
arXiv Detail & Related papers (2021-12-10T20:47:58Z)
- Intent Classification Using Pre-Trained Embeddings For Low Resource Languages [67.40810139354028]
Building Spoken Language Understanding systems that do not rely on language-specific Automatic Speech Recognition is an important yet under-explored problem in language processing.
We present a comparative study aimed at employing a pre-trained acoustic model to perform Spoken Language Understanding in low resource scenarios.
We perform experiments across three different languages: English, Sinhala, and Tamil, each with different data sizes to simulate high-, medium-, and low-resource scenarios.
arXiv Detail & Related papers (2021-10-18T13:06:59Z)
- From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding [24.149299722716155]
We introduce xSID, a new benchmark for cross-lingual Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect.
We propose a joint learning approach, with English SLU training data and non-English auxiliary tasks from raw text, syntax and translation for transfer.
Our results show that jointly learning the main tasks with masked language modeling is effective for slots, while machine translation transfer works best for intent classification.
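For the masked-language-modeling auxiliary task, a standard BERT-style masking routine over raw target-language text looks roughly like this; the 80/10/10 rates follow the common recipe and may not match the paper's settings.

```python
import random
import torch

def mask_tokens(token_ids, mask_id, vocab_size, mlm_prob=0.15):
    """Of the tokens selected for prediction, 80% become [MASK], 10% a
    random token, and 10% stay unchanged."""
    ids, labels = token_ids.clone(), torch.full_like(token_ids, -100)
    for i in range(ids.size(0)):              # -100 is ignored by the loss
        if random.random() < mlm_prob:
            labels[i] = ids[i]
            r = random.random()
            if r < 0.8:
                ids[i] = mask_id
            elif r < 0.9:
                ids[i] = random.randrange(vocab_size)
    return ids, labels

# Toy usage on a 10-token sentence with a 1000-word vocabulary.
inputs, labels = mask_tokens(torch.randint(0, 1000, (10,)),
                             mask_id=103, vocab_size=1000)
```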
arXiv Detail & Related papers (2021-05-15T23:51:11Z)
- Ranking Creative Language Characteristics in Small Data Scenarios [52.00161818003478]
We adapt the DirectRanker to provide a new deep model for ranking creative language with small data.
Our experiments with sparse training data show that while the performance of standard neural ranking approaches collapses with small datasets, DirectRanker remains effective.
arXiv Detail & Related papers (2020-10-23T18:57:47Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Combining Deep Learning and String Kernels for the Localization of Swiss German Tweets [28.497747521078647]
We address the second subtask, which targets a data set composed of nearly 30 thousand Swiss German Jodels.
We frame the task as a double regression problem, employing a variety of machine learning approaches to predict both latitude and longitude.
Our empirical results indicate that the handcrafted model based on string kernels outperforms the deep learning approaches.
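The double-regression framing can be sketched with character n-gram features (a loose stand-in for string kernels) and a multi-output regressor; all data and model choices below are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

# Toy posts and [latitude, longitude] targets; everything here is made up.
texts = ["grüezi mitenand", "hoi zäme", "sali du"]
coords = [[47.37, 8.54], [47.05, 8.31], [47.56, 7.59]]

X = TfidfVectorizer(analyzer="char", ngram_range=(2, 4)).fit_transform(texts)
model = Ridge().fit(X, coords)       # Ridge handles 2D targets natively
print(model.predict(X[:1]))          # predicted [latitude, longitude]
```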
arXiv Detail & Related papers (2020-10-07T19:16:45Z)
- Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages [112.65994041398481]
We propose a Bayesian generative model for the space of neural parameters.
We infer the posteriors over such latent variables based on data from seen task-language combinations.
Our model yields comparable or better results than state-of-the-art zero-shot cross-lingual transfer methods.
arXiv Detail & Related papers (2020-01-30T16:58:56Z)