An Effective Incorporating Heterogeneous Knowledge Curriculum Learning
for Sequence Labeling
- URL: http://arxiv.org/abs/2402.13534v1
- Date: Wed, 21 Feb 2024 05:04:29 GMT
- Title: An Effective Incorporating Heterogeneous Knowledge Curriculum Learning
for Sequence Labeling
- Authors: Xuemei Tang and Qi Su
- Abstract summary: We propose a two-stage curriculum learning (TCL) framework specifically designed for sequence labeling tasks.
The framework enhances training by gradually introducing data instances from easy to hard, aiming to improve both performance and training speed.
- Score: 9.237399190335598
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Sequence labeling models often benefit from incorporating external knowledge.
However, this practice introduces data heterogeneity and complicates the model
with additional modules, leading to increased expenses for training a
high-performing model. To address this challenge, we propose a two-stage
curriculum learning (TCL) framework specifically designed for sequence labeling
tasks. The TCL framework enhances training by gradually introducing data
instances from easy to hard, aiming to improve both performance and training
speed. Furthermore, we explore different metrics for assessing the difficulty
levels of sequence labeling tasks. Through extensive experimentation on six
Chinese word segmentation (CWS) and part-of-speech (POS) tagging datasets, we
demonstrate the effectiveness of our model in enhancing the performance of
sequence labeling models. Additionally, our analysis indicates that TCL
accelerates training and alleviates the slow training problem associated with
complex models.
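The easy-to-hard schedule described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the difficulty metric here (sentence length in tokens) is a hypothetical stand-in for the metrics the paper explores, and the cumulative staging is a simplified version of a two-stage curriculum.

```python
def difficulty(sentence):
    """Hypothetical difficulty metric: longer sentences count as harder."""
    return len(sentence.split())

def curriculum_batches(dataset, n_stages=2):
    """Split data into stages of increasing difficulty.

    Stage 1 contains only the easiest examples; each later stage adds
    progressively harder ones on top, mimicking easy-to-hard training.
    """
    ranked = sorted(dataset, key=difficulty)
    stage_size = (len(ranked) + n_stages - 1) // n_stages
    stages, seen = [], []
    for i in range(n_stages):
        seen.extend(ranked[i * stage_size:(i + 1) * stage_size])
        stages.append(list(seen))  # cumulative: harder data is added, easy data kept
    return stages

# Toy data to show the ordering.
data = ["short one", "a somewhat longer training sentence here",
        "tiny", "a medium length example sentence"]
stages = curriculum_batches(data, n_stages=2)
```

A real training loop would run some epochs on `stages[0]` before moving to `stages[1]`, which is where the reported speedup comes from: early epochs touch only the easy subset.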
Related papers
- Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning [99.05401042153214]
In-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) and task learning (TL).
We take the first step by examining the pre-training dynamics of the emergence of ICL.
We propose a simple yet effective method to better integrate these two abilities for ICL at inference time.
arXiv Detail & Related papers (2024-06-20T06:37:47Z)
- Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning [52.046037471678005]
We focus on semi-supervised continual learning (SSCL), where the model progressively learns from partially labeled data with unknown categories.
We propose a novel approach called Dynamic Sub-Graph Distillation (DSGD) for semi-supervised continual learning.
arXiv Detail & Related papers (2023-12-27T04:40:12Z)
- Reinforcement Learning for Topic Models [3.42658286826597]
We apply reinforcement learning techniques to topic modeling by replacing the variational autoencoder in ProdLDA with a continuous action space reinforcement learning policy.
We introduce several modifications: modernize the neural network architecture, weight the ELBO loss, use contextual embeddings, and monitor the learning process via computing topic diversity and coherence.
arXiv Detail & Related papers (2023-05-08T16:41:08Z)
- Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU [19.42920238320109]
Curriculum Learning (CL) is a technique for training models by ranking examples in order of (typically increasing) difficulty.
In this work, we employ CL for Natural Language Understanding (NLU) tasks by taking advantage of training dynamics as difficulty metrics.
Experiments indicate that training dynamics can lead to better performing models with smoother training compared to other difficulty metrics.
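Using training dynamics as a difficulty metric can be sketched as below. The idea, hedged to its simplest form: record each example's loss across epochs, then treat examples whose loss stays high as hard. The loss histories are fabricated for illustration, and real work also considers variability across epochs, not just the mean.

```python
import statistics

def difficulty_from_dynamics(loss_history):
    """Summarize an example's training dynamics: mean loss and its spread."""
    return statistics.mean(loss_history), statistics.pstdev(loss_history)

def rank_easy_to_hard(histories):
    """Order example ids by mean loss: low mean = learned quickly = easy."""
    return sorted(histories, key=lambda ex: statistics.mean(histories[ex]))

# Fabricated per-epoch losses for three examples.
histories = {
    "ex1": [2.1, 1.0, 0.3],   # loss drops fast -> easy
    "ex2": [2.3, 2.2, 2.0],   # loss stays high -> hard
    "ex3": [2.0, 1.5, 1.1],   # in between
}
order = rank_easy_to_hard(histories)
```

The resulting ordering can feed directly into a curriculum schedule like the one in the main paper.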
arXiv Detail & Related papers (2022-10-22T17:10:04Z)
- Towards Sequence-Level Training for Visual Tracking [60.95799261482857]
This work introduces a sequence-level training strategy for visual tracking based on reinforcement learning.
Four representative tracking models, SiamRPN++, SiamAttn, TransT, and TrDiMP, consistently improve by incorporating the proposed methods in training.
arXiv Detail & Related papers (2022-08-11T13:15:36Z)
- Dynamic Supervisor for Cross-dataset Object Detection [52.95818230087297]
Cross-dataset training in object detection tasks is complicated because the inconsistency in the category range across datasets transforms fully supervised learning into semi-supervised learning.
We propose a dynamic supervisor framework that updates the annotations multiple times through multiple-updated submodels trained using hard and soft labels.
In the final generated annotations, both recall and precision improve significantly through the integration of hard-label training with soft-label training.
arXiv Detail & Related papers (2022-04-01T03:18:46Z)
- Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models [51.744357472072416]
We propose a method that continually identifies the weak spots of a model to generate more valuable training instances.
Experimental results show that such an adversarial training method combined with the pretraining strategy can improve both the generalization and robustness of multiple CSC models.
arXiv Detail & Related papers (2021-05-31T09:17:33Z)
- Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning [23.00300794016583]
State-of-the-art natural language understanding classification models follow a two-stage process: pre-training followed by fine-tuning.
We propose a supervised contrastive learning (SCL) objective for the fine-tuning stage.
Our proposed fine-tuning objective leads to models that are more robust to different levels of noise in the fine-tuning training data.
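A supervised contrastive objective of the kind summarized above can be sketched in a few lines. This follows the standard SupCon formulation (pull same-label embeddings together, push different-label ones apart), which may differ in detail from the paper's exact loss; the embeddings and labels below are toy values standing in for encoder outputs.

```python
import math

def scl_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss over a batch of embedding vectors."""
    n = len(embeddings)

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    per_anchor = []
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # anchors with no same-label partner contribute nothing
        # Denominator: similarity of the anchor to every other example.
        denom = sum(math.exp(dot(embeddings[i], embeddings[a]) / tau)
                    for a in range(n) if a != i)
        # Average the log-probability of each positive, per anchor.
        li = -sum(math.log(math.exp(dot(embeddings[i], embeddings[p]) / tau)
                           / denom)
                  for p in positives) / len(positives)
        per_anchor.append(li)
    return sum(per_anchor) / len(per_anchor)

# Toy batch: well-separated classes should give a lower loss than mixed labels.
emb = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
loss_clustered = scl_loss(emb, [0, 0, 1, 1], tau=0.5)
loss_mixed = scl_loss(emb, [0, 1, 0, 1], tau=0.5)
```

In fine-tuning, this term is typically added to the usual cross-entropy loss with a weighting coefficient.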
arXiv Detail & Related papers (2020-11-03T01:10:39Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
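The teacher/pseudo-label data flow described above can be sketched as follows. The "model" here is a deliberately trivial nearest-mean toy, chosen only to make the loop runnable; a real pipeline would use a segmentation network and confidence filtering on the pseudo labels.

```python
def train(examples):
    """Toy 'model': memorize the mean feature value per label."""
    sums, counts = {}, {}
    for x, y in examples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Assign the label whose mean feature is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))

# Step 1: train a teacher on the human-labeled data.
labeled = [(0.1, "a"), (0.2, "a"), (0.9, "b"), (1.0, "b")]
teacher = train(labeled)

# Step 2: generate pseudo labels on the unlabeled pool.
unlabeled = [0.15, 0.95, 0.05]
pseudo = [(x, predict(teacher, x)) for x in unlabeled]

# Step 3: train a student jointly on human-annotated and pseudo labels.
student = train(labeled + pseudo)
```

The joint step is the "digest human-annotated and pseudo labels jointly" part of the summary; iterating the loop with the student as the new teacher gives classic self-training.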
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.