Leveraging Hierarchical Taxonomies in Prompt-based Continual Learning
- URL: http://arxiv.org/abs/2410.04327v4
- Date: Sat, 08 Mar 2025 22:31:27 GMT
- Title: Leveraging Hierarchical Taxonomies in Prompt-based Continual Learning
- Authors: Quyen Tran, Hoang Phan, Minh Le, Tuan Truong, Dinh Phung, Linh Ngo, Thien Nguyen, Nhat Ho, Trung Le,
- Abstract summary: We find that applying human habits of organizing and connecting information can serve as an efficient strategy when training deep learning models.<n>We propose a novel regularization loss function that encourages models to focus more on challenging knowledge areas.
- Score: 41.13568563835089
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans perceive the world as a series of sequential events, which can be hierarchically organized with different levels of abstraction based on conceptual knowledge. Drawing inspiration from human learning behaviors, this work proposes a novel approach to mitigate catastrophic forgetting in Prompt-based Continual Learning models by exploiting the relationships between continuously emerging class data. We find that applying human habits of organizing and connecting information can serve as an efficient strategy when training deep learning models. Specifically, by building a hierarchical tree structure based on the expanding set of labels, we gain fresh insights into the data, identifying groups of similar classes could easily cause confusion. Additionally, we delve deeper into the hidden connections between classes by exploring the original pretrained model's behavior through an optimal transport-based approach. From these insights, we propose a novel regularization loss function that encourages models to focus more on challenging knowledge areas, thereby enhancing overall performance. Experimentally, our method demonstrated significant superiority over the most robust state-of-the-art models on various benchmarks.
Related papers
- Knowledge Graph Enhanced Generative Multi-modal Models for Class-Incremental Learning [51.0864247376786]
We introduce a Knowledge Graph Enhanced Generative Multi-modal model (KG-GMM) that builds an evolving knowledge graph throughout the learning process.
During testing, we propose a Knowledge Graph Augmented Inference method that locates specific categories by analyzing relationships within the generated text.
arXiv Detail & Related papers (2025-03-24T07:20:43Z) - Hypernym Bias: Unraveling Deep Classifier Training Dynamics through the Lens of Class Hierarchy [44.99833362998488]
We argue that the learning process in classification problems can be understood through the lens of label clustering.
Specifically, we observe that networks tend to distinguish higher-level (hypernym) categories in the early stages of training.
We introduce a novel framework to track the evolution of the feature manifold during training, revealing how the hierarchy of class relations emerges.
arXiv Detail & Related papers (2025-02-17T18:47:01Z) - Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z) - Preserving Generalization of Language models in Few-shot Continual Relation Extraction [34.68364639170838]
Few-shot Continual Relations Extraction (FCRE) is an emerging and dynamic area of study.
We introduce a novel method that leverages often-discarded language model heads.
Our experimental results underscore the efficacy of the proposed method and offer valuable insights for future work.
arXiv Detail & Related papers (2024-10-01T02:22:34Z) - Reinforcement Learning Based Multi-modal Feature Fusion Network for
Novel Class Discovery [47.28191501836041]
In this paper, we employ a Reinforcement Learning framework to simulate the cognitive processes of humans.
We also deploy a Member-to-Leader Multi-Agent framework to extract and fuse features from multi-modal information.
We demonstrate the performance of our approach in both the 3D and 2D domains by employing the OS-MN40, OS-MN40-Miss, and Cifar10 datasets.
arXiv Detail & Related papers (2023-08-26T07:55:32Z) - Class-relation Knowledge Distillation for Novel Class Discovery [16.461242381109276]
Key challenge lies in transferring the knowledge in the known-class data to the learning of novel classes.
We introduce a class relation representation for the novel classes based on the predicted class distribution of a model trained on known classes.
We propose a novel knowledge distillation framework, which utilizes our class-relation representation to regularize the learning of novel classes.
arXiv Detail & Related papers (2023-07-18T11:35:57Z) - Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms.
arXiv Detail & Related papers (2022-11-14T19:53:15Z) - Mitigating Forgetting in Online Continual Learning via Contrasting
Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one.
Main challenge comes from the "catastrophic forgetting" issue -- the inability to well remember the learnt knowledge while learning the new ones.
arXiv Detail & Related papers (2022-11-10T05:29:43Z) - Reinforcement Learning based Path Exploration for Sequential Explainable
Recommendation [57.67616822888859]
We propose a novel Temporal Meta-path Guided Explainable Recommendation leveraging Reinforcement Learning (TMER-RL)
TMER-RL utilizes reinforcement item-item path modelling between consecutive items with attention mechanisms to sequentially model dynamic user-item evolutions on dynamic knowledge graph for explainable recommendation.
Extensive evaluations of TMER on two real-world datasets show state-of-the-art performance compared against recent strong baselines.
arXiv Detail & Related papers (2021-11-24T04:34:26Z) - Model-based Meta Reinforcement Learning using Graph Structured Surrogate
Models [40.08137765886609]
We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics.
Our approach is able to obtain high returns, while allowing fast execution during deployment by avoiding test time policy gradient optimization.
arXiv Detail & Related papers (2021-02-16T17:21:55Z) - Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
arXiv Detail & Related papers (2020-10-27T13:17:18Z) - Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z) - Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into subspace, we show that our method can address the large-scale and out-of-sample problem.
arXiv Detail & Related papers (2020-07-11T10:57:45Z) - Learning From Multiple Experts: Self-paced Knowledge Distillation for
Long-tailed Classification [106.08067870620218]
We propose a self-paced knowledge distillation framework, termed Learning From Multiple Experts (LFME)
We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.
We conduct extensive experiments and demonstrate that our method is able to achieve superior performances compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-06T12:57:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.