CLASS: Enhancing Cross-Modal Text-Molecule Retrieval Performance and Training Efficiency
- URL: http://arxiv.org/abs/2502.11633v1
- Date: Mon, 17 Feb 2025 10:24:07 GMT
- Title: CLASS: Enhancing Cross-Modal Text-Molecule Retrieval Performance and Training Efficiency
- Authors: Hongyan Wu, Peijian Zeng, Weixiong Zheng, Lianxi Wang, Nankai Lin, Shengyi Jiang, Aimin Yang
- Abstract summary: The cross-modal text-molecule retrieval task bridges molecule structures and natural language descriptions.
Existing methods predominantly focus on aligning the text and molecule modalities, yet they overlook adaptively adjusting the learning states at different training stages.
This paper proposes a Curriculum Learning-bAsed croSS-modal text-molecule training framework (CLASS), which can be integrated with any backbone to yield promising performance improvements.
- Score: 7.2360149365370345
- Abstract: The cross-modal text-molecule retrieval task bridges molecule structures and natural language descriptions. Existing methods predominantly focus on aligning the text and molecule modalities, yet they overlook adaptively adjusting the learning states at different training stages and enhancing training efficiency. To tackle these challenges, this paper proposes a Curriculum Learning-bAsed croSS-modal text-molecule training framework (CLASS), which can be integrated with any backbone to yield promising performance improvements. Specifically, we quantify sample difficulty considering both the text and molecule modalities, and design a sample scheduler that introduces training samples via an easy-to-difficult paradigm as training advances, remarkably reducing the number of training samples at the early stages and improving training efficiency. Moreover, we introduce adaptive intensity learning, which adaptively increases the training intensity across all curriculum stages as training progresses. Experimental results on the ChEBI-20 dataset demonstrate that our proposed method achieves superior performance while yielding prominent time savings.
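The easy-to-difficult sample scheduling described in the abstract can be sketched in a few lines. This is a minimal illustration under assumed design choices (a precomputed per-sample difficulty score and a linear pacing schedule); the function names and the pacing function are hypothetical, not taken from the paper.

```python
def pacing_fraction(step: int, total_steps: int, start: float = 0.2) -> float:
    """Fraction of the difficulty-sorted training set admitted at `step`.

    Linear easy-to-difficult pacing (an assumed schedule): begin with the
    easiest `start` fraction and grow to the full set by the end of training.
    """
    frac = start + (1.0 - start) * step / max(total_steps, 1)
    return min(frac, 1.0)


def curriculum_subset(samples, difficulties, step, total_steps):
    """Return the samples admitted at this training step, easiest first."""
    order = sorted(range(len(samples)), key=lambda i: difficulties[i])
    k = max(1, int(pacing_fraction(step, total_steps) * len(samples)))
    return [samples[i] for i in order[:k]]
```

Because only a small, easy subset is trained on early, most of the per-epoch cost at the start of training is avoided, which is the source of the efficiency gain the abstract claims.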
Related papers
- Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining [55.262510814326035]
Existing reweighting strategies primarily focus on group-level data importance.
We introduce novel algorithms for dynamic, instance-level data reweighting.
Our framework allows us to devise reweighting strategies deprioritizing redundant or uninformative data.
arXiv Detail & Related papers (2025-02-10T17:57:15Z)
- On the Effectiveness of Incremental Training of Large Language Models [10.39475177812483]
We investigate the effectiveness of incremental training for large language models.
We find that incremental layer-wise training may not be a viable alternative for training large language models.
arXiv Detail & Related papers (2024-11-27T19:11:49Z)
- EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training [79.96741042766524]
We reformulate the training curriculum as a soft-selection function.
We show that gradually exposing the contents of natural images can be readily achieved by modulating the intensity of data augmentation.
The resulting method, EfficientTrain++, is simple, general, yet surprisingly effective.
arXiv Detail & Related papers (2024-05-14T17:00:43Z)
- Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z)
- Advancing NLP Models with Strategic Text Augmentation: A Comprehensive Study of Augmentation Methods and Curriculum Strategies [0.0]
This study conducts a thorough evaluation of text augmentation techniques across a variety of datasets and natural language processing (NLP) tasks.
It examines the effectiveness of these techniques in augmenting training sets to improve performance in tasks such as topic classification, sentiment analysis, and offensive language detection.
arXiv Detail & Related papers (2024-02-14T12:41:09Z)
- Improving Imbalanced Text Classification with Dynamic Curriculum Learning [32.731900584216724]
We propose a novel self-paced dynamic curriculum learning method for imbalanced text classification.
Our SPDCL reorders and resamples training data by a difficulty criterion with an adaptive easy-to-hard pace.
Experiments on several classification tasks show the effectiveness of the SPDCL strategy, especially on imbalanced datasets.
arXiv Detail & Related papers (2022-10-25T07:57:59Z)
- Effective Vision Transformer Training: A Data-Centric Perspective [24.02488085447691]
Vision Transformers (ViTs) have shown promising performance compared with Convolutional Neural Networks (CNNs).
In this paper, we define several metrics, including Dynamic Data Proportion (DDP) and Knowledge Assimilation Rate (KAR).
We propose a novel data-centric ViT training framework to dynamically measure the "difficulty" of training samples and generate "effective" samples for models at different training stages.
arXiv Detail & Related papers (2022-09-29T17:59:46Z)
- Perceiving the World: Question-guided Reinforcement Learning for Text-based Games [64.11746320061965]
This paper introduces world-perceiving modules, which automatically decompose tasks and prune actions by answering questions about the environment.
We then propose a two-phase training framework to decouple language learning from reinforcement learning, which further improves the sample efficiency.
arXiv Detail & Related papers (2022-03-20T04:23:57Z)
- Friendly Training: Neural Networks Can Adapt Data To Make Learning Easier [23.886422706697882]
We propose a novel training procedure named Friendly Training.
We show that Friendly Training yields improvements with respect to informed data sub-selection and random selection.
Results suggest that adapting the input data is a feasible way to stabilize learning and improve the generalization skills of the network.
arXiv Detail & Related papers (2021-06-21T10:50:34Z)
- Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
arXiv Detail & Related papers (2021-01-20T13:59:25Z)
- Self-Paced Learning for Neural Machine Translation [55.41314278859938]
We propose self-paced learning for neural machine translation (NMT) training.
We show that the proposed model yields better performance than strong baselines.
arXiv Detail & Related papers (2020-10-09T11:33:16Z)
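Several of the papers listed above (the self-paced NMT entry and the dynamic-curriculum entries) share the classic self-paced learning scheme: admit a sample only when its current loss falls below a threshold, and relax that threshold as training proceeds so harder samples enter later. A minimal sketch, assuming binary self-paced weights and multiplicative threshold growth (both standard choices in the self-paced learning literature, not details of any specific listed paper):

```python
def self_paced_weights(losses, lam):
    """Binary self-paced weights: admit samples whose loss is below lam."""
    return [1.0 if loss < lam else 0.0 for loss in losses]


def self_paced_step(losses, lam, growth=1.5):
    """One curriculum step: weight the samples, average the admitted
    losses, then grow the threshold so harder samples enter later."""
    weights = self_paced_weights(losses, lam)
    admitted = [loss for loss, w in zip(losses, weights) if w > 0.0]
    avg_loss = sum(admitted) / len(admitted) if admitted else 0.0
    return weights, avg_loss, lam * growth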
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.