Meta learning with language models: Challenges and opportunities in the
classification of imbalanced text
- URL: http://arxiv.org/abs/2310.15019v2
- Date: Tue, 24 Oct 2023 15:15:38 GMT
- Title: Meta learning with language models: Challenges and opportunities in the
classification of imbalanced text
- Authors: Apostol Vassilev and Honglan Jin and Munawar Hasan
- Abstract summary: We propose a meta learning technique (MLT) that combines individual models built with different text representations.
We analytically show that the resulting technique is numerically stable and produces reasonable combining weights.
We also provide computational results to show the statistically significant advantages of the proposed MLT approach.
- Score: 0.8663897798518103
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Detecting out of policy speech (OOPS) content is important but difficult.
While machine learning is a powerful tool to tackle this challenging task, it
is hard to break the performance ceiling due to factors like quantity and
quality limitations on training data and inconsistencies in OOPS definition and
data labeling. To realize the full potential of available limited resources, we
propose a meta learning technique (MLT) that combines individual models built
with different text representations. We analytically show that the resulting
technique is numerically stable and produces reasonable combining weights. We
combine the MLT with a threshold-moving (TM) technique to further improve the
performance of the combined predictor on highly-imbalanced in-distribution and
out-of-distribution datasets. We also provide computational results to show the
statistically significant advantages of the proposed MLT approach.
All authors contributed equally to this work.
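The abstract describes the method only at a high level, so the following is a minimal, illustrative sketch of the two ingredients it names: a convex weighted combination of per-model probabilities (MLT) and threshold moving (TM) for imbalanced data. This is not the paper's actual algorithm; the analytically derived combining weights are not reproduced here, and all function names, weights, and data below are hypothetical.

```python
# Minimal sketch (not the paper's exact algorithm): combine the positive-class
# probabilities of several text classifiers with nonnegative weights that sum
# to one, then move the decision threshold to suit an imbalanced dataset.
import numpy as np

def combine_predictions(probs, weights):
    """Weighted average of per-model positive-class probabilities.

    probs:   array of shape (n_models, n_samples)
    weights: array of shape (n_models,), made nonnegative and normalized here
    """
    weights = np.clip(np.asarray(weights, dtype=float), 0.0, None)
    weights = weights / weights.sum()          # keep the combination convex
    return weights @ np.asarray(probs)         # shape (n_samples,)

def pick_threshold(p_combined, y_true, grid=np.linspace(0.05, 0.95, 19)):
    """Threshold moving: choose the cutoff that maximizes F1 on a validation set."""
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        y_hat = (p_combined >= t).astype(int)
        tp = np.sum((y_hat == 1) & (y_true == 1))
        fp = np.sum((y_hat == 1) & (y_true == 0))
        fn = np.sum((y_hat == 0) & (y_true == 1))
        f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t

# Hypothetical usage: two models built on different text representations.
p_models = np.array([[0.9, 0.2, 0.6, 0.1],    # e.g., a transformer-based classifier
                     [0.8, 0.4, 0.5, 0.2]])   # e.g., a bag-of-words classifier
y_val = np.array([1, 0, 1, 0])
p = combine_predictions(p_models, weights=[0.6, 0.4])
t = pick_threshold(p, y_val)
y_pred = (p >= t).astype(int)
```

In the paper the combining weights come from an analytical derivation shown to be numerically stable; the sketch above simply normalizes user-supplied weights and tunes the cutoff by F1 on a validation split.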
Related papers
- How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
arXiv Detail & Related papers (2024-10-04T13:39:21Z) - Semantic Meta-Split Learning: A TinyML Scheme for Few-Shot Wireless Image Classification [50.28867343337997]
This work presents a TinyML-based semantic communication framework for few-shot wireless image classification.
We exploit split learning to limit the computations performed by the end users while preserving privacy.
Meta-learning overcomes data availability concerns and speeds up training by utilizing similarly trained tasks.
arXiv Detail & Related papers (2024-09-03T05:56:55Z) - Text Serialization and Their Relationship with the Conventional Paradigms of Tabular Machine Learning [0.0]
This study explores how Language Models (LMs) can be used for feature representation and prediction in machine learning tasks.
Our study assesses how emerging LM technologies compare with traditional paradigms in tabular machine learning.
Our findings reveal that current pre-trained models should not replace conventional approaches.
arXiv Detail & Related papers (2024-06-19T21:19:37Z) - MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose Model Exclusive Task Arithmetic for merging GPT-scale models.
Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs (a generic task-arithmetic sketch appears after this list).
arXiv Detail & Related papers (2024-06-17T10:12:45Z) - Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling [0.0]
This paper presents a novel hybrid approach that synergizes unsupervised and supervised learning to improve the accuracy of NLP task modeling.
Our methodology integrates an unsupervised module that learns representations from unlabeled corpora and a supervised module that leverages these representations to enhance task-specific models.
By synergizing techniques, our hybrid approach achieves SOTA results on benchmark datasets, paving the way for more data-efficient and robust NLP systems.
arXiv Detail & Related papers (2024-06-03T08:31:35Z) - CLAIM Your Data: Enhancing Imputation Accuracy with Contextual Large Language Models [0.18416014644193068]
This paper introduces the Contextual Language model for Accurate Imputation Method (CLAIM).
Unlike traditional imputation methods, CLAIM utilizes contextually relevant natural language descriptors to fill missing values.
Our evaluations across diverse datasets and missingness patterns reveal CLAIM's superior performance over existing imputation techniques.
arXiv Detail & Related papers (2024-05-28T00:08:29Z) - LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [55.73370804397226]
Quantization, a key compression technique, can effectively mitigate the computational and memory demands of large language models by compressing and accelerating them.
We present LLMC, a plug-and-play compression toolkit, to fairly and systematically explore the impact of quantization.
Powered by this versatile toolkit, our benchmark covers three key aspects: calibration data, algorithms (three strategies), and data formats.
arXiv Detail & Related papers (2024-05-09T11:49:05Z) - LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named
Entity Recognition [67.96794382040547]
LLM-DA is a novel data augmentation technique based on large language models (LLMs) for the few-shot NER task.
Our approach involves employing 14 contextual rewriting strategies, designing entity replacements of the same type, and incorporating noise injection to enhance robustness.
arXiv Detail & Related papers (2024-02-22T14:19:56Z) - Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MUST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z) - Mixture of Soft Prompts for Controllable Data Generation [21.84489422361048]
Mixture of Soft Prompts (MSP) is proposed as a tool for data augmentation rather than direct prediction.
Our method achieves state-of-the-art results on three benchmarks when compared against strong baselines.
arXiv Detail & Related papers (2023-03-02T21:13:56Z) - Exploring Dimensionality Reduction Techniques in Multilingual
Transformers [64.78260098263489]
This paper gives a comprehensive account of the impact of dimensional reduction techniques on the performance of state-of-the-art multilingual Siamese Transformers.
It shows that it is possible to achieve an average reduction in the number of dimensions of $91.58\% \pm 2.59\%$ and $54.65\% \pm 32.20\%$, respectively.
arXiv Detail & Related papers (2022-04-18T17:20:55Z)