Few-shot learning approaches for classifying low resource domain specific software requirements
- URL: http://arxiv.org/abs/2302.06951v1
- Date: Tue, 14 Feb 2023 10:19:23 GMT
- Title: Few-shot learning approaches for classifying low resource domain specific software requirements
- Authors: Anmol Nayak, Hari Prasad Timmapathini, Vidhya Murali, Atul Anil Gohad
- Abstract summary: Few-shot learning is an approach for training deep learning models with only a few annotated samples.
Our experiments focus on classifying BOSCH automotive domain textual software requirements into 3 categories.
While SciBERT and DeBERTa based models tend to be the most accurate at 15 training samples, their performance improvement scales minimally as the number of annotated samples is increased to 50 in comparison to Siamese and T5 based models.
- Score: 1.1470070927586016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the advent of strong pre-trained natural language processing models such as
BERT, DeBERTa, MiniLM, and T5, the amount of data industries need to fine-tune
these models for their niche use cases has dropped drastically (typically to a
few hundred annotated samples for reasonable performance). However,
the availability of even a few hundred annotated samples may not always be
guaranteed in low resource domains like automotive, which often limits the
usage of such deep learning models in an industrial setting. In this paper we
aim to address the challenge of fine-tuning such pre-trained models with only a
few annotated samples, also known as Few-shot learning. Our experiments focus
on evaluating the performance of a diverse set of algorithms and methodologies
to achieve the task of classifying BOSCH automotive domain textual software
requirements into 3 categories, while utilizing only 15 annotated samples per
category for fine-tuning. We find that while SciBERT and DeBERTa based models
tend to be the most accurate at 15 training samples, their performance
improvement scales minimally as the number of annotated samples is increased to
50 in comparison to Siamese and T5 based models.
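To make the setup concrete, the sketch below shows what few-shot fine-tuning of this kind typically looks like with the Hugging Face transformers library. The checkpoint, the category names, and the example requirements are placeholders (the BOSCH data is not public), so this is an illustrative baseline rather than the authors' exact pipeline.

```python
# Minimal sketch: few-shot fine-tuning of a pre-trained encoder for
# 3-way requirement classification (the paper uses 15 annotated samples
# per category). Texts and label names below are placeholders.
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "allenai/scibert_scivocab_uncased"  # any BERT-family checkpoint works
LABELS = ["functional", "non-functional", "constraint"]  # hypothetical category names

train_texts = ["The ECU shall report wheel speed every 10 ms.",
               "Logging should not degrade response time noticeably."]
train_labels = [0, 1]  # indices into LABELS; 15 samples per class in the paper's setup

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=len(LABELS))

class ReqDataset(Dataset):
    """Wraps tokenized requirement texts and their label indices."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=128)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

args = TrainingArguments(output_dir="out", num_train_epochs=20,
                         per_device_train_batch_size=8, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=ReqDataset(train_texts, train_labels)).train()
```

With only 15 examples per class, results of such a run are sensitive to seeds and regularization, which is part of why the paper also compares Siamese and T5-based formulations.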
Related papers
- LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content [62.816876067499415]
We propose LiveXiv: a scalable evolving live benchmark based on scientific ArXiv papers.
LiveXiv accesses domain-specific manuscripts at any given timestamp and proposes to automatically generate visual question-answer pairs.
We benchmark multiple open and proprietary Large Multi-modal Models (LMMs) on the first version of our benchmark, showing its challenging nature and exposing the models' true abilities.
arXiv Detail & Related papers (2024-10-14T17:51:23Z)
- It's all about PR -- Smart Benchmarking AI Accelerators using Performance Representatives [40.197673152937256]
Training statistical performance models often requires vast amounts of data, leading to significant time investment, and can be difficult when hardware availability is limited.
We propose a novel performance modeling methodology that significantly reduces the number of training samples while maintaining good accuracy.
We achieve a Mean Absolute Percentage Error (MAPE) as low as 0.02% for single-layer estimations and 0.68% for whole-model estimations with fewer than 10,000 training samples.
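For context, MAPE is simply the mean of |prediction - measurement| / |measurement|, reported in percent; a minimal helper (the names and numbers below are illustrative) looks like:

```python
import numpy as np

def mape(y_true, y_pred) -> float:
    """Mean Absolute Percentage Error in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0)

# Example: latency estimates (ms) for a few layers vs. measured values.
print(mape([1.20, 0.85, 2.10], [1.21, 0.85, 2.08]))  # roughly 0.6 %
```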
arXiv Detail & Related papers (2024-06-12T15:34:28Z)
- No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance [68.18779562801762]
Multimodal models require exponentially more data to achieve linear improvements in downstream "zero-shot" performance.
Our study reveals an exponential need for training data which implies that the key to "zero-shot" generalization capabilities under large-scale training paradigms remains to be found.
arXiv Detail & Related papers (2024-04-04T17:58:02Z)
- Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
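The retrieval-augmentation idea can be sketched generically: embed the query, fetch nearest neighbours from an external source, and blend their evidence with the frozen model's own scores. The snippet below uses a random in-memory index purely as a stand-in for the search engine, so it illustrates the mechanism rather than the paper's system.

```python
import numpy as np

# Stand-in for an external knowledge source: a matrix of pre-computed
# embeddings with class labels. In the paper this role is played by a web
# search engine; here it is a random in-memory index for illustration.
rng = np.random.default_rng(0)
index_embeddings = rng.standard_normal((1000, 512)).astype(np.float32)
index_labels = rng.integers(0, 10, size=1000)

def retrieve(query_emb: np.ndarray, k: int = 5) -> np.ndarray:
    """Labels of the k nearest neighbours by cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb)
    db = index_embeddings / np.linalg.norm(index_embeddings, axis=1, keepdims=True)
    return index_labels[np.argsort(-(db @ q))[:k]]

def augmented_prediction(query_emb: np.ndarray, zero_shot_logits: np.ndarray,
                         weight: float = 0.5) -> int:
    """Blend the frozen model's zero-shot probabilities with a retrieval vote."""
    probs = np.exp(zero_shot_logits - zero_shot_logits.max())
    probs /= probs.sum()
    votes = np.bincount(retrieve(query_emb), minlength=len(probs)).astype(float)
    scores = (1 - weight) * probs + weight * votes / votes.sum()
    return int(np.argmax(scores))

print(augmented_prediction(rng.standard_normal(512), rng.standard_normal(10)))
```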
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
- LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms [2.249916681499244]
We finetune open-source MPT-7B and MPT-30B models on instruction finetuning datasets of various sizes ranging from 1k to 60k samples.
We find that subsets of 1k-6k instruction finetuning samples are sufficient to achieve good performance on both (1) traditional NLP benchmarks and (2) model-based evaluation.
arXiv Detail & Related papers (2023-11-22T03:37:01Z)
- Robust Fine-Tuning of Vision-Language Models for Domain Generalization [6.7181844004432385]
Foundation models have impressive zero-shot inference capabilities and robustness under distribution shifts.
We present a new recipe for few-shot fine-tuning of the popular vision-language foundation model CLIP.
Our experimentation demonstrates that, while zero-shot CLIP fails to match the performance of trained vision models on more complex benchmarks, few-shot CLIP fine-tuning outperforms its vision-only counterparts.
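A rough illustration of few-shot CLIP adaptation (not necessarily the paper's recipe, which the abstract does not spell out) is to freeze the image encoder and fit a light classifier on a handful of labelled images, for example with the Hugging Face CLIPModel and scikit-learn; the image paths below are placeholders.

```python
# Sketch: linear probe on frozen CLIP image features from a few labelled examples.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor
from sklearn.linear_model import LogisticRegression

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed(paths):
    """Encode images with the frozen CLIP image tower."""
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return torch.nn.functional.normalize(feats, dim=-1).numpy()

# few_shot_paths / few_shot_labels: a handful of labelled examples per class.
few_shot_paths, few_shot_labels = ["img_001.jpg", "img_002.jpg"], [0, 1]
clf = LogisticRegression(max_iter=1000).fit(embed(few_shot_paths), few_shot_labels)
print(clf.predict(embed(["query.jpg"])))
```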
arXiv Detail & Related papers (2023-11-03T20:50:40Z)
- Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases [55.45617404586874]
We propose a few-shot instruction-based method for prompting pre-trained language models (LMs) to detect social biases.
We show that large LMs can detect different types of fine-grained biases with similar and sometimes superior accuracy to fine-tuned models.
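At its simplest, few-shot instruction prompting is just string construction around a handful of labelled demonstrations; the template, demonstrations, and the small stand-in model below are illustrative, not the paper's exact setup.

```python
from transformers import pipeline

# A few labelled demonstrations; texts and labels here are illustrative only.
demos = [
    ("Sentence expressing a stereotype about a group ...", "biased"),
    ("A neutral factual statement.", "not biased"),
]

def build_prompt(query: str) -> str:
    """Few-shot instruction prompt: instruction, demonstrations, then the query."""
    lines = ["Decide whether each sentence expresses a social bias."]
    for text, label in demos:
        lines.append(f"Sentence: {text}\nAnswer: {label}")
    lines.append(f"Sentence: {query}\nAnswer:")
    return "\n\n".join(lines)

generator = pipeline("text-generation", model="gpt2")  # stand-in for a large LM
out = generator(build_prompt("Some sentence to check."), max_new_tokens=3)
print(out[0]["generated_text"].rsplit("Answer:", 1)[-1].strip())
```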
arXiv Detail & Related papers (2021-12-15T04:19:52Z)
- Low-Shot Validation: Active Importance Sampling for Estimating Classifier Performance on Rare Categories [47.050853657721596]
For machine learning models trained with limited labeled training data, validation stands to become the main bottleneck to reducing overall annotation costs.
We propose a statistical validation algorithm that accurately estimates the F-score of binary classifiers for rare categories.
In particular, we can estimate model F1 scores with a variance of 0.005 using as few as 100 labels.
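One way to realize this idea, sketched below without claiming to match the authors' estimator, is to draw the labelling budget from a proposal distribution biased toward likely positives and to correct the counts with importance weights before computing F1.

```python
import numpy as np

def estimate_f1(scores, predictions, label_fn, budget=100, rng=None):
    """Importance-sampled F1 estimate for a binary classifier.

    scores      : model confidence that each pool item is positive (length N)
    predictions : hard 0/1 predictions for each pool item (length N)
    label_fn    : oracle returning the true 0/1 label of a pool index
    budget      : number of labels we can afford to collect
    """
    rng = rng or np.random.default_rng(0)
    scores = np.asarray(scores, dtype=float)
    predictions = np.asarray(predictions)

    # Proposal: favour items that look positive, with a floor so every
    # item has non-zero sampling probability (keeps the estimate unbiased).
    q = scores + predictions + 0.05
    q = q / q.sum()

    idx = rng.choice(len(scores), size=budget, replace=True, p=q)
    w = 1.0 / (budget * q[idx])  # importance weights for pool-sum estimates

    y = np.array([label_fn(i) for i in idx])
    p = predictions[idx]
    tp = np.sum(w * (p == 1) * (y == 1))
    fp = np.sum(w * (p == 1) * (y == 0))
    fn = np.sum(w * (p == 0) * (y == 1))
    return tp / (tp + 0.5 * (fp + fn) + 1e-12)
```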
arXiv Detail & Related papers (2021-09-13T06:01:16Z)
- SE3M: A Model for Software Effort Estimation Using Pre-trained Embedding Models [0.8287206589886881]
This paper proposes to evaluate the effectiveness of pre-trained embedding models.
Generic pre-trained models for both approaches went through a fine-tuning process.
Results were very promising, showing that pre-trained models can be used to estimate software effort based only on requirements texts.
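The general recipe, embedding the requirement text with a pre-trained model and regressing effort on the embedding, can be sketched with off-the-shelf components; the encoder, texts, and effort values below are placeholders rather than the paper's models and corpus.

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import Ridge

# Placeholder requirement texts and effort values (e.g. person-hours).
texts = ["User shall be able to reset the password via email.",
         "System shall export monthly reports as PDF."]
efforts = [8.0, 20.0]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any pre-trained text embedder
X = encoder.encode(texts)
regressor = Ridge(alpha=1.0).fit(X, efforts)

print(regressor.predict(encoder.encode(["Admin shall deactivate user accounts."])))
```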
arXiv Detail & Related papers (2020-06-30T14:15:38Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, using only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
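Stripped to its essentials, uncertainty-aware self-training keeps only the pseudo-labels the model is least uncertain about, for instance by measuring prediction variance under Monte-Carlo dropout. The sketch below assumes a generic classifier with an HF-style `.logits` output and is not the authors' exact method.

```python
import torch

def mc_dropout_confidence(model, batch, passes: int = 10):
    """Mean softmax probabilities and their variance under MC dropout.

    `model` is any classifier with dropout layers; `batch` is a dict of tensors
    it accepts (names are generic, not a specific library's API).
    """
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(**batch).logits, dim=-1)
                             for _ in range(passes)])
    return probs.mean(0), probs.var(0)

def select_pseudo_labels(model, unlabeled_batches, var_threshold: float = 1e-3):
    """Keep pseudo-labels whose predictive variance is below a threshold."""
    selected = []
    for batch in unlabeled_batches:
        mean_p, var_p = mc_dropout_confidence(model, batch)
        labels = mean_p.argmax(dim=-1)
        keep = var_p.gather(-1, labels.unsqueeze(-1)).squeeze(-1) < var_threshold
        selected.append((batch, labels, keep))
    return selected

# The kept (text, pseudo-label) pairs are added to the training set, the
# classifier is re-trained, and the loop repeats for a few rounds.
```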
arXiv Detail & Related papers (2020-06-27T08:13:58Z)