TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks
- URL: http://arxiv.org/abs/2409.20189v1
- Date: Mon, 30 Sep 2024 11:04:56 GMT
- Title: TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks
- Authors: Areeg Fahad Rasheed, M. Zarkoosh, Safa F. Abbas, Sana Sabah Al-Azzawi
- Abstract summary: This paper addresses the challenge of classifying and assigning programming tasks to experts.
A novel dataset containing a total of 4,112 programming tasks was created by extracting tasks from various websites.
Web scraping techniques were employed to collect this dataset of programming problems systematically.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the challenge of classifying and assigning programming tasks to experts, a process that typically requires significant effort, time, and cost. To tackle this issue, a novel dataset containing a total of 4,112 programming tasks was created by extracting tasks from various websites. Web scraping techniques were employed to collect this dataset of programming problems systematically. Specific HTML tags were tracked to extract key elements of each task, including the title, problem description, input-output specification, examples, problem class, and complexity score. Examples from the dataset are provided in the appendix to illustrate the variety and complexity of the tasks included. The dataset's effectiveness has been evaluated and benchmarked using two approaches: the first involved fine-tuning the FLAN-T5 small model on the dataset, while the second used in-context learning (ICL) with GPT-4o mini. Performance was assessed using standard metrics: accuracy, recall, precision, and F1-score. The results indicated that in-context learning with GPT-4o mini outperformed the fine-tuned FLAN-T5 model.
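The abstract names the extraction targets but not the scraping code itself; below is a minimal sketch of the described extraction step, assuming a BeautifulSoup-style pipeline. All CSS selectors are hypothetical placeholders, since the source websites and the exact tags tracked are not specified in the abstract.

```python
# Minimal sketch of the HTML extraction described in the abstract.
# All selectors below are hypothetical placeholders: the abstract does not
# name the source websites or the exact tags that were tracked.
import requests
from bs4 import BeautifulSoup

def scrape_task(url: str) -> dict:
    """Fetch one problem page and extract the fields listed in the paper."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    def text_of(selector: str) -> str:
        node = soup.select_one(selector)
        return node.get_text(" ", strip=True) if node else ""

    return {
        "title": text_of("h1.problem-title"),
        "description": text_of("div.problem-statement"),
        "input_output": text_of("div.io-spec"),
        "examples": text_of("div.sample-tests"),
        "problem_class": text_of("span.problem-class"),
        "complexity_score": text_of("span.difficulty"),
    }
```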
Related papers
- Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information [11.545160697026514]
We propose a metric of task relatedness based on task difficulty, as measured by pointwise V-usable information (PVI); a sketch of the PVI computation follows this entry.
We conduct experiments to evaluate the feasibility of this metric for task grouping on 15 NLP datasets in the general, biomedical, and clinical domains.
arXiv Detail & Related papers (2024-10-16T17:49:45Z)
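As a reference for the metric named above, pointwise V-usable information (Ethayarajh et al., 2022) compares two finetuned models: one that sees the input and one that sees only a null input. A minimal sketch, where both log-probability arguments are placeholders for whatever model family is chosen:

```python
# Sketch of the PVI computation referenced above.
# `logp_input` and `logp_null` are log2-probabilities of the gold label under
# a model finetuned with inputs and one finetuned on null inputs, respectively.
def pvi(logp_null: float, logp_input: float) -> float:
    """PVI(x -> y): how much easier the input x makes predicting the label y."""
    return logp_input - logp_null

def avg_pvi(pairs: list[tuple[float, float]]) -> float:
    """Dataset-level difficulty: a lower average PVI means a harder task."""
    return sum(pvi(n, i) for n, i in pairs) / len(pairs)
```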
- CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation [51.2289822267563]
We propose Corpus Retrieval and Augmentation for Fine-Tuning (CRAFT), a method for generating synthetic datasets.
We use large-scale public web-crawled corpora and similarity-based document retrieval to find other relevant human-written documents (a sketch of the retrieval step follows this entry).
We demonstrate that CRAFT can efficiently generate large-scale task-specific training datasets for four diverse tasks.
arXiv Detail & Related papers (2024-09-03T17:54:40Z)
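The retrieval step above is described only at a high level; here is a minimal sketch using sentence-transformers as a stand-in embedding model. The model name and corpus handling are assumptions, not CRAFT's actual setup.

```python
# Sketch of similarity-based document retrieval for CRAFT-style augmentation.
# The embedding model is a placeholder; the paper's exact setup may differ.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def retrieve(seed_examples: list[str], corpus: list[str], k: int = 100) -> list[str]:
    """Return the k corpus documents most similar to any seed example."""
    seed_emb = model.encode(seed_examples, convert_to_tensor=True)
    corpus_emb = model.encode(corpus, convert_to_tensor=True)
    # Score every document against every seed; keep each document's best score.
    scores = util.cos_sim(seed_emb, corpus_emb).max(dim=0).values
    top = scores.topk(min(k, len(corpus))).indices
    return [corpus[int(i)] for i in top]
```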
- Limits of Transformer Language Models on Learning to Compose Algorithms [77.2443883991608]
We evaluate training LLaMA models, and prompting GPT-4 and Gemini, on four tasks that require learning a composition of several discrete sub-tasks.
Our results indicate that compositional learning in state-of-the-art Transformer language models is highly sample inefficient.
arXiv Detail & Related papers (2024-02-08T16:23:29Z)
- TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data [73.29220562541204]
We consider harnessing the power of large language models (LLMs) to solve this task.
We develop the TAT-LLM language model by fine-tuning LLaMA 2 on training data generated automatically from existing expert-annotated datasets.
arXiv Detail & Related papers (2024-01-24T04:28:50Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, Data-CUBE, that arranges the order of all multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training (a sketch of this step follows the entry).
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
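As flagged in the entry above, a minimal sketch of the instance-level step; the difficulty function is a placeholder, since Data-CUBE defines its own instance-difficulty measure.

```python
# Sketch of easy-to-difficult mini-batching; `difficulty` is a placeholder
# scoring function (e.g., a per-instance loss), not Data-CUBE's exact measure.
from typing import Callable

def curriculum_batches(instances: list, difficulty: Callable, batch_size: int) -> list[list]:
    """Sort instances from easy to difficult, then slice into mini-batches."""
    ordered = sorted(instances, key=difficulty)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```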
- Zero-shot Item-based Recommendation via Multi-task Product Knowledge Graph Pre-Training [106.85813323510783]
This paper presents a novel paradigm for the Zero-Shot Item-based Recommendation (ZSIR) task.
It pre-trains a model on product knowledge graph (PKG) to refine the item features from PLMs.
We identify three challenges for pre-training the PKG: multi-type relations in the PKG, semantic divergence between generic item information and relations, and domain discrepancy between the PKG and the downstream ZSIR task.
arXiv Detail & Related papers (2023-05-12T17:38:24Z)
- Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations [63.04466647849211]
Existing methods typically encode task information with a simple dataset name as a prefix to the encoder input.
We propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization.
We show that this not only allows the model to better learn shared knowledge across different tasks during training, but also lets us control the model by composing new configurations.
arXiv Detail & Related papers (2022-12-17T02:20:14Z)
- Multi-Task Meta Learning: learn how to adapt to unseen tasks [4.287114092271669]
This work proposes Multi-task Meta Learning (MTML), integrating two learning paradigms: Multi-Task Learning (MTL) and meta learning.
The fundamental idea is to train a multi-task model such that, when an unseen task is introduced, it can learn in fewer steps while offering performance at least as good as conventional single-task learning.
MTML achieves state-of-the-art results on three out of four tasks for the NYU-v2 dataset and two out of four for the Taskonomy dataset.
arXiv Detail & Related papers (2022-10-13T12:59:54Z)
- Using Self-Supervised Pretext Tasks for Active Learning [7.214674613451605]
We propose a novel active learning approach that utilizes self-supervised pretext tasks and a unique data sampler to select data that are both difficult and representative.
The pretext task learner is trained on the unlabeled set, and the unlabeled data are sorted and grouped into batches by their pretext task losses.
In each iteration, the main task model is used to sample the most uncertain data in a batch for annotation (a sketch of this loop follows the entry).
arXiv Detail & Related papers (2022-01-19T07:58:06Z)
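As flagged above, a minimal sketch of the described sampling loop; both scoring callables are placeholders for the trained pretext-task learner and main-task model.

```python
# Sketch of pretext-loss batching plus uncertainty sampling, as summarized above.
# `pretext_loss` and `uncertainty` are placeholders for the trained models' scores.
from typing import Callable

def build_batches(unlabeled: list, pretext_loss: Callable, batch_size: int) -> list[list]:
    """Sort unlabeled data by pretext-task loss, then slice into batches."""
    ordered = sorted(unlabeled, key=pretext_loss)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]

def select_for_labeling(batch: list, uncertainty: Callable, k: int) -> list:
    """Pick the k most uncertain items in a batch for annotation."""
    return sorted(batch, key=uncertainty, reverse=True)[:k]
```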
- Learning from Task Descriptions [24.588252048132862]
We introduce a framework for developing NLP systems that solve new tasks after reading their descriptions.
We instantiate this framework with a new English language dataset, ZEST, structured for task-oriented evaluation.
We find that the state-of-the-art T5 model achieves a score of 12% on ZEST, leaving a significant challenge for NLP researchers.
arXiv Detail & Related papers (2020-11-16T17:25:24Z)