Estimating Difficulty Levels of Programming Problems with Pre-trained Model
- URL: http://arxiv.org/abs/2406.08828v1
- Date: Thu, 13 Jun 2024 05:38:20 GMT
- Title: Estimating Difficulty Levels of Programming Problems with Pre-trained Model
- Authors: Zhiyuan Wang, Wei Zhang, Jun Wang
- Abstract summary: The difficulty level of each programming problem serves as an essential reference for guiding students' adaptive learning.
We formulate the problem of automatic difficulty level estimation of each programming problem, given its textual description and a solution example of code.
For tackling this problem, we propose to couple two pre-trained models, one for text modality and the other for code modality, into a unified model.
- Score: 18.92661958433282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the demand for programming skills grows across industries and academia, students often turn to Programming Online Judge (POJ) platforms for coding practice and competition. The difficulty level of each programming problem serves as an essential reference for guiding students' adaptive learning. However, current methods of determining difficulty levels either require extensive expert annotations or take a long time to accumulate enough student solutions for each problem. To address this issue, we formulate the problem of automatic difficulty level estimation of each programming problem, given its textual description and a solution example of code. For tackling this problem, we propose to couple two pre-trained models, one for text modality and the other for code modality, into a unified model. We built two POJ datasets for the task and the results demonstrate the effectiveness of the proposed approach and the contributions of both modalities.
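The coupling of a text-modality and a code-modality pre-trained model can be pictured as late fusion: encode the problem description with one model, encode the solution code with the other, concatenate the two embeddings, and classify the difficulty level. The sketch below uses stand-in encoders and a hypothetical linear head (the paper's actual model choices and fusion details are not specified here):

```python
import numpy as np

rng = np.random.default_rng(0)

def text_encoder(description: str) -> np.ndarray:
    """Stand-in for a pre-trained text encoder (e.g. a BERT-style model)."""
    local = np.random.default_rng(abs(hash(description)) % 2**32)
    return local.standard_normal(768)

def code_encoder(solution: str) -> np.ndarray:
    """Stand-in for a pre-trained code encoder (e.g. a CodeBERT-style model)."""
    local = np.random.default_rng(abs(hash(solution)) % 2**32)
    return local.standard_normal(768)

def predict_difficulty(description: str, solution: str,
                       W: np.ndarray, b: np.ndarray) -> int:
    """Late fusion: concatenate both embeddings, then apply a linear head."""
    fused = np.concatenate([text_encoder(description), code_encoder(solution)])
    logits = W @ fused + b
    return int(np.argmax(logits))  # index of the predicted difficulty level

n_levels = 5  # hypothetical number of difficulty levels
W = rng.standard_normal((n_levels, 1536)) * 0.01
b = np.zeros(n_levels)
level = predict_difficulty("Sum two integers.",
                           "print(sum(map(int, input().split())))", W, b)
```

In practice the head would be trained end-to-end on labeled POJ problems; the stand-in encoders here only illustrate the two-modality input and the fused classification step.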
Related papers
- Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization [126.27645170941268]
We present Easy2Hard-Bench, a collection of 6 benchmark datasets spanning various domains.
Each problem within these datasets is annotated with numerical difficulty scores.
We provide a comprehensive analysis of their performance and generalization capabilities across varying levels of difficulty.
arXiv Detail & Related papers (2024-09-27T03:49:56Z)
- Learning Task Decomposition to Assist Humans in Competitive Programming [90.4846613669734]
We introduce a novel objective for learning task decomposition, termed assistive value (AssistV).
We collect a dataset of human repair experiences on different decomposed solutions.
Under 177 hours of human study, our method enables non-experts to solve 33.3% more problems, speeds them up by 3.3x, and empowers them to match unassisted experts.
arXiv Detail & Related papers (2024-06-07T03:27:51Z)
- Distilling Algorithmic Reasoning from LLMs via Explaining Solution Programs [2.3020018305241337]
Distilling explicit chain-of-thought reasoning paths has emerged as an effective method for improving the reasoning abilities of large language models.
We propose a novel approach to distill reasoning abilities from LLMs by leveraging their capacity to explain solutions.
Our experiments demonstrate that learning from explanations enables the Reasoner to more effectively guide program implementation by a Coder.
arXiv Detail & Related papers (2024-04-11T22:19:50Z)
- PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models [10.491051578439722]
We propose the idea of programming problem merging (PPM) and provide two implementations of this idea; we apply our tool to two widely-used datasets.
The results demonstrate the effectiveness of our tool in generating more challenging, diverse, and natural programming problems.
arXiv Detail & Related papers (2024-01-28T02:27:38Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
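The instance-level step described above (order each task's instances from easy to difficult, then batch them) can be sketched as follows; the difficulty scores are assumed to be given, and the function names are illustrative only:

```python
from typing import Any

def easy_to_difficult_batches(instances: list[Any],
                              difficulty: dict[Any, float],
                              batch_size: int) -> list[list[Any]]:
    """Sort instances by ascending difficulty, then split into mini-batches."""
    ordered = sorted(instances, key=lambda x: difficulty[x])
    return [ordered[i:i + batch_size]
            for i in range(0, len(ordered), batch_size)]

batches = easy_to_difficult_batches(
    ["a", "b", "c", "d", "e"],
    {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.3, "e": 0.7},
    batch_size=2,
)
# batches: [["b", "d"], ["c", "e"], ["a"]]
```

The curriculum effect comes from feeding these batches to the trainer in order, so early updates see only low-difficulty instances.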
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- ACES: Generating Diverse Programming Puzzles with Autotelic Generative Models [20.039580079339537]
Autotelic CodE Search (ACES) jointly optimizes for the diversity and difficulty of generated problems.
We represent problems in a space of semantic descriptors describing the programming skills required to solve them.
ACES iteratively prompts a large language model to generate difficult problems achieving a diversity of target semantic descriptors.
arXiv Detail & Related papers (2023-10-15T14:57:14Z) - Tag Prediction of Competitive Programming Problems using Deep Learning
Techniques [0.0]
Competitive programming is a popular way to develop programming skills.
It can be difficult for novices and even veteran programmers to navigate the wide collection of problems.
Automated tagging of the problems via text classification can ease this navigation.
arXiv Detail & Related papers (2023-08-03T16:39:02Z) - Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning [10.889271604723312]
Chain-of-thought (CoT) prompting with large language models has proven effective in numerous natural language processing tasks.
We investigate two approaches to leverage the training data in a few-shot prompting scenario: dynamic program prompting and program distillation.
Our experiments on three standard math word problem (MWP) datasets demonstrate the effectiveness of these approaches.
arXiv Detail & Related papers (2023-05-29T16:01:40Z) - Towards a Holistic Understanding of Mathematical Questions with
Contrastive Pre-training [65.10741459705739]
We propose a novel contrastive pre-training approach for mathematical question representations, namely QuesCo.
We first design two-level question augmentations, including content-level and structure-level, which generate literally diverse question pairs with similar purposes.
Then, to fully exploit hierarchical information of knowledge concepts, we propose a knowledge hierarchy-aware rank strategy.
arXiv Detail & Related papers (2023-01-18T14:23:29Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
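Few-shot feedback classification in the prototypical style can be sketched as: embed the instructor's few labeled examples, average per class to form prototypes, and assign new student code to the nearest prototype. The 2-D embeddings and class labels below are stand-ins, not the paper's actual representation:

```python
import numpy as np

def prototypes(support: dict[str, list[np.ndarray]]) -> dict[str, np.ndarray]:
    """One prototype per feedback class: the mean of its support embeddings."""
    return {label: np.mean(vecs, axis=0) for label, vecs in support.items()}

def classify(query: np.ndarray, protos: dict[str, np.ndarray]) -> str:
    """Nearest-prototype assignment by Euclidean distance."""
    return min(protos, key=lambda label: np.linalg.norm(query - protos[label]))

# A few instructor-labeled examples per feedback class (hypothetical).
support = {
    "off_by_one": [np.array([1.0, 0.0]), np.array([0.8, 0.2])],
    "wrong_loop": [np.array([0.0, 1.0]), np.array([0.1, 0.9])],
}
protos = prototypes(support)
label = classify(np.array([0.9, 0.1]), protos)
# label: "off_by_one"
```

The meta-learning part of the approach lies in training the embedding function so that such nearest-prototype decisions generalize to unseen questions; this sketch only shows the prototype step.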
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
- Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.