Learning Task Decomposition to Assist Humans in Competitive Programming
- URL: http://arxiv.org/abs/2406.04604v3
- Date: Tue, 23 Jul 2024 18:26:32 GMT
- Title: Learning Task Decomposition to Assist Humans in Competitive Programming
- Authors: Jiaxin Wen, Ruiqi Zhong, Pei Ke, Zhihong Shao, Hongning Wang, Minlie Huang
- Abstract summary: We introduce a novel objective for learning task decomposition, termed assistive value (AssistV).
We collect a dataset of human repair experiences on different decomposed solutions.
Under 177 hours of human study, our method enables non-experts to solve 33.3% more problems, speeds them up by 3.3x, and empowers them to match unassisted experts.
- Score: 90.4846613669734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When using language models (LMs) to solve complex problems, humans might struggle to understand the LM-generated solutions and repair the flawed ones. To assist humans in repairing them, we propose to automatically decompose complex solutions into multiple simpler pieces that correspond to specific subtasks. We introduce a novel objective for learning task decomposition, termed assistive value (AssistV), which measures the feasibility and speed for humans to repair the decomposed solution. We collect a dataset of human repair experiences on different decomposed solutions. Utilizing the collected data as in-context examples, we then learn to critique, refine, and rank decomposed solutions to improve AssistV. We validate our method under competitive programming problems: under 177 hours of human study, our method enables non-experts to solve 33.3% more problems, speeds them up by 3.3x, and empowers them to match unassisted experts.
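The abstract describes AssistV as measuring both the feasibility and the speed of human repair, and says candidate decompositions are ranked by it. The toy sketch below illustrates that idea only. The scoring formula, the `time_budget` parameter, and the names `RepairRecord`, `assistive_value`, and `rank_decompositions` are all assumptions made for illustration; the abstract does not give the paper's actual definition or its LM-based critique, refine, and rank procedure.

```python
from dataclasses import dataclass


@dataclass
class RepairRecord:
    """One hypothetical human repair attempt on a decomposed solution."""
    decomposition_id: str
    repaired: bool   # feasibility: did the human reach a correct solution?
    minutes: float   # speed: time spent on the attempt


def assistive_value(records, time_budget=60.0):
    """Toy score in [0, 1]: failed repairs count 0, successful repairs
    count more the faster they were. This formula is an assumption,
    not the paper's definition of AssistV."""
    if not records:
        return 0.0
    total = 0.0
    for r in records:
        if r.repaired:
            total += max(0.0, 1.0 - r.minutes / time_budget)
    return total / len(records)


def rank_decompositions(records):
    """Group attempts by decomposition and sort ids by toy score, best first."""
    by_id = {}
    for r in records:
        by_id.setdefault(r.decomposition_id, []).append(r)
    scores = {key: assistive_value(group) for key, group in by_id.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    attempts = [
        RepairRecord("A", True, 15.0),
        RepairRecord("A", True, 30.0),
        RepairRecord("B", False, 20.0),
        RepairRecord("B", True, 10.0),
    ]
    print(rank_decompositions(attempts))
```

In this made-up data, decomposition A is repaired in both attempts while B fails once, so A ranks first even though B's single success was faster.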
Related papers
- Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation [39.805610561281455]
Large Language Models (LLMs) demonstrate promising capabilities in solving simple scientific problems.
Human experts first assess problem complexity using domain knowledge before choosing an appropriate solution approach.
We propose a novel two-component fine-tuning method.
Our models demonstrate a 28.18% improvement in answer accuracy and a 13.89% increase in tool usage precision across all datasets.
arXiv Detail & Related papers (2024-11-01T07:18:31Z)
- SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories [55.161075901665946]
SUPER aims to capture the realistic challenges faced by researchers working with Machine Learning (ML) and Natural Language Processing (NLP) research repositories.
Our benchmark comprises three distinct problem sets: 45 end-to-end problems with annotated expert solutions, 152 sub problems derived from the expert set that focus on specific challenges, and 602 automatically generated problems for larger-scale development.
We show that state-of-the-art approaches struggle to solve these problems, with the best model (GPT-4o) solving only 16.3% of the end-to-end set and 46.1% of the scenarios.
arXiv Detail & Related papers (2024-09-11T17:37:48Z)
- Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? [140.9751389452011]
We study the biases of large language models (LLMs) in relation to those known in children when solving arithmetic word problems.
We generate a novel set of word problems for each of these tests, using a neuro-symbolic approach that enables fine-grained control over the problem features.
arXiv Detail & Related papers (2024-01-31T18:48:20Z)
- Optimising Human-AI Collaboration by Learning Convincing Explanations [62.81395661556852]
We propose a method for a collaborative system that remains safe by having a human make the decisions.
Ardent enables efficient and effective decision-making by adapting to individual preferences for explanations.
arXiv Detail & Related papers (2023-11-13T16:00:16Z)
- Learning by Grouping: A Multilevel Optimization Framework for Improving Fairness in Classification without Losing Accuracy [19.84719054826755]
In some cases, AI systems can be unfair by exhibiting bias or discrimination against certain social groups.
We propose a novel machine learning framework where the ML model learns to group a diverse set of problems into distinct subgroups to solve each subgroup.
Our proposed framework involves three stages of learning, which are formulated as a three-level optimization problem.
arXiv Detail & Related papers (2023-04-02T08:45:08Z)
- Planning and Scheduling in Digital Health with Answer Set Programming [0.0]
Problems in healthcare are complex because solving them requires taking several constraints and different types of resources into account.
We plan to propose solutions to these kinds of problems, both by extending already tested solutions and by modelling solutions for new problems.
arXiv Detail & Related papers (2022-08-05T10:51:02Z)
- A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering [60.768146126094955]
Weakly supervised question answering usually has only the final answers as supervision signals.
There may exist many spurious solutions that coincidentally derive the correct answer, but training on such solutions can hurt model performance.
We propose to explicitly exploit such semantic correlations by maximizing the mutual information between question-answer pairs and predicted solutions.
arXiv Detail & Related papers (2021-06-14T05:47:41Z)
- Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention [67.1936055742498]
We show that multi-task learning can effectively scale reset-free learning schemes to much more complex problems.
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
arXiv Detail & Related papers (2021-04-22T17:38:27Z)
- Extending the Hint Factory for the assistance dilemma: A novel, data-driven HelpNeed Predictor for proactive problem-solving help [6.188683567894372]
We present a set of data-driven methods to classify, predict, and prevent unproductive problem-solving steps.
We present a HelpNeed classification that uses prior student data to determine when students are likely to be unproductive.
We conclude with suggestions on how these HelpNeed methods could be applied in other well-structured open-ended domains.
arXiv Detail & Related papers (2020-10-08T17:04:03Z)