MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks
- URL: http://arxiv.org/abs/2312.15960v3
- Date: Thu, 22 Aug 2024 06:24:12 GMT
- Title: MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks
- Authors: Jingyao Li, Pengguang Chen, Bin Xia, Hong Xu, Jiaya Jia
- Abstract summary: We introduce a pioneering framework for Modular-of-Thought (MoT) instruction tuning, designed to promote the decomposition of tasks into logical sub-tasks and sub-modules.
Our investigations reveal that, through the cultivation and utilization of sub-modules, MoTCoder significantly improves both the modularity and correctness of the generated solutions.
- Score: 50.61968901704187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have showcased impressive capabilities in handling straightforward programming tasks. However, their performance tends to falter when confronted with more challenging programming problems. We observe that conventional models often generate solutions as monolithic code blocks, restricting their effectiveness in tackling intricate questions. To overcome this limitation, we present Modular-of-Thought Coder (MoTCoder). We introduce a pioneering framework for MoT instruction tuning, designed to promote the decomposition of tasks into logical sub-tasks and sub-modules. Our investigations reveal that, through the cultivation and utilization of sub-modules, MoTCoder significantly improves both the modularity and correctness of the generated solutions, leading to substantial relative pass@1 improvements of 12.9% on APPS and 9.43% on CodeContests. Our code is available at https://github.com/dvlab-research/MoTCoder.
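To make the modular-of-thought output style concrete, here is a minimal sketch on a toy problem; the decomposition and function names are our own illustration, not drawn from the paper:

```python
# Illustrative sketch only (our toy example, not the paper's): the modular
# solution style MoT instruction tuning is meant to encourage. Each
# sub-module handles one logical sub-task, and solve() composes them.

def parse_input(raw: str) -> list[int]:
    """Sub-module: turn the raw problem input into a list of integers."""
    return [int(tok) for tok in raw.split()]

def longest_increasing_run(nums: list[int]) -> int:
    """Sub-module: the core algorithm, kept separate so it can be tested alone."""
    best = cur = 1 if nums else 0
    for prev, nxt in zip(nums, nums[1:]):
        cur = cur + 1 if nxt > prev else 1
        best = max(best, cur)
    return best

def solve(raw: str) -> int:
    """Top-level module: compose the sub-modules into a full solution."""
    return longest_increasing_run(parse_input(raw))

assert solve("1 2 3 1 2") == 3
```

Compared with a monolithic block, each sub-module here can be tested, revised, and reused on its own, which is the property the reported pass@1 gains are attributed to.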
Related papers
- Revisiting the Impact of Pursuing Modularity for Code Generation [4.177310099979434]
We assess the impact of modularity in code generation by introducing a novel metric for its quantitative measurement.
Surprisingly, and contrary to conventional wisdom on the topic, we find that modularity is not a core factor for improving the performance of code generation models.
arXiv Detail & Related papers (2024-07-16T05:48:24Z)
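The metric from the entry above is not reproduced in this summary; as a loose stand-in, a modularity proxy could be as simple as the share of top-level statements that define functions (a hypothetical measure, for illustration only):

```python
# Hypothetical modularity proxy (our assumption, not the paper's metric):
# the fraction of a solution's top-level statements that are function defs.
import ast

def modularity_score(source: str) -> float:
    """Crude proxy: share of top-level statements that define functions."""
    tree = ast.parse(source)
    if not tree.body:
        return 0.0
    defs = sum(isinstance(node, ast.FunctionDef) for node in tree.body)
    return defs / len(tree.body)

# Example: one def out of two top-level statements -> 0.5
print(modularity_score("def f(x):\n    return x * 2\ny = f(21)"))
```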
- Performance-Aligned LLMs for Generating Fast Code [2.180216161965907]
We introduce a reinforcement learning based methodology to align the outputs of code LLMs with performance.
We demonstrate that our fine-tuned model improves the expected speedup of generated code over base models for a set of benchmark tasks.
arXiv Detail & Related papers (2024-04-29T16:52:38Z)
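The abstract above does not spell out the reward design; one plausible sketch, with function names of our own choosing, is to reward the policy with the measured speedup of its generated code over a reference implementation:

```python
# Illustrative reward for performance-aligned RL fine-tuning (our sketch,
# not the paper's implementation): reward = measured speedup over a baseline.
import time

def best_time(fn, *args, repeats: int = 5) -> float:
    """Best-of-n wall-clock time, to reduce measurement noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def speedup_reward(baseline_fn, candidate_fn, *args) -> float:
    """Reward: how much faster the generated code runs than the baseline."""
    return best_time(baseline_fn, *args) / best_time(candidate_fn, *args)
```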
- Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation [59.37775534633868]
We present an extremely straightforward approach to transferring pre-trained, task-specific PEFT modules between same-family PLMs.
We also propose a method that allows the transfer of modules between incompatible PLMs without any change in the inference complexity.
arXiv Detail & Related papers (2024-03-27T17:50:00Z)
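As a rough sketch of the mechanics being studied, a task-specific LoRA adapter can be attached to a different checkpoint with the Hugging Face peft library, provided the adapted weight shapes line up; the model and adapter names below are placeholders, not from the paper:

```python
# Sketch of reusing a task-specific PEFT (LoRA) module on a same-family model.
# "family/model-small" and "family/task-adapter" are placeholder names.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("family/model-small")

# Attaching an adapter trained on a sibling model only works when the
# adapted weight shapes match, which is the compatibility question at issue.
model = PeftModel.from_pretrained(base, "family/task-adapter")
```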
- Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs [36.409470894115074]
We propose the CodePLAN framework, which aims to transfer LLMs' code generation reasoning capabilities to smaller models.
Our approach improves the smaller model's code generation performance by over 130% on the challenging APPS benchmark.
arXiv Detail & Related papers (2024-03-20T03:09:54Z)
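The exact CodePLAN recipe is not given in this summary; one common shape for this kind of distillation, sketched below with a hypothetical teacher interface, is to have the large model produce a solution plan and then train the smaller model on plan-plus-code targets:

```python
# Sketch of plan-based reasoning distillation (our reading of the idea, not
# CodePLAN's exact recipe). `teacher` is a hypothetical wrapper exposing a
# .generate(prompt) -> str method.

def build_distillation_example(problem: str, teacher) -> dict:
    """Ask the teacher for a plan, then code; the student trains on both."""
    plan = teacher.generate(f"Outline the solution steps for:\n{problem}")
    code = teacher.generate(f"Implement this plan in Python:\n{plan}")
    return {"prompt": problem, "target": f"# Plan:\n{plan}\n\n{code}"}
```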
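The grow-and-reuse mechanism can be pictured as a registry of sub-task modules that is consulted first and extended only on a miss; this is a toy sketch of that control flow, not GENOME's implementation:

```python
# Toy registry illustrating "growing and reusing" modules (names are ours).
from typing import Callable

MODULES: dict[str, Callable] = {}  # already-grown modules, keyed by sub-task

def get_or_grow(task: str, grow_fn: Callable[[str], Callable]) -> Callable:
    """Reuse an existing module for a sub-task, or grow (synthesize) a new one."""
    if task not in MODULES:
        MODULES[task] = grow_fn(task)
    return MODULES[task]
```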
We propose generative neuro-symbolic visual reasoning by growing and reusing modules.
The proposed model performs competitively on standard tasks like visual question answering and referring expression comprehension.
It is able to adapt to new visual reasoning tasks by observing a few training examples and reusing modules.
arXiv Detail & Related papers (2023-11-08T18:59:05Z)
- CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules [51.82044734879657]
We propose CodeChain, a novel framework for inference that elicits modularized code generation through a chain of self-revisions.
We find that CodeChain can significantly boost both modularity as well as correctness of the generated solutions, achieving relative pass@1 improvements of 35% on APPS and 76% on CodeContests.
arXiv Detail & Related papers (2023-10-13T10:17:48Z)
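Structurally, the chain of self-revisions can be read as a loop that samples solutions, extracts representative sub-modules, and feeds them back into the next prompt. The sketch below infers that loop from the abstract; `llm` is a hypothetical generator interface and the selection helper is a deliberately simple stand-in for CodeChain's cluster-and-select step:

```python
# Inferred self-revision loop (our sketch of the abstract, not the paper's code).

def select_representative_submodules(solutions: list[str]) -> list[str]:
    """Stand-in for cluster-and-select: collect the distinct function
    signatures seen across the sampled solutions."""
    sigs = {line.strip() for sol in solutions for line in sol.splitlines()
            if line.lstrip().startswith("def ")}
    return sorted(sigs)

def code_chain(llm, problem: str, rounds: int = 3, samples: int = 5) -> list[str]:
    """Each round is conditioned on sub-modules from the previous round."""
    hints: list[str] = []
    solutions: list[str] = []
    for _ in range(rounds):
        prompt = problem + "\nReusable sub-modules:\n" + "\n".join(hints)
        solutions = [llm.generate(prompt) for _ in range(samples)]
        hints = select_representative_submodules(solutions)
    return solutions
```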
- ModuleFormer: Modularity Emerges from Mixture-of-Experts [60.6148988099284]
This paper proposes a new neural network architecture, ModuleFormer, to improve the efficiency and flexibility of large language models.
Unlike previous SMoE-based modular language models, ModuleFormer can induce modularity from uncurated data.
arXiv Detail & Related papers (2023-06-07T17:59:57Z)
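ModuleFormer's architecture is not detailed here; the sketch below is a generic sparse mixture-of-experts layer in PyTorch, showing only the basic route-then-combine pattern that SMoE models build on:

```python
# Generic sparse mixture-of-experts layer (an SMoE illustration, not
# ModuleFormer's architecture): a router picks top-k experts per token and
# their outputs are combined with the gate weights.
import torch
import torch.nn as nn

class TinySMoE(nn.Module):
    def __init__(self, dim: int, n_experts: int = 4, k: int = 1):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_experts)])
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Gate weights and expert indices for each token's top-k experts.
        weights, idx = self.router(x).softmax(-1).topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for j in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., j] == e          # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[..., j][mask, None] * expert(x[mask])
        return out

y = TinySMoE(dim=16)(torch.randn(2, 8, 16))  # (batch, seq, dim) in and out
```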
- Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.