Related papers: OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

URL: http://arxiv.org/abs/2410.09671v1
Date: Sat, 12 Oct 2024 23:42:16 GMT
Title: OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Authors: Jun Wang, Meng Fang, Ziyu Wan, Muning Wen, Jiachen Zhu, Anjie Liu, Ziqin Gong, Yan Song, Lei Chen, Lionel M. Ni, Linyi Yang, Ying Wen, Weinan Zhang,
Abstract summary: We introduce OpenR, an open-source framework for enhancing the reasoning capabilities of large language models (LLMs) OpenR unifies data acquisition, reinforcement learning training, and non-autoregressive decoding into a cohesive software platform. Our work is the first to provide an open-source framework that explores the core techniques of OpenAI's o1 model with reinforcement learning.
Score: 61.14336781917986
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this technical report, we introduce OpenR, an open-source framework designed to integrate key components for enhancing the reasoning capabilities of large language models (LLMs). OpenR unifies data acquisition, reinforcement learning training (both online and offline), and non-autoregressive decoding into a cohesive software platform. Our goal is to establish an open-source platform and community to accelerate the development of LLM reasoning. Inspired by the success of OpenAI's o1 model, which demonstrated improved reasoning abilities through step-by-step reasoning and reinforcement learning, OpenR integrates test-time compute, reinforcement learning, and process supervision to improve reasoning in LLMs. Our work is the first to provide an open-source framework that explores the core techniques of OpenAI's o1 model with reinforcement learning, achieving advanced reasoning capabilities beyond traditional autoregressive methods. We demonstrate the efficacy of OpenR by evaluating it on the MATH dataset, utilising publicly available data and search methods. Our initial experiments confirm substantial gains, with relative improvements in reasoning and performance driven by test-time computation and reinforcement learning through process reward models. The OpenR framework, including code, models, and datasets, is accessible at https://openreasoner.github.io.

Related papers

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe [69.90298686714036]
We introduce OpenMMReasoner, a fully transparent two-stage recipe for multimodal reasoning spanning fine-tuning and reinforcement learning.<n>Our method achieves a 11.6% improvement over the Qwen2.5-VL-7B-Instruct baseline across nine multimodal reasoning benchmarks.
arXiv Detail & Related papers (2025-11-20T13:11:45Z)
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use [78.29315418819074]
We introduce VerlTool, a unified and modular framework that addresses limitations through systematic design principles.<n>Our framework formalizes ARLT as multi-turn trajectories with multi-modal observation tokens (text/image/video), extending beyond single-turn RLVR paradigms.<n>The modular plugin architecture enables rapid tool integration requiring only lightweight Python definitions.
arXiv Detail & Related papers (2025-09-01T01:45:18Z)
GenKI: Enhancing Open-Domain Question Answering with Knowledge Integration and Controllable Generation in Large Language Models [75.25348392263676]
Open-domain question answering (OpenQA) represents a cornerstone in natural language processing (NLP)<n>We propose a novel framework named GenKI, which aims to improve the OpenQA performance by exploring Knowledge Integration and controllable Generation.
arXiv Detail & Related papers (2025-05-26T08:18:33Z)
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning [87.30285670315334]
textbfR1-Searcher is a novel two-stage outcome-based RL approach designed to enhance the search capabilities of Large Language Models. Our framework relies exclusively on RL, without requiring process rewards or distillation for a cold start. Our experiments demonstrate that our method significantly outperforms previous strong RAG methods, even when compared to the closed-source GPT-4o-mini.
arXiv Detail & Related papers (2025-03-07T17:14:44Z)
START: Self-taught Reasoner with Tools [51.38785489790888]
We introduce START (Self-Taught Reasoner with Tools), a tool-integrated long Chain-of-thought (CoT) reasoning LLM. START is capable of performing complex computations, self-checking, exploring diverse methods, and self-ging. It significantly outperforms the base QwQ-32B and achieves performance comparable to the state-of-the-art open-weight model R1-Distill-Qwen-32B.
arXiv Detail & Related papers (2025-03-06T17:11:51Z)
CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning [0.7009487789080343]
Chain-of-Associated-Thoughts (CoAT) framework is inspired by the human ability to constantly associate and replenish knowledge during thinking.<n>By combining the structured exploration capabilities of MCTS with the adaptive learning capacity of associative memory, CoAT significantly expands the LLM search space.<n>CoAT achieves over 10% performance improvement on open-source multi-hop reasoning datasets.
arXiv Detail & Related papers (2025-02-04T15:10:33Z)
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective [77.94874338927492]
OpenAI has claimed that the main techinique behinds o1 is the reinforcement learning. This paper analyzes the roadmap to achieving o1 from the perspective of reinforcement learning.
arXiv Detail & Related papers (2024-12-18T18:24:47Z)
7B Fully Open Source Moxin-LLM -- From Pretraining to GRPO-based Reinforcement Learning Enhancement [42.10844666788254]
Moxin 7B is a fully open-source Large Language Models (LLMs) developed adhering to principles of open science, open source, open data, and open access. We release the pre-training code and configurations, training and fine-tuning datasets, and intermediate and final checkpoints. Experiments show that our models achieve superior performance in various evaluations such as zero-shot evaluation, few-shot evaluation, and CoT evaluation.
arXiv Detail & Related papers (2024-12-08T02:01:46Z)
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems. While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited. We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z)
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models [23.68266151581951]
Retrieval-Augmented Generation (RAG) has been shown to enhance the factual accuracy of Large Language Models (LLMs) Existing methods often suffer from limited reasoning capabilities in effectively using the retrieved evidence. We introduce a novel framework, Open-RAG, designed to enhance reasoning capabilities in RAG with open-source LLMs.
arXiv Detail & Related papers (2024-10-02T17:37:18Z)
A Unified Framework for Neural Computation and Learning Over Time [56.44910327178975]
Hamiltonian Learning is a novel unified framework for learning with neural networks "over time" It is based on differential equations that: (i) can be integrated without the need of external software solvers; (ii) generalize the well-established notion of gradient-based learning in feed-forward and recurrent networks; (iii) open to novel perspectives.
arXiv Detail & Related papers (2024-09-18T14:57:13Z)
Intelligence Analysis of Language Models [0.0]
We test the effectiveness of Large Language Models (LLMs) on the Abstraction and Reasoning Corpus (ARC) dataset. This dataset serves as a representative benchmark for testing abstract reasoning abilities. We investigate the application of the Chain-of-Thought (CoT) technique, aiming to determine its role in improving model performance.
arXiv Detail & Related papers (2024-07-20T13:48:16Z)
ExaRanker-Open: Synthetic Explanation for IR using Open-Source LLMs [60.81649785463651]
We introduce ExaRanker-Open, where we adapt and explore the use of open-source language models to generate explanations. Our findings reveal that incorporating explanations consistently enhances neural rankers, with benefits escalating as the LLM size increases.
arXiv Detail & Related papers (2024-02-09T11:23:14Z)
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark [81.42376626294812]
We present Language-Assisted Multi-Modal instruction tuning dataset, framework, and benchmark. Our aim is to establish LAMM as a growing ecosystem for training and evaluating MLLMs. We present a comprehensive dataset and benchmark, which cover a wide range of vision tasks for 2D and 3D vision.
arXiv Detail & Related papers (2023-06-11T14:01:17Z)
MORE: A Metric Learning Based Framework for Open-domain Relation Extraction [25.149590577718996]
Open relation extraction (OpenRE) is the task of extracting relation schemes from open-domain corpora. We propose a novel learning framework named MORE (Metric learning-based Open Relation Extraction)
arXiv Detail & Related papers (2022-06-01T07:51:20Z)
OpenFL: An open-source framework for Federated Learning [41.03632020180591]
Federated learning (FL) is a computational paradigm that enables organizations to collaborate on machine learning (ML) projects without sharing sensitive data. OpenFL is an open-source framework for training ML algorithms using the data-private collaborative learning paradigm of FL.
arXiv Detail & Related papers (2021-05-13T16:40:19Z)
Reinforcement Learning with Augmented Data [97.42819506719191]
We present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms. We show that augmentations such as random translate, crop, color jitter, patch cutout, random convolutions, and amplitude scale can enable simple RL algorithms to outperform complex state-of-the-art methods.
arXiv Detail & Related papers (2020-04-30T17:35:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.