LOCA-R: Near-Perfect Performance on the Chinese Physics Olympiad 2025
- URL: http://arxiv.org/abs/2511.10515v1
- Date: Fri, 14 Nov 2025 01:55:49 GMT
- Title: LOCA-R: Near-Perfect Performance on the Chinese Physics Olympiad 2025
- Authors: Dong-Shan Jian, Xiang Li, Chen-Xu Yan, Hui-Wen Zheng, Zhi-Zhang Bian, You-Le Fang, Sheng-Qi Zhang, Bing-Rui Gong, Ren-Xi He, Jing-Tian Zhang, Ce Meng, Yan-Qing Ma
- Abstract summary: We introduce LOCA-R (LOgical Chain Augmentation for Reasoning), an improved version of the LOCA framework adapted for complex reasoning. LOCA-R achieves a near-perfect score of 313 out of 320 points, solidly surpassing the highest-scoring human competitor.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Olympiad-level physics problem-solving presents a significant challenge for both humans and artificial intelligence (AI), as it requires a sophisticated integration of precise calculation, abstract reasoning, and a fundamental grasp of physical principles. The Chinese Physics Olympiad (CPhO), renowned for its complexity and depth, serves as an ideal and rigorous testbed for these advanced capabilities. In this paper, we introduce LOCA-R (LOgical Chain Augmentation for Reasoning), an improved version of the LOCA framework adapted for complex reasoning, and apply it to the CPhO 2025 theory examination. LOCA-R achieves a near-perfect score of 313 out of 320 points, solidly surpassing the highest-scoring human competitor and significantly outperforming all baseline methods.
Related papers
- P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads [91.05736019384489]
We introduce P1-VL, a family of open-source vision-language models engineered for advanced scientific reasoning. Our flagship P1-VL-235B-A22B becomes the first open-source vision-language model to secure 12 gold medals and achieves state-of-the-art performance among open-source models.
arXiv Detail & Related papers (2026-02-10T06:28:08Z)
- Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions [129.877899436804]
We present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal-level human performance on the International Mathematical Olympiad (IMO). We further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels.
arXiv Detail & Related papers (2025-11-27T01:05:00Z)
- P1: Mastering Physics Olympiads with Reinforcement Learning [84.08897284032724]
We introduce P1, a family of open-source physics reasoning models trained entirely through reinforcement learning (RL). P1-235B-A22B is the first open-source model with gold-medal performance at the latest International Physics Olympiad (IPhO 2025), and it wins 12 gold medals out of 13 international/regional physics competitions in 2024/2025. P1-235B-A22B+PhysicsMinions achieves overall No. 1 on IPhO 2025 and obtains the highest average score across the 13 physics competitions.
arXiv Detail & Related papers (2025-11-17T17:18:13Z)
- PhysicsMinions: Winning Gold Medals in the Latest Physics Olympiads with a Coevolutionary Multimodal Multi-Agent System [65.02248709992442]
Physics is central to understanding and shaping the real world, and the ability to solve physics problems is a key indicator of real-world physical intelligence. Existing approaches are predominantly single-model based, and open-source MLLMs rarely reach gold-medal-level performance. We propose PhysicsMinions, a coevolutionary multi-agent system for Physics Olympiads. Its architecture features three synergistic studios: a Visual Studio to interpret diagrams, a Logic Studio to formulate solutions, and a Review Studio to perform dual-stage verification.
arXiv Detail & Related papers (2025-09-29T14:40:53Z)
- HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark? [53.76627321546095]
HiPhO is the first benchmark dedicated to high school physics Olympiads with human-aligned evaluation. It compiles the 13 latest Olympiad exams from 2024-2025, spanning both international and regional competitions. We assign gold, silver, and bronze medals to models based on official medal thresholds, thereby enabling direct comparison between (M)LLMs and human contestants.
arXiv Detail & Related papers (2025-09-09T16:24:51Z)
- Physics Supernova: AI Agent Matches Elite Gold Medalists at IPhO 2025 [55.8464246603186]
We introduce Physics Supernova, an AI system with superior physics problem-solving abilities. Supernova attains 23.5/30 points, ranking 14th among 406 contestants and surpassing the median performance of human gold medalists. These results show that principled tool integration within agent systems can deliver competitive improvements.
arXiv Detail & Related papers (2025-09-01T17:59:13Z)
- Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models [63.31878920079154]
We propose a benchmark specifically designed to assess large language models' mathematical reasoning at the Olympiad level. Unlike existing Olympiad-related benchmarks, our dataset focuses exclusively on mathematics and comprises a vast collection of 4428 competition-level problems with rigorous human annotation. Our experimental results show that even the most advanced models, OpenAI o1-mini and OpenAI o1-preview, struggle with highly challenging Olympiad-level problems, achieving 60.54% and 52.55% accuracy respectively, highlighting significant challenges in Olympiad-level mathematical reasoning.
arXiv Detail & Related papers (2024-10-10T14:39:33Z)
- NeurIPS 2024 ML4CFD Competition: Harnessing Machine Learning for Computational Fluid Dynamics in Airfoil Design [15.301599529509057]
The challenge centers on a task fundamental to a well-established physical application: airfoil design simulation.
This competition represents a pioneering effort in exploring ML-driven surrogate methods.
The competition offers online training and evaluation for all participating solutions.
arXiv Detail & Related papers (2024-06-30T21:48:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.