AlphaPROBE: Alpha Mining via Principled Retrieval and On-graph biased evolution
- URL: http://arxiv.org/abs/2602.11917v1
- Date: Thu, 12 Feb 2026 13:14:58 GMT
- Title: AlphaPROBE: Alpha Mining via Principled Retrieval and On-graph biased evolution
- Authors: Taian Guo, Haiyang Shen, Junyu Luo, Binqi Chen, Hongjun Ding, Jinsheng Huang, Luchen Liu, Yun Ma, Ming Zhang
- Abstract summary: We introduce AlphaPROBE, a framework that reframes alpha mining as the strategic navigation of a Directed Acyclic Graph (DAG). By modeling factors as nodes and evolutionary links as edges, AlphaPROBE treats the factor pool as a dynamic, interconnected ecosystem. Our results confirm that leveraging global evolutionary topology is essential for efficient and robust automated alpha discovery.
- Score: 11.490182876149062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Extracting signals through alpha factor mining is a fundamental challenge in quantitative finance. Existing automated methods primarily follow two paradigms: Decoupled Factor Generation, which treats factor discovery as isolated events, and Iterative Factor Evolution, which focuses on local parent-child refinements. However, both paradigms lack a global structural view, often treating factor pools as unstructured collections or fragmented chains, which leads to redundant search and limited diversity. To address these limitations, we introduce AlphaPROBE (Alpha Mining via Principled Retrieval and On-graph Biased Evolution), a framework that reframes alpha mining as the strategic navigation of a Directed Acyclic Graph (DAG). By modeling factors as nodes and evolutionary links as edges, AlphaPROBE treats the factor pool as a dynamic, interconnected ecosystem. The framework consists of two core components: a Bayesian Factor Retriever that identifies high-potential seeds by balancing exploitation and exploration through a posterior probability model, and a DAG-aware Factor Generator that leverages the full ancestral trace of factors to produce context-aware, nonredundant optimizations. Extensive experiments on three major Chinese stock market datasets against 8 competitive baselines demonstrate that AlphaPROBE achieves significantly better predictive accuracy, return stability, and training efficiency. Our results confirm that leveraging global evolutionary topology is essential for efficient and robust automated alpha discovery. We have open-sourced our implementation at https://github.com/gta0804/AlphaPROBE.
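The two components named in the abstract can be illustrated with a small sketch. It is not taken from the AlphaPROBE repository: the abstract does not specify the posterior model, so the sketch substitutes Beta-Bernoulli Thompson sampling as one common way to balance exploitation and exploration. The names `FactorNode`, `FactorDAG`, `successes`, and `failures`, and the factor expressions, are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FactorNode:
    """One factor in the pool; `parents` are the factors it was evolved from."""
    name: str
    expression: str
    parents: list = field(default_factory=list)
    successes: int = 0  # assumed statistic: children that improved on this factor
    failures: int = 0   # assumed statistic: children that did not improve


class FactorDAG:
    """Factor pool stored as a DAG: nodes are factors, edges are evolutionary links."""

    def __init__(self):
        self.nodes = {}

    def add(self, name, expression, parents=()):
        # Requiring a fresh name and already-existing parents rules out cycles.
        if name in self.nodes:
            raise ValueError(f"factor {name!r} already exists")
        for p in parents:
            if p not in self.nodes:
                raise KeyError(f"unknown parent {p!r}")
        self.nodes[name] = FactorNode(name, expression, list(parents))

    def ancestral_trace(self, name):
        """Return every ancestor of `name`, with ancestors before descendants."""
        seen, order = set(), []

        def visit(n):
            for p in self.nodes[n].parents:
                if p not in seen:
                    seen.add(p)
                    visit(p)         # emit p's own ancestors first
                    order.append(p)

        visit(name)
        return order

    def retrieve_seed(self, rng):
        """Thompson sampling: draw from each node's Beta posterior, pick the max."""
        draws = {
            n: rng.betavariate(1 + node.successes, 1 + node.failures)
            for n, node in self.nodes.items()
        }
        return max(draws, key=draws.get)


# Usage with made-up factor expressions.
dag = FactorDAG()
dag.add("a", "close / open")
dag.add("b", "rank(close / open)", parents=["a"])
dag.add("c", "ts_mean(close / open, 5)", parents=["a"])
dag.add("d", "rank(ts_mean(close / open, 5))", parents=["b", "c"])

trace = dag.ancestral_trace("d")   # ['a', 'b', 'c']
seed = dag.retrieve_seed(random.Random(0))
```

A generator in this spirit would pass `trace` (the full lineage of the chosen seed) to whatever proposes the next factor, so that new candidates avoid repeating earlier refinements. Sampling from the posterior rather than taking its mean means that factors with few recorded outcomes are still chosen occasionally, which is the exploration half of the trade-off.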
Related papers
- Alpha-R1: Alpha Screening with LLM Reasoning via Reinforcement Learning [28.326583684637853]
Signal decay and regime shifts pose recurring challenges for data-driven investment strategies in non-stationary markets.
Existing factor-based methods typically reduce alphas to numerical time series, overlooking the semantic rationale that determines when a factor is economically relevant.
We propose Alpha-R1, an 8B-parameter reasoning model trained via reinforcement learning for context-aware alpha screening.
arXiv Detail & Related papers (2025-12-29T14:50:23Z) - ThetaEvolve: Test-time Learning on Open Problems [110.5756538358217]
We introduce ThetaEvolve, an open-source framework that simplifies and extends AlphaEvolve to efficiently scale both in-context learning and Reinforcement Learning (RL) at test time.
We find that ThetaEvolve with RL at test time consistently outperforms inference-only baselines.
arXiv Detail & Related papers (2025-11-28T18:58:14Z) - Cognitive Alpha Mining via LLM-Driven Code-Based Evolution [29.71597480304934]
We introduce the Cognitive Alpha Mining Framework (CogAlpha), which combines code-level alpha representation with LLM-driven reasoning and evolutionary search.
Treating LLMs as adaptive cognitive agents, our framework iteratively refines, mutates, and recombines alpha candidates through prompts and financial feedback.
Experiments on A-share equities demonstrate that CogAlpha consistently discovers alphas with superior predictive accuracy, robustness, and generalization over existing methods.
arXiv Detail & Related papers (2025-11-24T07:45:59Z) - Bidirectional Representations Augmented Autoregressive Biological Sequence Generation: Application in De Novo Peptide Sequencing [51.12821379640881]
Non-autoregressive (NAR) models offer holistic, bidirectional representations but face challenges with generative coherence and scalability.
We propose a hybrid framework enhancing autoregressive (AR) generation by dynamically integrating rich contextual information from non-autoregressive mechanisms.
A novel cross-decoder attention module enables the AR decoder to iteratively query and integrate these bidirectional features.
arXiv Detail & Related papers (2025-10-09T12:52:55Z) - AlphaEval: A Comprehensive and Efficient Evaluation Framework for Formula Alpha Mining [6.167227740097627]
Formula alpha mining, which generates predictive signals from financial data, is critical for quantitative investment.
Existing evaluation metrics predominantly include backtesting and correlation-based measures.
We propose AlphaEval, a unified, parallelizable, and backtest-free evaluation framework for automated alpha mining models.
arXiv Detail & Related papers (2025-08-10T11:19:24Z) - Navigating the Alpha Jungle: An LLM-Powered MCTS Framework for Formulaic Factor Mining [8.53606484300001]
This paper introduces a novel framework that integrates Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS).
A key innovation is the guidance of MCTS exploration by rich, quantitative feedback from financial backtesting of each candidate factor.
Experimental results on real-world stock market data demonstrate that our LLM-based framework outperforms existing methods by mining alphas with superior predictive accuracy and trading performance.
arXiv Detail & Related papers (2025-05-16T11:14:17Z) - Regulatory DNA sequence Design with Reinforcement Learning [56.20290878358356]
We propose a generative approach that leverages reinforcement learning to fine-tune a pre-trained autoregressive model.
We evaluate our method on promoter design tasks in two yeast media conditions and enhancer design tasks for three human cell types.
arXiv Detail & Related papers (2025-03-11T02:33:33Z) - AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors [14.80394452270726]
This paper proposes AlphaForge, a two-stage alpha-generating framework for alpha factor mining and factor combination.
Experiments conducted on real-world datasets demonstrate that our proposed model outperforms contemporary benchmarks in formulaic alpha factor mining.
arXiv Detail & Related papers (2024-06-26T14:34:37Z) - AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z) - X-volution: On the unification of convolution and self-attention [52.80459687846842]
We propose a multi-branch elementary module composed of both convolution and self-attention operation.
The proposed X-volution achieves highly competitive visual understanding improvements.
arXiv Detail & Related papers (2021-06-04T04:32:02Z) - Phase Retrieval using Expectation Consistent Signal Recovery Algorithm based on Hypernetwork [73.94896986868146]
Phase retrieval is an important component in modern computational imaging systems.
Recent advances in deep learning have opened up a new possibility for robust and fast PR.
We develop a novel framework for deep unfolding to overcome the existing limitations.
arXiv Detail & Related papers (2021-01-12T08:36:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.