FDR-Controlled Portfolio Optimization for Sparse Financial Index
Tracking
- URL: http://arxiv.org/abs/2401.15139v2
- Date: Tue, 30 Jan 2024 17:57:12 GMT
- Title: FDR-Controlled Portfolio Optimization for Sparse Financial Index
Tracking
- Authors: Jasin Machkour, Daniel P. Palomar, Michael Muma
- Abstract summary: In high-dimensional data analysis, it is crucial to select the few relevant variables while maintaining control over the false discovery rate (FDR)
We have expanded the T-Rex framework to accommodate overlapping groups of highly correlated variables.
This is achieved by integrating a nearest neighbors penalization mechanism into the framework, which provably controls the FDR at the user-defined target level.
- Score: 10.86851797584794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In high-dimensional data analysis, such as financial index tracking or
biomedical applications, it is crucial to select the few relevant variables
while maintaining control over the false discovery rate (FDR). In these
applications, strong dependencies often exist among the variables (e.g., stock
returns), which can undermine the FDR control property of existing methods like
the model-X knockoff method or the T-Rex selector. To address this issue, we
have expanded the T-Rex framework to accommodate overlapping groups of highly
correlated variables. This is achieved by integrating a nearest neighbors
penalization mechanism into the framework, which provably controls the FDR at
the user-defined target level. A real-world example of sparse index tracking
demonstrates the proposed method's ability to accurately track the S&P 500
index over the past 20 years based on a small number of stocks. An open-source
implementation is provided within the R package TRexSelector on CRAN.
Related papers
- RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards [78.74923079748521]
Retrieval-Augmented Generation (RAG) has proven its effectiveness in mitigating hallucinations in Large Language Models (LLMs)
Current approaches use instruction tuning to optimize LLMs, improving their ability to utilize retrieved knowledge.
We propose a Differentiable Data Rewards ( DDR) method, which trains RAG systems by aligning data preferences between different RAG modules.
arXiv Detail & Related papers (2024-10-17T12:53:29Z) - High-Dimensional False Discovery Rate Control for Dependent Variables [10.86851797584794]
We propose a dependency-aware T-Rex selector that harnesses the dependency structure among variables.
We prove that our variable penalization mechanism ensures FDR control.
We formulate a fully integrated optimal calibration algorithm that concurrently determines the parameters of the graphical model and the T-Rex framework.
arXiv Detail & Related papers (2024-01-28T22:56:16Z) - Off-Policy Evaluation for Large Action Spaces via Policy Convolution [60.6953713877886]
Policy Convolution family of estimators uses latent structure within actions to strategically convolve the logging and target policies.
Experiments on synthetic and benchmark datasets demonstrate remarkable mean squared error (MSE) improvements when using PC.
arXiv Detail & Related papers (2023-10-24T01:00:01Z) - Probabilistic Model Incorporating Auxiliary Covariates to Control FDR [6.270317798744481]
Controlling False Discovery Rate (FDR) while leveraging the side information of multiple hypothesis testing is an emerging research topic in modern data science.
We propose a deep Black-Box framework controlling FDR (named as NeurT-FDR) which boosts statistical power and controls FDR for multiple-hypothesis testing.
We show that NeurT-FDR makes substantially more discoveries in three real datasets compared to competitive baselines.
arXiv Detail & Related papers (2022-10-06T19:35:53Z) - DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over
Graphs [54.08445874064361]
We propose to solve a regularized distributionally robust learning problem in the decentralized setting.
By adding a Kullback-Liebler regularization function to the robust min-max optimization problem, the learning problem can be reduced to a modified robust problem.
We show that our proposed algorithm can improve the worst distribution test accuracy by up to $10%$.
arXiv Detail & Related papers (2022-08-29T18:01:42Z) - Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Adversarial Robustness Guarantees for Gaussian Processes [22.403365399119107]
Gaussian processes (GPs) enable principled computation of model uncertainty, making them attractive for safety-critical applications.
We present a framework to analyse adversarial robustness of GPs, defined as invariance of the model's decision to bounded perturbations.
We develop a branch-and-bound scheme to refine the bounds and show, for any $epsilon > 0$, that our algorithm is guaranteed to converge to values $epsilon$-close to the actual values in finitely many iterations.
arXiv Detail & Related papers (2021-04-07T15:14:56Z) - NeurT-FDR: Controlling FDR by Incorporating Feature Hierarchy [7.496622386458525]
We propose NeurT-FDR which boosts statistical power and controls FDR for multiple hypothesis testing.
We show that NeurT-FDR has strong FDR guarantees and makes substantially more discoveries in synthetic and real datasets.
arXiv Detail & Related papers (2021-01-24T21:55:10Z) - Lower bounds in multiple testing: A framework based on derandomized
proxies [107.69746750639584]
This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models.
We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
arXiv Detail & Related papers (2020-05-07T19:59:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.