FDR-Controlled Portfolio Optimization for Sparse Financial Index
  Tracking
- URL: http://arxiv.org/abs/2401.15139v2
- Date: Tue, 30 Jan 2024 17:57:12 GMT
- Title: FDR-Controlled Portfolio Optimization for Sparse Financial Index
  Tracking
- Authors: Jasin Machkour, Daniel P. Palomar, Michael Muma
- Abstract summary: In high-dimensional data analysis, it is crucial to select the few relevant variables while maintaining control over the false discovery rate (FDR).
We have expanded the T-Rex framework to accommodate overlapping groups of highly correlated variables.
This is achieved by integrating a nearest neighbors penalization mechanism into the framework, which provably controls the FDR at the user-defined target level.
- Score: 10.86851797584794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   In high-dimensional data analysis, such as financial index tracking or
biomedical applications, it is crucial to select the few relevant variables
while maintaining control over the false discovery rate (FDR). In these
applications, strong dependencies often exist among the variables (e.g., stock
returns), which can undermine the FDR control property of existing methods like
the model-X knockoff method or the T-Rex selector. To address this issue, we
have expanded the T-Rex framework to accommodate overlapping groups of highly
correlated variables. This is achieved by integrating a nearest neighbors
penalization mechanism into the framework, which provably controls the FDR at
the user-defined target level. A real-world example of sparse index tracking
demonstrates the proposed method's ability to accurately track the S&P 500
index over the past 20 years based on a small number of stocks. An open-source
implementation is provided within the R package TRexSelector on CRAN.
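
The FDR named in the abstract is the expected value of the false discovery proportion (FDP), that is, the share of selected variables that are not truly relevant. The following is a minimal sketch of that bookkeeping only, not the T-Rex selector or the TRexSelector API. The function names and the index sets are hypothetical and chosen purely for illustration.

```python
def false_discovery_proportion(selected, relevant):
    """Share of selected variables that are not truly relevant (FDP)."""
    selected, relevant = set(selected), set(relevant)
    false_discoveries = len(selected - relevant)
    # max(1, ...) avoids division by zero when nothing is selected.
    return false_discoveries / max(1, len(selected))


def true_positive_proportion(selected, relevant):
    """Share of truly relevant variables that were selected (TPP)."""
    selected, relevant = set(selected), set(relevant)
    return len(selected & relevant) / max(1, len(relevant))


# Hypothetical example: indices of selected vs. truly relevant variables.
selected_vars = [0, 1, 2, 3]
relevant_vars = [0, 1, 3, 7, 9]

fdp = false_discovery_proportion(selected_vars, relevant_vars)
tpp = true_positive_proportion(selected_vars, relevant_vars)
print(f"FDP = {fdp:.2f}, TPP = {tpp:.2f}")
```

A method controls the FDR at a target level such as 0.1 when the average FDP over repeated data sets does not exceed that level; a single run can still have a higher FDP.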
 
      
Related papers
- Label-shift robust federated feature screening for high-dimensional classification [14.252760098879186]
 This paper introduces a general framework that unifies existing screening methods and proposes a novel utility, label-shift robust federated feature screening (LR-FFS).
Building upon this framework, LR-FFS leverages conditional distribution functions and expectations to address label shift without adding computational burdens.
Experimental results and theoretical analyses demonstrate LR-FFS's superior performance across diverse client environments.
 arXiv  Detail & Related papers  (2025-05-31T04:14:49Z)
- Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding [52.1068936424622]
 We consider the problem of estimating the expected causal effect $E[Y|do(X)]$ for a target variable $Y$ when treatment $X$ is set by intervention.
In settings without selection bias or confounding, $E[Y|do(X)] = E[Y|X]$, which can be estimated using standard regression methods.
We propose a framework that incorporates both selection bias and confounding.
 arXiv  Detail & Related papers  (2025-03-26T13:43:37Z)
- Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model [84.00480999255628]
 Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift.
Current approaches typically address this issue through online sampling from the target policy.
We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
 arXiv  Detail & Related papers  (2025-03-13T06:40:34Z)
- Robust Offline Reinforcement Learning with Linearly Structured $f$-Divergence Regularization [10.465789490644031]
 We propose a novel framework for robust regularized Markov decision process ($d$-RRMDP).
For the offline RL setting, we develop a family of algorithms, Robust Regularized Pessimistic Value Iteration (R2PVI).
 arXiv  Detail & Related papers  (2024-11-27T18:57:03Z)
- RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards [78.74923079748521]
 Retrieval-Augmented Generation (RAG) has proven its effectiveness in mitigating hallucinations in Large Language Models (LLMs).
Current approaches use instruction tuning to optimize LLMs, improving their ability to utilize retrieved knowledge.
We propose a Differentiable Data Rewards (DDR) method, which trains RAG systems by aligning data preferences between different RAG modules.
 arXiv  Detail & Related papers  (2024-10-17T12:53:29Z)
- High-Dimensional False Discovery Rate Control for Dependent Variables [10.86851797584794]
 We propose a dependency-aware T-Rex selector that harnesses the dependency structure among variables.
We prove that our variable penalization mechanism ensures FDR control.
We formulate a fully integrated optimal calibration algorithm that concurrently determines the parameters of the graphical model and the T-Rex framework.
 arXiv  Detail & Related papers  (2024-01-28T22:56:16Z)
- Off-Policy Evaluation for Large Action Spaces via Policy Convolution [60.6953713877886]
 The Policy Convolution (PC) family of estimators uses latent structure within actions to strategically convolve the logging and target policies.
Experiments on synthetic and benchmark datasets demonstrate remarkable mean squared error (MSE) improvements when using PC.
 arXiv  Detail & Related papers  (2023-10-24T01:00:01Z)
- Probabilistic Model Incorporating Auxiliary Covariates to Control FDR [6.270317798744481]
 Controlling False Discovery Rate (FDR) while leveraging the side information of multiple hypothesis testing is an emerging research topic in modern data science.
We propose a deep black-box framework for controlling FDR (named NeurT-FDR), which boosts statistical power and controls FDR for multiple-hypothesis testing.
We show that NeurT-FDR makes substantially more discoveries in three real datasets compared to competitive baselines.
 arXiv  Detail & Related papers  (2022-10-06T19:35:53Z)
- DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over
  Graphs [54.08445874064361]
 We propose to solve a regularized distributionally robust learning problem in the decentralized setting.
By adding a Kullback-Leibler regularization function to the robust min-max optimization problem, the learning problem can be reduced to a modified robust problem.
We show that our proposed algorithm can improve the worst distribution test accuracy by up to $10\%$.
 arXiv  Detail & Related papers  (2022-08-29T18:01:42Z)
- Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
 We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
 arXiv  Detail & Related papers  (2022-03-09T01:55:59Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
 We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
 arXiv  Detail & Related papers  (2021-06-14T05:39:09Z)
- Adversarial Robustness Guarantees for Gaussian Processes [22.403365399119107]
 Gaussian processes (GPs) enable principled computation of model uncertainty, making them attractive for safety-critical applications.
We present a framework to analyse adversarial robustness of GPs, defined as invariance of the model's decision to bounded perturbations.
We develop a branch-and-bound scheme to refine the bounds and show, for any $\epsilon > 0$, that our algorithm is guaranteed to converge to values $\epsilon$-close to the actual values in finitely many iterations.
 arXiv  Detail & Related papers  (2021-04-07T15:14:56Z)
- NeurT-FDR: Controlling FDR by Incorporating Feature Hierarchy [7.496622386458525]
 We propose NeurT-FDR which boosts statistical power and controls FDR for multiple hypothesis testing.
We show that NeurT-FDR has strong FDR guarantees and makes substantially more discoveries in synthetic and real datasets.
 arXiv  Detail & Related papers  (2021-01-24T21:55:10Z)
- Lower bounds in multiple testing: A framework based on derandomized
  proxies [107.69746750639584]
 This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models.
We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
 arXiv  Detail & Related papers  (2020-05-07T19:59:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     