Functional sufficient dimension reduction through information
maximization with application to classification
- URL: http://arxiv.org/abs/2305.10880v3
- Date: Tue, 27 Feb 2024 03:49:34 GMT
- Title: Functional sufficient dimension reduction through information
maximization with application to classification
- Authors: Xinyu Li and Jianjun Xu and Wenquan Cui and Haoyang Cheng
- Abstract summary: Two novel functional sufficient dimensional reduction (FSDR) methods are proposed based on mutual information and square loss mutual information.
It is demonstrated that the two methods are competitive compared with some existing FSDR methods by simulations and real data analyses.
- Score: 7.577667173094585
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Considering the case where the response variable is a categorical variable
and the predictor is a random function, two novel functional sufficient
dimensional reduction (FSDR) methods are proposed based on mutual information
and square loss mutual information. Compared to the classical FSDR methods,
such as functional sliced inverse regression and functional sliced average
variance estimation, the proposed methods are appealing because they are
capable of estimating multiple effective dimension reduction directions in the
case of a relatively small number of categories, especially for the binary
response. Moreover, the proposed methods do not require the restrictive linear
conditional mean assumption and the constant covariance assumption. They avoid
the inverse problem of the covariance operator which is often encountered in
functional sufficient dimension reduction. Functional principal
component analysis with truncation is used as a regularization mechanism. Under
some mild conditions, the statistical consistency of the proposed methods is
established. It is demonstrated that the two methods are competitive compared
with some existing FSDR methods by simulations and real data analyses.
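The abstract names truncated functional principal component analysis as the regularization step. The following is a minimal sketch of that step only, not the authors' implementation: it assumes the curves are observed on a common, equally spaced grid, and the function name `fpca_truncate`, the toy data, and the choice of two components are all hypothetical. The subsequent mutual-information maximization is not shown.

```python
import numpy as np


def fpca_truncate(curves, n_components):
    """Project discretized curves onto their leading principal components.

    curves: array of shape (n_samples, n_grid); each row is one curve
    sampled on a common, equally spaced grid (an assumption of this sketch).
    Returns (scores, components, mean).
    """
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)
    centered = curves - mean
    # Rows of vt are the discretized eigenfunctions of the sample
    # covariance, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]      # truncation acts as the regularizer
    scores = centered @ components.T    # finite-dimensional representation
    return scores, components, mean


# Hypothetical toy data: two smooth basis functions plus small noise.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
basis = np.vstack([np.sin(2 * np.pi * grid), np.cos(2 * np.pi * grid)])
coefs = rng.normal(size=(100, 2))
curves = coefs @ basis + 0.01 * rng.normal(size=(100, 50))

scores, components, mean = fpca_truncate(curves, n_components=2)
```

Working with the truncated scores rather than the full curves is what lets a downstream criterion be optimized in a finite-dimensional space without inverting the covariance operator.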
Related papers
- RieszBoost: Gradient Boosting for Riesz Regression [49.737777802061984]
We propose a novel gradient boosting algorithm to directly estimate the Riesz representer without requiring its explicit analytical form.
We show that our algorithm performs on par with or better than indirect estimation techniques across a range of functionals.
arXiv Detail & Related papers (2025-01-08T23:04:32Z)
- Total Uncertainty Quantification in Inverse PDE Solutions Obtained with Reduced-Order Deep Learning Surrogate Models [50.90868087591973]
We propose an approximate Bayesian method for quantifying the total uncertainty in inverse PDE solutions obtained with machine learning surrogate models.
We test the proposed framework by comparing it with the iterative ensemble smoother and deep ensembling methods for a non-linear diffusion equation.
arXiv Detail & Related papers (2024-08-20T19:06:02Z)
- Adaptive debiased SGD in high-dimensional GLMs with streaming data [4.704144189806667]
We introduce a novel approach to online inference in high-dimensional generalized linear models.
Our method operates in a single-pass mode, significantly reducing both time and space complexity.
We demonstrate that our method, termed the Approximated Debiased Lasso (ADL), not only mitigates the need for the bounded individual probability condition but also significantly improves numerical performance.
arXiv Detail & Related papers (2024-05-28T15:36:48Z)
- Contrastive inverse regression for dimension reduction [0.0]
We propose a supervised dimension reduction method called contrastive inverse regression (CIR) specifically designed for the contrastive setting.
CIR introduces an optimization problem defined on the Stiefel manifold with a non-standard loss function.
We prove the convergence of CIR to a local optimum using a gradient descent-based algorithm, and our numerical study empirically demonstrates the improved performance over competing methods for high-dimensional data.
arXiv Detail & Related papers (2023-05-20T21:44:11Z)
- Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributional robust optimization AUC.
arXiv Detail & Related papers (2022-10-08T08:26:22Z)
- Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency [59.48096489854697]
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.
We prove non-asymptotic upper bounds on the mean-squared error of such procedures.
We establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds.
arXiv Detail & Related papers (2022-09-26T23:50:55Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- Robust online joint state/input/parameter estimation of linear systems [0.0]
This paper presents a method for jointly estimating the state, input, and parameters of linear systems in an online fashion.
The method is specially designed for measurements that are corrupted with non-Gaussian noise or outliers.
arXiv Detail & Related papers (2022-04-12T09:41:28Z)
- Nonlinear Level Set Learning for Function Approximation on Sparse Data with Applications to Parametric Differential Equations [6.184270985214254]
The "Nonlinear Level set Learning" (NLL) approach is presented for the pointwise prediction of functions which have been sparsely sampled.
The proposed algorithm effectively reduces the input dimension to the theoretical lower bound with minor accuracy loss.
Experiments and applications are presented which compare this modified NLL with the original NLL and the Active Subspaces (AS) method.
arXiv Detail & Related papers (2021-04-29T01:54:05Z)
- Joint Dimensionality Reduction for Separable Embedding Estimation [43.22422640265388]
Low-dimensional embeddings for data from disparate sources play critical roles in machine learning, multimedia information retrieval, and bioinformatics.
We propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities.
Our approach compares favorably against other dimensionality reduction methods, and against a state-of-the-art method of bilinear regression for predicting gene-disease associations.
arXiv Detail & Related papers (2021-01-14T08:48:37Z)
- Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data [15.27393561231633]
We propose a doubly robust two-stage semiparametric difference-in-differences estimator for estimating heterogeneous treatment effects.
The first stage allows a general set of machine learning methods to be used to estimate the propensity score.
In the second stage, we derive the rates of convergence for both the parametric parameter and the unknown function.
arXiv Detail & Related papers (2020-09-07T15:14:29Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference.
Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances.
We propose localized debiased machine learning (LDML), which avoids this burdensome step.
arXiv Detail & Related papers (2019-12-30T14:42:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.