Accelerated Aggregated D-Optimal Designs for Estimating Main Effects in Black-Box Models
- URL: http://arxiv.org/abs/2510.08465v1
- Date: Thu, 09 Oct 2025 17:07:36 GMT
- Title: Accelerated Aggregated D-Optimal Designs for Estimating Main Effects in Black-Box Models
- Authors: Chih-Yu Chang, Ming-Chung Chang
- Abstract summary: We propose A2D2E, an $\textbf{E}$stimator based on $\textbf{A}$ccelerated $\textbf{A}$ggregated $\textbf{D}$-Optimal $\textbf{D}$esigns. We establish theoretical guarantees, including convergence and variance reduction, and validate A2D2E through extensive simulations.
- Score: 3.093890460224435
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in supervised learning have driven growing interest in explaining black-box models, particularly by estimating the effects of input variables on model predictions. However, existing approaches often face key limitations, including poor scalability, sensitivity to out-of-distribution sampling, and instability under correlated features. To address these issues, we propose A2D2E, an $\textbf{E}$stimator based on $\textbf{A}$ccelerated $\textbf{A}$ggregated $\textbf{D}$-Optimal $\textbf{D}$esigns. Our method leverages principled experimental design to improve efficiency and robustness in main effect estimation. We establish theoretical guarantees, including convergence and variance reduction, and validate A2D2E through extensive simulations. We further demonstrate the potential of the proposed method with a case study on real data and applications in language models. The code to reproduce the results can be found at https://github.com/cchihyu/A2D2E.
Related papers
- Fault-Tolerant Evaluation for Sample-Efficient Model Performance Estimators [13.227055178509524]
We propose a fault-tolerant evaluation framework that integrates bias and variance considerations within an adjustable tolerance level. We show that proper calibration of $\varepsilon$ ensures reliable evaluation across different variance regimes. Experiments on real-world datasets demonstrate that our framework provides comprehensive and actionable insights into estimator behavior.
arXiv Detail & Related papers (2026-02-06T22:14:46Z)
- ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models [102.4511331368587]
ARISE (Adaptive Resolution-aware Scaling Evaluation) is a novel metric designed to assess the test-time scaling effectiveness of large reasoning models. We conduct comprehensive experiments evaluating state-of-the-art reasoning models across diverse domains.
arXiv Detail & Related papers (2025-10-07T15:10:51Z)
- Towards Model Resistant to Transferable Adversarial Examples via Trigger Activation [95.3977252782181]
Adversarial examples, characterized by imperceptible perturbations, pose significant threats to deep neural networks by misleading their predictions. We introduce a novel training paradigm aimed at enhancing robustness against transferable adversarial examples (TAEs) in a more efficient and effective way.
arXiv Detail & Related papers (2025-04-20T09:07:10Z)
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
- Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors [58.661454334877256]
Drug-Target binding Affinity (DTA) prediction is essential for drug discovery.
Despite the application of deep learning methods to DTA prediction, the achieved accuracy remains suboptimal.
We propose $k$NN-DTA, a non-representation embedding-based retrieval method adopted on a pre-trained DTA prediction model.
arXiv Detail & Related papers (2024-07-21T15:49:05Z)
- Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$. We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from
arXiv Detail & Related papers (2024-05-31T16:18:46Z)
- SimAD: A Simple Dissimilarity-based Approach for Time Series Anomaly Detection [23.684577046512747]
We introduce a $\textbf{Sim}$ple dissimilarity-based approach for time series $\textbf{A}$nomaly $\textbf{D}$etection, referred to as $\textbf{SimAD}$. SimAD first incorporates a patching-based feature extractor capable of processing extended temporal windows and employs the EmbedPatch encoder to fully integrate normal behavioral patterns. Second, we design an innovative ContrastFusion module in SimAD, which strengthens the robustness of anomaly detection by highlighting the distributional differences between normal and abnormal data.
arXiv Detail & Related papers (2024-05-18T09:37:04Z)
- Efficient Epistemic Uncertainty Estimation in Regression Ensemble Models Using Pairwise-Distance Estimators [12.460684753030899]
Pairwise-distance estimators (PaiDEs) establish bounds on entropy. Unlike sample-based Monte Carlo estimators, PaiDEs exhibit a remarkable capability to estimate epistemic uncertainty at speeds up to 100 times faster. We compare our approach to existing active learning methods and find that our approach outperforms on high-dimensional regression tasks.
arXiv Detail & Related papers (2023-08-25T17:13:42Z)
- Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
- A Huber loss-based super learner with applications to healthcare expenditures [0.0]
We propose a super learner based on the Huber loss, a "robust" loss function that combines squared error loss with absolute loss to downweight.
We show that the proposed method can be used both directly to optimize Huber risk, as well as in finite-sample settings.
arXiv Detail & Related papers (2022-05-13T19:57:50Z)
- Inference in Bayesian Additive Vector Autoregressive Tree Models [0.0]
We propose combining Vector autoregressive (VAR) models with Bayesian additive regression tree (BART) models.
The resulting BAVART model is capable of capturing arbitrary non-linear relations without much input from the researcher.
We apply our model to two datasets: the US term structure of interest rates and the Eurozone economy.
arXiv Detail & Related papers (2020-06-29T19:37:09Z)