Goal-Oriented Influence-Maximizing Data Acquisition for Learning and Optimization
- URL: http://arxiv.org/abs/2602.19578v1
- Date: Mon, 23 Feb 2026 07:57:11 GMT
- Title: Goal-Oriented Influence-Maximizing Data Acquisition for Learning and Optimization
- Authors: Weichi Yao, Bianca Dumitrascu, Bryan R. Goldsmith, Yixin Wang,
- Abstract summary: We propose an active acquisition algorithm that avoids explicit posterior inference while remaining uncertainty-aware through inverse curvature.<n>GOIMDA selects inputs by maximizing their expected influence on a user-specified goal functional.<n>We show theoretically that, for generalized linear models, GOIMDA approximates predictive-entropy minimization up to a correction term accounting for goal alignment and prediction bias.
- Score: 28.53710231018475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active data acquisition is central to many learning and optimization tasks in deep neural networks, yet remains challenging because most approaches rely on predictive uncertainty estimates that are difficult to obtain reliably. To this end, we propose Goal-Oriented Influence- Maximizing Data Acquisition (GOIMDA), an active acquisition algorithm that avoids explicit posterior inference while remaining uncertainty-aware through inverse curvature. GOIMDA selects inputs by maximizing their expected influence on a user-specified goal functional, such as test loss, predictive entropy, or the value of an optimizer-recommended design. Leveraging first-order influence functions, we derive a tractable acquisition rule that combines the goal gradient, training-loss curvature, and candidate sensitivity to model parameters. We show theoretically that, for generalized linear models, GOIMDA approximates predictive-entropy minimization up to a correction term accounting for goal alignment and prediction bias, thereby, yielding uncertainty-aware behavior without maintaining a Bayesian posterior. Empirically, across learning tasks (including image and text classification) and optimization tasks (including noisy global optimization benchmarks and neural-network hyperparameter tuning), GOIMDA consistently reaches target performance with substantially fewer labeled samples or function evaluations than uncertainty-based active learning and Gaussian-process Bayesian optimization baselines.
Related papers
- Bayesian Optimization on Networks [2.1034532187837636]
This paper studies optimization on networks modeled as metric graphs.<n>Motivated by applications where the objective function is expensive to evaluate or only available as a black box, we develop Bayesian optimization algorithms.
arXiv Detail & Related papers (2025-10-31T17:12:49Z) - Online Decision-Focused Learning [74.3205104323777]
Decision-focused learning (DFL) is an increasingly popular paradigm for training models whose predictive outputs are used in decision-making tasks.<n>In this paper, we regularize the objective function to make it different and investigate how to overcome nonoptimality function.<n>We also showcase the effectiveness of our algorithms on a knapsack experiment, where they outperform two standard benchmarks.
arXiv Detail & Related papers (2025-05-19T10:40:30Z) - Self-Evolutionary Large Language Models through Uncertainty-Enhanced Preference Optimization [9.618391485742968]
Iterative preference optimization has recently become one of the de-facto training paradigms for large language models (LLMs)
We present an uncertainty-enhanced textbfPreference textbfOptimization framework to make the LLM self-evolve with reliable feedback.
Our framework substantially alleviates the noisy problem and improves the performance of iterative preference optimization.
arXiv Detail & Related papers (2024-09-17T14:05:58Z) - Embedding generalization within the learning dynamics: An approach based-on sample path large deviation theory [0.0]
We consider an empirical risk perturbation based learning problem that exploits methods from continuous-time perspective.
We provide an estimate in the small noise limit based on the Freidlin-Wentzell theory of large deviations.
We also present a computational algorithm that solves the corresponding variational problem leading to an optimal point estimates.
arXiv Detail & Related papers (2024-08-04T23:31:35Z) - End-to-End Learning for Fair Multiobjective Optimization Under
Uncertainty [55.04219793298687]
The Predict-Then-Forecast (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z) - Score Function Gradient Estimation to Widen the Applicability of Decision-Focused Learning [17.962860438133312]
Decision-focused learning (DFL) paradigm overcomes limitation by training to directly minimize a task loss, e.g. regret.
We propose an alternative method that makes no such assumptions, it combines smoothing with score function estimation which works on any task loss.
Experiments show that it typically requires more epochs, but that it is on par with specialized methods and performs especially well for the difficult case of problems with uncertainty in the constraints, in terms of solution quality, scalability, or both.
arXiv Detail & Related papers (2023-07-11T12:32:13Z) - Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z) - Scalable Bayesian Meta-Learning through Generalized Implicit Gradients [64.21628447579772]
Implicit Bayesian meta-learning (iBaML) method broadens the scope of learnable priors, but also quantifies the associated uncertainty.
Analytical error bounds are established to demonstrate the precision and efficiency of the generalized implicit gradient over the explicit one.
arXiv Detail & Related papers (2023-03-31T02:10:30Z) - Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior:
From Theory to Practice [54.03076395748459]
A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We present a generalization bound for meta-learning, which was first derived by Rothfuss et al.
We provide a theoretical analysis and empirical case study under which conditions and to what extent these guarantees for meta-learning improve upon PAC-Bayesian per-task learning bounds.
arXiv Detail & Related papers (2022-11-14T08:51:04Z) - MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by metalearning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z) - Efficient and Differentiable Conformal Prediction with General Function
Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Approximate Bayesian Optimisation for Neural Networks [6.921210544516486]
A body of work has been done to automate machine learning algorithm to highlight the importance of model choice.
The necessity to solve the analytical tractability and the computational feasibility in a idealistic fashion enables to ensure the efficiency and the applicability.
arXiv Detail & Related papers (2021-08-27T19:03:32Z) - Learning MDPs from Features: Predict-Then-Optimize for Sequential
Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning.
Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.