Hyperparameter-free deep active learning for regression problems via query synthesis
- URL: http://arxiv.org/abs/2201.12632v1
- Date: Sat, 29 Jan 2022 18:41:08 GMT
- Title: Hyperparameter-free deep active learning for regression problems via query synthesis
- Authors: Simiao Ren, Yang Deng, Willie J. Padilla and Jordan Malof
- Abstract summary: We propose the first DAL query-synthesis approach for regression problems.
We use the recently-proposed neural-adjoint (NA) solver to efficiently find points in the continuous input domain.
We find that NA-QBC achieves better average performance than random sampling on every benchmark problem.
- Score: 5.572747615014008
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In the past decade, deep active learning (DAL) has heavily focused upon
classification problems, or problems that have some 'valid' data manifolds,
such as natural languages or images. As a result, existing DAL methods are not
applicable to a wide variety of important problems -- such as many scientific
computing problems -- that involve regression on relatively unstructured input
spaces. In this work we propose the first DAL query-synthesis approach for
regression problems. We frame query synthesis as an inverse problem and use the
recently-proposed neural-adjoint (NA) solver to efficiently find points in the
continuous input domain that optimize the query-by-committee (QBC) criterion.
Crucially, the resulting NA-QBC approach removes the one sensitive
hyperparameter of the classical QBC active learning approach, the "pool size",
making NA-QBC effectively hyperparameter-free. This is significant because DAL
methods can be detrimental, even compared to random sampling, if the wrong
hyperparameters are chosen. We evaluate Random, QBC and NA-QBC sampling
strategies on four regression problems, including two contemporary scientific
computing problems. We find that NA-QBC achieves better average performance
than random sampling on every benchmark problem, while QBC can be detrimental
if the wrong hyperparameters are chosen.
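The abstract describes the mechanics concretely enough to sketch: a committee of regression networks defines a query-by-committee disagreement score, and instead of ranking a finite candidate pool, the neural-adjoint solver optimizes that score directly over the continuous input domain by backpropagating to the input. Below is a minimal illustrative sketch, assuming PyTorch, prediction variance as the disagreement measure, and a simple box-constrained input domain; all names are placeholders rather than the authors' code.

```python
import torch

def synthesize_query(committee, x_init, lo=-1.0, hi=1.0, steps=300, lr=1e-2):
    """Gradient-ascend the QBC disagreement over the continuous input (one restart)."""
    x = x_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        preds = torch.stack([f(x) for f in committee])  # (n_models, n_outputs)
        disagreement = preds.var(dim=0).mean()           # QBC criterion: committee variance
        (-disagreement).backward()                       # maximize disagreement => minimize its negative
        opt.step()
        with torch.no_grad():
            x.clamp_(lo, hi)                             # keep the query inside the input domain
    return x.detach()
```

In practice such input-space searches are typically launched from many random initializations, the best-scoring candidates are queried (labeled by the simulator or experiment), and the committee is retrained before the next round; the sketch shows a single restart.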
Related papers
- Enhancing Hypergradients Estimation: A Study of Preconditioning and
Reparameterization [49.73341101297818]
Bilevel optimization aims to optimize an outer objective function that depends on the solution to an inner optimization problem.
The conventional method to compute the so-called hypergradient of the outer problem is to use the Implicit Function Theorem (IFT).
We study the error of the IFT method and analyze two strategies to reduce this error.
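For reference, the IFT hypergradient this summary refers to has a standard closed form; in generic bilevel notation (not necessarily the paper's), with outer objective $F(\lambda) = f(\lambda, w^*(\lambda))$ and inner solution $w^*(\lambda) = \arg\min_w g(\lambda, w)$:

$$\nabla_\lambda F(\lambda) = \nabla_\lambda f(\lambda, w^*) - \nabla^2_{\lambda w} g(\lambda, w^*)\,\big[\nabla^2_{ww} g(\lambda, w^*)\big]^{-1}\,\nabla_w f(\lambda, w^*).$$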
arXiv Detail & Related papers (2024-02-26T17:09:18Z)
- Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
- An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
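As a side note on the fixed-point formulation mentioned above, the generic deep-equilibrium pattern (notation here is illustrative, not taken from the paper) is to run a learned update map to its fixed point and differentiate through the equilibrium implicitly:

$$z^{(k+1)} = T_\theta\big(z^{(k)}; y\big), \qquad z^* = T_\theta(z^*; y),$$

where $y$ is the observed degraded image and $z^*$ is the recovered estimate.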
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- OptBA: Optimizing Hyperparameters with the Bees Algorithm for Improved Medical Text Classification [0.0]
We propose OptBA to fine-tune the hyperparameters of deep learning models by leveraging the Bees Algorithm.
Experimental results demonstrate a noteworthy improvement in accuracy of approximately 1.4%.
arXiv Detail & Related papers (2023-03-14T16:04:13Z)
- Contrastive Neural Ratio Estimation for Simulation-based Inference [15.354874711988662]
Likelihood-to-evidence ratio estimation is usually cast as either a binary (NRE-A) or a multiclass (NRE-B) classification task.
In contrast to the binary classification framework, the current formulation of the multiclass version has an intrinsic and unknown bias term.
We propose a multiclass framework free from the bias inherent to NRE-B at optimum, leaving us in the position to run diagnostics that practitioners depend on.
arXiv Detail & Related papers (2022-10-11T00:12:51Z)
- A Globally Convergent Gradient-based Bilevel Hyperparameter Optimization Method [0.0]
We propose a gradient-based bilevel method for solving the hyperparameter optimization problem.
We show that the proposed method converges with lower computation and leads to models that generalize better on the testing set.
arXiv Detail & Related papers (2022-08-25T14:25:16Z)
- A Hypergradient Approach to Robust Regression without Correspondence [85.49775273716503]
We consider a variant of regression problem, where the correspondence between input and output data is not available.
Most existing methods are only applicable when the sample size is small.
We propose a new computational framework -- ROBOT -- for the shuffled regression problem.
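The shuffled (correspondence-free) regression problem mentioned here is commonly written as a joint optimization over the model weights and an unknown permutation of the responses; in generic notation (not necessarily the paper's):

$$\min_{w,\;\Pi \in \mathcal{P}_n} \; \big\| \Pi y - X w \big\|_2^2,$$

where $\mathcal{P}_n$ is the set of $n \times n$ permutation matrices, so the labels $y$ must be re-matched to the rows of $X$ while fitting $w$.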
arXiv Detail & Related papers (2020-11-30T21:47:38Z)
- An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization [48.5614138038673]
We propose an efficient and robust bandit-based algorithm called Sub-Sampling (SS) for hyperparameter search evaluation.
We also develop a novel hyperparameter optimization algorithm called BOSS.
Empirical studies validate our theoretical arguments of SS and demonstrate the superior performance of BOSS on a number of applications.
arXiv Detail & Related papers (2020-07-11T03:15:21Z)
- Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z)
- Hardness of Random Optimization Problems for Boolean Circuits, Low-Degree Polynomials, and Langevin Dynamics [78.46689176407936]
We show that families of algorithms fail to produce nearly optimal solutions with high probability.
For the case of Boolean circuits, our results improve the state-of-the-art bounds known in circuit complexity theory.
arXiv Detail & Related papers (2020-04-25T05:45:59Z)
- Weighted Random Search for Hyperparameter Optimization [0.0]
We introduce an improved version of Random Search (RS), used here for hyperparameter optimization of machine learning algorithms.
Unlike standard RS, we generate new values for each hyperparameter with a probability of change (see the sketch below).
Within the same computational budget, our method yields better results than the standard RS.
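A minimal illustrative sketch of that "probability of change" idea, under the assumption that each hyperparameter is resampled with some probability and otherwise keeps its value from the best configuration found so far (all names and probability settings are placeholders, not the paper's implementation):

```python
import random

def weighted_random_search(sample, objective, p_change, n_trials=100):
    """sample: name -> zero-argument callable drawing a random value;
    p_change: name -> probability of resampling that hyperparameter."""
    best_cfg = {k: draw() for k, draw in sample.items()}
    best_score = objective(best_cfg)
    for _ in range(n_trials - 1):
        cfg = {k: sample[k]() if random.random() < p_change[k] else best_cfg[k]
               for k in sample}
        score = objective(cfg)
        if score > best_score:          # assuming higher is better
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Example usage with two toy hyperparameters and a stand-in objective:
# best, score = weighted_random_search(
#     sample={"lr": lambda: 10 ** random.uniform(-4, -1),
#             "width": lambda: random.choice([64, 128, 256])},
#     objective=lambda cfg: -abs(cfg["lr"] - 0.01),
#     p_change={"lr": 0.7, "width": 0.4},
# )
```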
arXiv Detail & Related papers (2020-04-03T15:41:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.