On the Treatment of Optimization Problems with L1 Penalty Terms via
Multiobjective Continuation
- URL: http://arxiv.org/abs/2012.07483v1
- Date: Mon, 14 Dec 2020 13:00:50 GMT
- Title: On the Treatment of Optimization Problems with L1 Penalty Terms via
Multiobjective Continuation
- Authors: Katharina Bieker, Bennet Gebken, Sebastian Peitz
- Abstract summary: We present a novel algorithm that allows us to gain detailed insight into the effects of sparsity in linear and nonlinear optimization.
Our method can be seen as a generalization of well-known homotopy methods for linear regression problems to the nonlinear case.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel algorithm that allows us to gain detailed insight into the
effects of sparsity in linear and nonlinear optimization, which is of great
importance in many scientific areas such as image and signal processing,
medical imaging, compressed sensing, and machine learning (e.g., for the
training of neural networks). Sparsity is an important feature to ensure
robustness against noisy data, but also to find models that are interpretable
and easy to analyze due to the small number of relevant terms. It is common
practice to enforce sparsity by adding the $\ell_1$-norm as a weighted penalty
term. In order to gain a better understanding and to allow for an informed
model selection, we directly solve the corresponding multiobjective
optimization problem (MOP) that arises when we minimize the main objective and
the $\ell_1$-norm simultaneously. As this MOP is in general non-convex for
nonlinear objectives, the weighting method will fail to provide all optimal
compromises. To avoid this issue, we present a continuation method which is
specifically tailored to MOPs with two objective functions, one of which is the
$\ell_1$-norm. Our method can be seen as a generalization of well-known
homotopy methods for linear regression problems to the nonlinear case. Several
numerical examples - including neural network training - demonstrate our
theoretical findings and the additional insight that can be gained by this
multiobjective approach.
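
For a convex problem such as linear regression, the Pareto front of this MOP can already be traced by the weighting method: minimize $f(w) + \lambda \|w\|_1$ for a decreasing sequence of weights $\lambda$, each of which yields one optimal compromise. Below is a minimal sketch of this baseline (plain NumPy on synthetic data; the function names and parameters are illustrative, not the paper's implementation):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (component-wise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, n_iter=500):
    """Minimize 0.5*||Xw - y||^2 + lam*||w||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(X, 2) ** 2           # Lipschitz constant of the smooth part
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - grad / L, lam / L)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]                # sparse ground truth
y = X @ w_true + 0.1 * rng.standard_normal(50)

# Weighting method: each weight lam yields one optimal compromise of the MOP.
for lam in np.geomspace(50.0, 0.05, 8):
    w = ista(X, y, lam)
    f1 = 0.5 * np.sum((X @ w - y) ** 2)      # main objective
    f2 = np.sum(np.abs(w))                   # l1-norm objective
    print(f"lam={lam:7.2f}  f1={f1:9.3f}  ||w||_1={f2:6.3f}  nnz={np.count_nonzero(w)}")
```

For convex problems like this one, the sweep recovers the whole front, and the classical homotopy methods exploit that the solution path is piecewise linear in $\lambda$. For non-convex objectives the weighting method can miss optimal compromises, which is exactly the gap the continuation method of the paper addresses.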
Related papers
- Scalable Higher-Order Tensor Product Spline Models [0.0]
We propose a new approach using a factorization method to derive a highly scalable higher-order tensor product spline model.
Our method allows for the incorporation of all (higher-order) interactions of non-linear feature effects while having computational costs proportional to a model without interactions.
arXiv Detail & Related papers (2024-02-02T01:18:48Z)
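The blow-up such a factorization addresses is easy to see: the basis for an interaction of two features is the row-wise Kronecker product of the marginal spline bases, so the coefficient count multiplies with every feature added to the interaction. A hedged sketch of the plain (unfactorized) tensor-product construction (NumPy, truncated-power basis; illustrative only, not the method of the paper):

```python
import numpy as np

def spline_basis(x, n_knots=8, degree=3):
    """Truncated-power spline basis: polynomial terms plus one term per knot."""
    knots = np.quantile(x, np.linspace(0.1, 0.9, n_knots))
    poly = np.vander(x, degree + 1, increasing=True)              # 1, x, x^2, x^3
    trunc = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    return np.hstack([poly, trunc])

def row_kronecker(A, B):
    """Row-wise Kronecker product: the tensor-product interaction basis."""
    return (A[:, :, None] * B[:, None, :]).reshape(A.shape[0], -1)

rng = np.random.default_rng(1)
x1, x2 = rng.uniform(size=200), rng.uniform(size=200)
B1, B2 = spline_basis(x1), spline_basis(x2)
B12 = row_kronecker(B1, B2)           # bivariate tensor-product basis
print(B1.shape, B2.shape, B12.shape)  # (200, 12) (200, 12) (200, 144)
```

The number of columns multiplies with each interacting feature (12 x 12 = 144 here), which is the growth a scalable factorization has to avoid.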
- Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-CSP, Maximum-Weight-Bipartite-Matching, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
- A multiobjective continuation method to compute the regularization path of deep neural networks [1.3654846342364308]
Sparsity is a highly desired feature in deep neural networks (DNNs) since it ensures numerical efficiency, improves the interpretability of models, and increases robustness.
We present an algorithm that efficiently approximates the entire Pareto front for the above-mentioned objectives, even for high-dimensional networks with millions of parameters.
We demonstrate that knowledge of the regularization path allows for a well-generalizing network parametrization.
arXiv Detail & Related papers (2023-08-23T10:08:52Z)
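A rough illustration of the regularization-path idea for a nonlinear model: sweep the $\ell_1$ weight from large to small and warm-start every solve from the previous solution, so the path is traced from the sparsest network outward. This sketch (NumPy, one hidden layer, proximal gradient steps on the first-layer weights) is a simplified stand-in, not the continuation method of the paper above:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 5))
y = np.tanh(X[:, 0]) - 0.5 * X[:, 1]            # only 2 of 5 inputs are relevant

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def train(W1, w2, lam, lr=0.1, n_iter=3000):
    """Proximal gradient on 0.5*mean(err^2) + lam*||W1||_1 (w2 unpenalized)."""
    n = X.shape[0]
    for _ in range(n_iter):
        H = np.tanh(X @ W1)                      # hidden layer
        err = H @ w2 - y
        g2 = H.T @ err / n                       # gradient w.r.t. output weights
        g1 = X.T @ (np.outer(err, w2) * (1 - H**2)) / n  # backprop through tanh
        w2 = w2 - lr * g2                        # plain gradient step
        W1 = soft_threshold(W1 - lr * g1, lr * lam)      # prox step enforces sparsity
    return W1, w2

# Continuation: sweep the l1 weight downward, warm-starting from the last solution.
W1 = 0.1 * rng.standard_normal((5, 8))
w2 = 0.1 * rng.standard_normal(8)
for lam in np.geomspace(1.0, 1e-3, 6):
    W1, w2 = train(W1, w2, lam)
    loss = 0.5 * np.mean((np.tanh(X @ W1) @ w2 - y) ** 2)
    print(f"lam={lam:8.4f}  loss={loss:.4f}  active inputs={np.sum(np.abs(W1).sum(1) > 0)}/5")
```

At large $\lambda$ the first layer is fully pruned; as $\lambda$ decreases, inputs enter one after another, which is the trade-off the regularization path makes explicit.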
- Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z)
- NTopo: Mesh-free Topology Optimization using Implicit Neural Representations [35.07884509198916]
We present a novel machine learning approach to tackle topology optimization problems.
We use multilayer perceptrons (MLPs) to parameterize both density and displacement fields.
As we show through our experiments, a major benefit of our approach is that it enables self-supervised learning of continuous solution spaces.
arXiv Detail & Related papers (2021-02-22T05:25:22Z)
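The core parameterization in such mesh-free approaches is a coordinate network: an MLP maps a spatial location to a field value, so the design can be queried at arbitrary resolution. A minimal hedged sketch (NumPy, random untrained weights; sizes and names are illustrative, not NTopo's architecture):

```python
import numpy as np

rng = np.random.default_rng(3)

# A tiny coordinate MLP: (x, y) -> density in (0, 1). Weights are illustrative.
W1, b1 = 0.5 * rng.standard_normal((2, 32)), np.zeros(32)
W2, b2 = 0.5 * rng.standard_normal((32, 1)), np.zeros(1)

def density(coords):
    """Evaluate the neural density field at an (n, 2) array of coordinates."""
    h = np.tanh(coords @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid keeps density in (0, 1)

# Mesh-free evaluation: query the field anywhere, at any resolution.
grid = np.stack(np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64)), -1)
rho = density(grid.reshape(-1, 2)).reshape(64, 64)
print(rho.shape, rho.min(), rho.max())
```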
- Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points.
This problem setting emerges in many domains where function evaluation is a complex and expensive process.
We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z)
- Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single-policy MORL, which learns an optimal policy given a preference over the objectives.
Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process.
We propose a new algorithm called model-based envelop value iteration (EVI), which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z)
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
- Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z)