On the Treatment of Optimization Problems with L1 Penalty Terms via
Multiobjective Continuation
- URL: http://arxiv.org/abs/2012.07483v1
- Date: Mon, 14 Dec 2020 13:00:50 GMT
- Title: On the Treatment of Optimization Problems with L1 Penalty Terms via
Multiobjective Continuation
- Authors: Katharina Bieker, Bennet Gebken, Sebastian Peitz
- Abstract summary: We present a novel algorithm that allows us to gain detailed insight into the effects of sparsity in linear and nonlinear optimization.
Our method can be seen as a generalization of well-known homotopy methods for linear regression problems to the nonlinear case.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel algorithm that allows us to gain detailed insight into the
effects of sparsity in linear and nonlinear optimization, which is of great
importance in many scientific areas such as image and signal processing,
medical imaging, compressed sensing, and machine learning (e.g., for the
training of neural networks). Sparsity is an important feature to ensure
robustness against noisy data, but also to find models that are interpretable
and easy to analyze due to the small number of relevant terms. It is common
practice to enforce sparsity by adding the $\ell_1$-norm as a weighted penalty
term. In order to gain a better understanding and to allow for an informed
model selection, we directly solve the corresponding multiobjective
optimization problem (MOP) that arises when we minimize the main objective and
the $\ell_1$-norm simultaneously. As this MOP is in general non-convex for
nonlinear objectives, the weighting method will fail to provide all optimal
compromises. To avoid this issue, we present a continuation method which is
specifically tailored to MOPs with two objective functions, one of which is the
$\ell_1$-norm. Our method can be seen as a generalization of well-known
homotopy methods for linear regression problems to the nonlinear case. Several
numerical examples - including neural network training - demonstrate our
theoretical findings and the additional insight that can be gained by this
multiobjective approach.
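
For a convex problem such as linear regression, the Pareto front of this MOP can already be traced by the weighting method: minimize $f(w) + \lambda \|w\|_1$ for a decreasing sequence of weights $\lambda$, each of which yields one optimal compromise. Below is a minimal sketch of this baseline (plain NumPy on synthetic data; the function names and parameters are illustrative, not the paper's implementation):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (component-wise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, n_iter=500):
    """Minimize 0.5*||Xw - y||^2 + lam*||w||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(X, 2) ** 2           # Lipschitz constant of the smooth part
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - grad / L, lam / L)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]                # sparse ground truth
y = X @ w_true + 0.1 * rng.standard_normal(50)

# Weighting method: each weight lam yields one optimal compromise of the MOP.
for lam in np.geomspace(50.0, 0.05, 8):
    w = ista(X, y, lam)
    f1 = 0.5 * np.sum((X @ w - y) ** 2)      # main objective
    f2 = np.sum(np.abs(w))                   # l1-norm objective
    print(f"lam={lam:7.2f}  f1={f1:9.3f}  ||w||_1={f2:6.3f}  nnz={np.count_nonzero(w)}")
```

For convex problems like this one, the sweep recovers the whole front, and the classical homotopy methods exploit that the solution path is piecewise linear in $\lambda$. For non-convex objectives the weighting method can miss optimal compromises, which is exactly the gap the continuation method of the paper addresses.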
Related papers
- Scalable Higher-Order Tensor Product Spline Models [0.0]
We propose a new approach using a factorization method to derive a highly scalable higher-order tensor product spline model.
Our method allows for the incorporation of all (higher-order) interactions of non-linear feature effects while having computational costs proportional to a model without interactions.
arXiv Detail & Related papers (2024-02-02T01:18:48Z)
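The blow-up such a factorization addresses is easy to see: the basis for an interaction of two features is the row-wise Kronecker product of the marginal spline bases, so the coefficient count multiplies with every feature added to the interaction. A hedged sketch of the plain (unfactorized) tensor-product construction (NumPy, truncated-power basis; illustrative only, not the method of the paper):

```python
import numpy as np

def spline_basis(x, n_knots=8, degree=3):
    """Truncated-power spline basis: polynomial terms plus one term per knot."""
    knots = np.quantile(x, np.linspace(0.1, 0.9, n_knots))
    poly = np.vander(x, degree + 1, increasing=True)              # 1, x, x^2, x^3
    trunc = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    return np.hstack([poly, trunc])

def row_kronecker(A, B):
    """Row-wise Kronecker product: the tensor-product interaction basis."""
    return (A[:, :, None] * B[:, None, :]).reshape(A.shape[0], -1)

rng = np.random.default_rng(1)
x1, x2 = rng.uniform(size=200), rng.uniform(size=200)
B1, B2 = spline_basis(x1), spline_basis(x2)
B12 = row_kronecker(B1, B2)           # bivariate tensor-product basis
print(B1.shape, B2.shape, B12.shape)  # (200, 12) (200, 12) (200, 144)
```

The number of columns multiplies with each interacting feature (12 x 12 = 144 here), which is the growth a scalable factorization has to avoid.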
- Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-CSP, Maximum-Weight-Bipartite-Matching, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
- A multiobjective continuation method to compute the regularization path of deep neural networks [1.3654846342364308]
Sparsity is a highly desired feature in deep neural networks (DNNs) since it ensures numerical efficiency, improves the interpretability of models, and increases robustness.
We present an algorithm that efficiently approximates the entire Pareto front for the above-mentioned objectives, even for high-dimensional networks with millions of parameters.
We demonstrate that knowledge of the regularization path allows for a well-generalizing network parametrization.
arXiv Detail & Related papers (2023-08-23T10:08:52Z)
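A rough illustration of the regularization-path idea for a nonlinear model: sweep the $\ell_1$ weight from large to small and warm-start every solve from the previous solution, so the path is traced from the sparsest network outward. This sketch (NumPy, one hidden layer, proximal gradient steps on the first-layer weights) is a simplified stand-in, not the continuation method of the paper above:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 5))
y = np.tanh(X[:, 0]) - 0.5 * X[:, 1]            # only 2 of 5 inputs are relevant

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def train(W1, w2, lam, lr=0.1, n_iter=3000):
    """Proximal gradient on 0.5*mean(err^2) + lam*||W1||_1 (w2 unpenalized)."""
    n = X.shape[0]
    for _ in range(n_iter):
        H = np.tanh(X @ W1)                      # hidden layer
        err = H @ w2 - y
        g2 = H.T @ err / n                       # gradient w.r.t. output weights
        g1 = X.T @ (np.outer(err, w2) * (1 - H**2)) / n  # backprop through tanh
        w2 = w2 - lr * g2                        # plain gradient step
        W1 = soft_threshold(W1 - lr * g1, lr * lam)      # prox step enforces sparsity
    return W1, w2

# Continuation: sweep the l1 weight downward, warm-starting from the last solution.
W1 = 0.1 * rng.standard_normal((5, 8))
w2 = 0.1 * rng.standard_normal(8)
for lam in np.geomspace(1.0, 1e-3, 6):
    W1, w2 = train(W1, w2, lam)
    loss = 0.5 * np.mean((np.tanh(X @ W1) @ w2 - y) ** 2)
    print(f"lam={lam:8.4f}  loss={loss:.4f}  active inputs={np.sum(np.abs(W1).sum(1) > 0)}/5")
```

At large $\lambda$ the first layer is fully pruned; as $\lambda$ decreases, inputs enter one after another, which is the trade-off the regularization path makes explicit.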
- Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z)
- NTopo: Mesh-free Topology Optimization using Implicit Neural Representations [35.07884509198916]
We present a novel machine learning approach to tackle topology optimization problems.
We use multilayer perceptrons (MLPs) to parameterize both density and displacement fields.
As we show through our experiments, a major benefit of our approach is that it enables self-supervised learning of continuous solution spaces.
arXiv Detail & Related papers (2021-02-22T05:25:22Z)
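The core parameterization in such mesh-free approaches is a coordinate network: an MLP maps a spatial location to a field value, so the design can be queried at arbitrary resolution. A minimal hedged sketch (NumPy, random untrained weights; sizes and names are illustrative, not NTopo's architecture):

```python
import numpy as np

rng = np.random.default_rng(3)

# A tiny coordinate MLP: (x, y) -> density in (0, 1). Weights are illustrative.
W1, b1 = 0.5 * rng.standard_normal((2, 32)), np.zeros(32)
W2, b2 = 0.5 * rng.standard_normal((32, 1)), np.zeros(1)

def density(coords):
    """Evaluate the neural density field at an (n, 2) array of coordinates."""
    h = np.tanh(coords @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid keeps density in (0, 1)

# Mesh-free evaluation: query the field anywhere, at any resolution.
grid = np.stack(np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64)), -1)
rho = density(grid.reshape(-1, 2)).reshape(64, 64)
print(rho.shape, rho.min(), rho.max())
```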
- Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points.
This problem setting emerges in many domains where function evaluation is a complex and expensive process.
We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z)
- Provable Multi-Objective Reinforcement Learning with Generative Models [98.19879408649848]
We study the problem of single-policy MORL, which learns an optimal policy given a preference over the objectives.
Existing methods require strong assumptions such as exact knowledge of the multi-objective decision process.
We propose a new algorithm called model-based envelop value iteration (EVI), which generalizes the enveloped multi-objective $Q$-learning algorithm.
arXiv Detail & Related papers (2020-11-19T22:35:31Z)
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
- Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z)