Efficient first-order predictor-corrector multiple objective
optimization for fair misinformation detection
- URL: http://arxiv.org/abs/2209.07245v1
- Date: Thu, 15 Sep 2022 12:32:15 GMT
- Title: Efficient first-order predictor-corrector multiple objective
optimization for fair misinformation detection
- Authors: Eric Enouen and Katja Mathesius and Sean Wang and Arielle Carr and
Sihong Xie
- Abstract summary: Multiple-objective optimization (MOO) aims to simultaneously optimize multiple conflicting objectives and has found important applications in machine learning.
We propose a Gauss-Newton approximation that scales only linearly and requires only first-order inner products per iteration.
The innovations make predictor-corrector possible for large networks.
- Score: 5.139559672771439
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multiple-objective optimization (MOO) aims to simultaneously optimize
multiple conflicting objectives and has found important applications in machine
learning, such as minimizing classification loss and discrepancy in treating
different populations for fairness. At optimality, further optimizing one
objective will necessarily harm at least another objective, and decision-makers
need to comprehensively explore multiple optima (called Pareto front) to
pinpoint one final solution. We address the efficiency of finding the Pareto
front. First, finding the front from scratch using stochastic multi-gradient
descent (SMGD) is expensive with large neural networks and datasets. We propose
to explore the Pareto front as a manifold from a few initial optima, based on a
predictor-corrector method. Second, for each exploration step, the predictor
solves a large-scale linear system that scales quadratically in the number of
model parameters and requires one backpropagation to evaluate a second-order
Hessian-vector product per iteration of the solver. We propose a Gauss-Newton
approximation that scales only linearly and requires only first-order inner
products per iteration. This also allows for a choice between the MINRES
and conjugate gradient methods when approximately solving the linear system.
The innovations make predictor-corrector possible for large networks.
Experiments on multi-objective (fairness and accuracy) misinformation detection
tasks show that 1) the predictor-corrector method can find Pareto fronts better
than or similar to SMGD with less time; and 2) the proposed first-order method
does not harm the quality of the Pareto front identified by the second-order
method, while further reducing running time.
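Below is a minimal illustrative sketch (not the authors' implementation) of the core idea: the Gauss-Newton matrix is applied matrix-free using only first-order products, and the predictor's linear system is solved approximately with CG or MINRES. The logistic-regression surrogate and all names (make_gn_operator, predictor_step) are assumptions made for illustration.
```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg, minres

def make_gn_operator(X, w):
    # Gauss-Newton matrix G = X^T diag(p * (1 - p)) X for a logistic loss,
    # exposed as a matrix-free operator: each matvec costs two first-order
    # products (X @ v, then X^T @ u), i.e. linear in the number of parameters,
    # and never forms or differentiates a Hessian.
    p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities at the current optimum
    s = p * (1.0 - p)                  # per-example curvature weights
    n = X.shape[1]
    return LinearOperator((n, n), matvec=lambda v: X.T @ (s * (X @ v)))

def predictor_step(X, w, rhs, solver="cg", iters=50):
    # Approximately solve G d = rhs for the predictor direction d.
    # Both CG and MINRES need only G-vector products, never G itself.
    G = make_gn_operator(X, w)
    krylov = cg if solver == "cg" else minres
    d, info = krylov(G, rhs, maxiter=iters)
    return d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 30))      # toy feature matrix
    w = rng.normal(size=30)             # stand-in for a current Pareto-optimal point
    rhs = rng.normal(size=30)           # stand-in for the predictor right-hand side
    d = predictor_step(X, w, rhs, solver="minres")
    print("predictor direction norm:", np.linalg.norm(d))
```
Swapping solver="cg" for solver="minres" changes only which Krylov method is used; both rely exclusively on the matrix-free Gauss-Newton products, which is what makes the predictor step affordable for large networks.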
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO)
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - Combining Kernelized Autoencoding and Centroid Prediction for Dynamic
Multi-objective Optimization [3.431120541553662]
This paper proposes a unified paradigm, which combines kernelized autoencoding evolutionary search and centroid-based prediction.
The proposed method is compared with five state-of-the-art algorithms on a number of complex benchmark problems.
arXiv Detail & Related papers (2023-12-02T00:24:22Z) - DF2: Distribution-Free Decision-Focused Learning [53.2476224456902]
Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model error, sample average approximation error, and distribution-based parameterization of the expected objective.
We present DF2 -- the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks.
arXiv Detail & Related papers (2023-08-11T00:44:46Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Training Neural Networks in Single vs Double Precision [8.036150169408241]
Conjugate Gradient and RMSprop algorithms are optimized for mean square error.
Experiments show that single-precision can keep up with double-precision if line search finds an improvement.
For strongly nonlinear tasks, both algorithm classes find only solutions fairly poor in terms of mean square error.
arXiv Detail & Related papers (2022-09-15T11:20:53Z) - RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching an input maximizing a black-box objective function given a static dataset of input-output queries.
A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function.
Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z) - An Online Prediction Approach Based on Incremental Support Vector
Machine for Dynamic Multiobjective Optimization [19.336520152294213]
We propose a novel prediction algorithm based on incremental support vector machine (ISVM)
We treat the solving of dynamic multiobjective optimization problems (DMOPs) as an online learning process.
The proposed algorithm can effectively tackle dynamic multiobjective optimization problems.
arXiv Detail & Related papers (2021-02-24T08:51:23Z) - Efficient and Sparse Neural Networks by Pruning Weights in a
Multiobjective Learning Approach [0.0]
We propose a multiobjective perspective on the training of neural networks by treating its prediction accuracy and the network complexity as two individual objective functions.
Preliminary numerical results on exemplary convolutional neural networks confirm that large reductions in the complexity of neural networks with negligible loss of accuracy are possible.
arXiv Detail & Related papers (2020-08-31T13:28:03Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate the empirical advantages of the proposed algorithms over standard gradient methods on two popular deep learning tasks.
arXiv Detail & Related papers (2020-06-24T06:01:24Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds in theory.
Our experiments on several datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - A Hybrid Two-layer Feature Selection Method Using GeneticAlgorithm and
Elastic Net [6.85316573653194]
This paper presents a new hybrid two-layer feature selection approach that combines a wrapper and an embedded method.
The Genetic Algorithm(GA) has been adopted as a wrapper to search for the optimal subset of predictors.
A second layer is added to the proposed method to eliminate any remaining redundant/irrelevant predictors to improve the prediction accuracy.
arXiv Detail & Related papers (2020-01-30T05:01:30Z)