Efficient first-order predictor-corrector multiple objective
optimization for fair misinformation detection
- URL: http://arxiv.org/abs/2209.07245v1
- Date: Thu, 15 Sep 2022 12:32:15 GMT
- Title: Efficient first-order predictor-corrector multiple objective
optimization for fair misinformation detection
- Authors: Eric Enouen and Katja Mathesius and Sean Wang and Arielle Carr and
Sihong Xie
- Abstract summary: Multiple-objective optimization (MOO) aims to simultaneously optimize multiple conflicting objectives and has found important applications in machine learning.
We propose a Gauss-Newton approximation that scales only linearly and requires only first-order inner products per iteration.
The innovations make predictor-corrector possible for large networks.
- Score: 5.139559672771439
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multiple-objective optimization (MOO) aims to simultaneously optimize
multiple conflicting objectives and has found important applications in machine
learning, such as minimizing classification loss and discrepancy in treating
different populations for fairness. At optimality, further optimizing one
objective will necessarily harm at least another objective, and decision-makers
need to comprehensively explore multiple optima (called Pareto front) to
pinpoint one final solution. We address the efficiency of finding the Pareto
front. First, finding the front from scratch using stochastic multi-gradient
descent (SMGD) is expensive with large neural networks and datasets. We propose
to explore the Pareto front as a manifold from a few initial optima, based on a
predictor-corrector method. Second, for each exploration step, the predictor
solves a large-scale linear system that scales quadratically in the number of
model parameters and requires one backpropagation to evaluate a second-order
Hessian-vector product per iteration of the solver. We propose a Gauss-Newton
approximation that scales only linearly and requires only first-order inner
products per iteration. This also allows for a choice between the MINRES
and conjugate gradient methods when approximately solving the linear system.
The innovations make predictor-corrector possible for large networks.
Experiments on multi-objective (fairness and accuracy) misinformation detection
tasks show that 1) the predictor-corrector method can find Pareto fronts better
than or similar to SMGD with less time; and 2) the proposed first-order method
does not harm the quality of the Pareto front identified by the second-order
method, while further reducing running time.
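Below is a minimal illustrative sketch (not the authors' implementation) of the core idea: the Gauss-Newton matrix is applied matrix-free using only first-order products, and the predictor's linear system is solved approximately with CG or MINRES. The logistic-regression surrogate and all names (make_gn_operator, predictor_step) are assumptions made for illustration.
```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg, minres

def make_gn_operator(X, w):
    # Gauss-Newton matrix G = X^T diag(p * (1 - p)) X for a logistic loss,
    # exposed as a matrix-free operator: each matvec costs two first-order
    # products (X @ v, then X^T @ u), i.e. linear in the number of parameters,
    # and never forms or differentiates a Hessian.
    p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities at the current optimum
    s = p * (1.0 - p)                  # per-example curvature weights
    n = X.shape[1]
    return LinearOperator((n, n), matvec=lambda v: X.T @ (s * (X @ v)))

def predictor_step(X, w, rhs, solver="cg", iters=50):
    # Approximately solve G d = rhs for the predictor direction d.
    # Both CG and MINRES need only G-vector products, never G itself.
    G = make_gn_operator(X, w)
    krylov = cg if solver == "cg" else minres
    d, info = krylov(G, rhs, maxiter=iters)
    return d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 30))      # toy feature matrix
    w = rng.normal(size=30)             # stand-in for a current Pareto-optimal point
    rhs = rng.normal(size=30)           # stand-in for the predictor right-hand side
    d = predictor_step(X, w, rhs, solver="minres")
    print("predictor direction norm:", np.linalg.norm(d))
```
Swapping solver="cg" for solver="minres" changes only which Krylov method is used; both rely exclusively on the matrix-free Gauss-Newton products, which is what makes the predictor step affordable for large networks.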
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO)
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - Combining Kernelized Autoencoding and Centroid Prediction for Dynamic
Multi-objective Optimization [3.431120541553662]
This paper proposes a unified paradigm, which combines kernelized autoencoding evolutionary search and centroid-based prediction.
The proposed method is compared with five state-of-the-art algorithms on a number of complex benchmark problems.
arXiv Detail & Related papers (2023-12-02T00:24:22Z) - DF2: Distribution-Free Decision-Focused Learning [53.2476224456902]
Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model error, sample average approximation error, and distribution-based parameterization of the expected objective.
We present DF2 -- the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks.
arXiv Detail & Related papers (2023-08-11T00:44:46Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Training Neural Networks in Single vs Double Precision [8.036150169408241]
Conjugate Gradient and RMSprop algorithms are optimized for mean square error.
Experiments show that single-precision can keep up with double-precision if line search finds an improvement.
For strongly nonlinear tasks, both algorithm classes find only solutions fairly poor in terms of mean square error.
arXiv Detail & Related papers (2022-09-15T11:20:53Z) - RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching an input maximizing a black-box objective function given a static dataset of input-output queries.
A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function.
Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z) - An Online Prediction Approach Based on Incremental Support Vector
Machine for Dynamic Multiobjective Optimization [19.336520152294213]
We propose a novel prediction algorithm based on incremental support vector machine (ISVM)
We treat the solving of dynamic multiobjective optimization problems (DMOPs) as an online learning process.
The proposed algorithm can effectively tackle dynamic multiobjective optimization problems.
arXiv Detail & Related papers (2021-02-24T08:51:23Z) - Efficient and Sparse Neural Networks by Pruning Weights in a
Multiobjective Learning Approach [0.0]
We propose a multiobjective perspective on the training of neural networks by treating its prediction accuracy and the network complexity as two individual objective functions.
Preliminary numerical results on exemplary convolutional neural networks confirm that large reductions in the complexity of neural networks with negligible loss of accuracy are possible.
arXiv Detail & Related papers (2020-08-31T13:28:03Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate the empirical advantages of the proposed algorithms over standard gradient methods on two popular deep learning tasks.
arXiv Detail & Related papers (2020-06-24T06:01:24Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds in theory.
Our experiments on several datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - A Hybrid Two-layer Feature Selection Method Using GeneticAlgorithm and
Elastic Net [6.85316573653194]
This paper presents a new hybrid two-layer feature selection approach that combines a wrapper and an embedded method.
The Genetic Algorithm(GA) has been adopted as a wrapper to search for the optimal subset of predictors.
A second layer is added to the proposed method to eliminate any remaining redundant/irrelevant predictors to improve the prediction accuracy.
arXiv Detail & Related papers (2020-01-30T05:01:30Z)