Regularization Penalty Optimization for Addressing Data Quality Variance in OoD Algorithms
- URL: http://arxiv.org/abs/2206.05749v1
- Date: Sun, 12 Jun 2022 14:36:04 GMT
- Title: Regularization Penalty Optimization for Addressing Data Quality Variance in OoD Algorithms
- Authors: Runpeng Yu, Hong Zhu, Kaican Li, Lanqing Hong, Rui Zhang, Nanyang Ye, Shao-Lun Huang, Xiuqiang He
- Abstract summary: We theoretically reveal the relationship between training data quality and algorithm performance.
A novel algorithm is proposed to alleviate the influence of low-quality data at both the sample level and the domain level.
- Score: 45.02465532852302
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the poor generalization performance of traditional empirical risk
minimization (ERM) in the case of distributional shift, Out-of-Distribution
(OoD) generalization algorithms have received increasing attention. However, existing OoD generalization algorithms overlook the large variance in the quality of training data, which significantly compromises the accuracy of these methods.
In this paper, we theoretically reveal the relationship between training data
quality and algorithm performance and analyze the optimal regularization scheme
for Lipschitz regularized invariant risk minimization. A novel algorithm is
proposed based on the theoretical results to alleviate the influence of
low-quality data at both the sample level and the domain level. The experiments
on both the regression and classification benchmarks validate the effectiveness
of our method with statistical significance.
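The abstract's sample- and domain-level reweighting idea can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function names, the IRMv1-style squared-gradient penalty, and the fixed weight vectors are all assumptions; the paper's contribution is precisely how to optimize these weights and the penalty.

```python
import numpy as np

def irm_penalty(w, X, y):
    # IRMv1-style penalty: squared gradient of the environment risk with
    # respect to a scalar scale on the predictor, evaluated at scale = 1.
    residual = X @ w - y
    grad_at_scale_1 = 2.0 * np.mean(residual * (X @ w))
    return grad_at_scale_1 ** 2

def weighted_ood_objective(w, envs, domain_weights, sample_weights, lam=1.0):
    # envs: list of (X, y) arrays, one pair per training domain.
    # domain_weights / sample_weights down-weight low-quality domains and
    # samples; choosing them well is the problem the paper studies.
    total = 0.0
    for (X, y), dq, sq in zip(envs, domain_weights, sample_weights):
        per_sample = (X @ w - y) ** 2
        erm = np.mean(sq * per_sample)                     # sample level
        total += dq * (erm + lam * irm_penalty(w, X, y))   # domain level
    return total / len(envs)
```

Setting a domain's weight toward zero removes its (possibly noisy) risk and penalty from the objective, which is the intended effect on low-quality domains.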
Related papers
- Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm [80.94861441583275]
We investigate the generalization bound of the decentralized stochastic gradient descent ascent (D-SGDA) algorithm.
Our results analyze the impact of different top factors on the generalization of D-SGDA.
We also balance stability against generalization to obtain the optimal trade-off in the convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z)
- Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates [28.700663352789395]
We provide a unified convergence analysis of communication-efficient local training methods for distributed variational inequality problems (VIPs)
Our approach is based on a general key assumption on the estimates that allows us to propose and analyze several novel local training algorithms.
We present the first local descent-ascent algorithms with provably improved communication complexity for solving distributed variational inequalities on heterogeneous data.
arXiv Detail & Related papers (2023-06-08T10:58:46Z)
- Best Subset Selection in Reduced Rank Regression [1.4699455652461724]
We show that our algorithm achieves reduced-rank estimation with high probability.
Numerical studies and an application to cancer studies demonstrate its effectiveness and scalability.
arXiv Detail & Related papers (2022-11-29T02:51:15Z)
- GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z)
- Distribution Learning Based on Evolutionary Algorithm Assisted Deep Neural Networks for Imbalanced Image Classification [4.037464966510278]
We propose an iMproved Estimation Distribution Algorithm based Latent featUre Distribution Evolution (MEDA_LUDE) algorithm.
Experiments on benchmark imbalanced datasets validate the effectiveness of our proposed algorithm.
The MEDA_LUDE algorithm is also applied to the industrial field and successfully alleviates the imbalanced issue in fabric defect classification.
arXiv Detail & Related papers (2022-07-26T08:51:47Z)
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
Within this framework, our analysis shows that these algorithms match the computational complexity of methods that have access to an unbiased estimate of the gradient.
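The warm-start plus implicit-differentiation idea in this entry can be sketched on a toy bilevel problem. The ridge inner problem, step sizes, and function names below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def inner_steps(w, X, y, lam, lr=0.01, k=100):
    # Approximately solve the inner (ridge) problem by gradient descent,
    # warm-started at w -- typically the previous outer iterate's solution.
    for _ in range(k):
        w = w - lr * (X.T @ (X @ w - y) + lam * w)
    return w

def hypergradient(w, X, y, lam, Xv, yv):
    # Implicit differentiation of the inner optimality condition
    # (X^T X + lam I) w = X^T y gives dw/dlam = -(X^T X + lam I)^{-1} w;
    # chain it with the gradient of the outer (validation) loss.
    H = X.T @ X + lam * np.eye(X.shape[1])
    dw_dlam = -np.linalg.solve(H, w)
    outer_grad_w = Xv.T @ (Xv @ w - yv)
    return float(outer_grad_w @ dw_dlam)
```

Warm-starting means each outer step needs only a few inner iterations, which is the amortization the summary refers to.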
arXiv Detail & Related papers (2021-11-29T15:10:09Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of a stochastic optimization algorithm can be bounded in terms of the complexity of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- Learning Prediction Intervals for Regression: Generalization and Calibration [12.576284277353606]
We study the generation of prediction intervals in regression for uncertainty quantification.
We use a general learning theory to characterize the optimality-feasibility tradeoff that encompasses Lipschitz continuity and VC-subgraph classes.
We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.
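Interval generation and calibration of the kind this entry describes can be illustrated with a generic split-conformal-style sketch; this is not the paper's algorithm, and the function names and the finite-sample quantile adjustment are assumptions:

```python
import numpy as np

def calibrate_width(cal_residuals, alpha=0.1):
    # Width chosen as a finite-sample-adjusted (1 - alpha) quantile of
    # absolute residuals on a held-out calibration set.
    n = len(cal_residuals)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return float(np.quantile(np.abs(cal_residuals), level))

def prediction_interval(point_pred, width):
    # Symmetric interval around the point prediction.
    return point_pred - width, point_pred + width
```

Widening the interval improves coverage (feasibility) at the cost of informativeness, which is the optimality-feasibility trade-off the summary mentions.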
arXiv Detail & Related papers (2021-02-26T17:55:30Z)
- Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory [11.840747467007963]
We study problem-dependent rates that scale near-optimally with the variance, the effective loss errors, or the norms evaluated at the "best gradient hypothesis".
We introduce a principled framework dubbed "uniform localized convergence".
We show that our framework resolves several fundamental limitations of existing uniform convergence and localization analysis approaches.
arXiv Detail & Related papers (2020-11-12T04:07:29Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
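The feature-averaging claim in the last entry follows from Jensen's inequality for convex losses. A toy sketch, where the two-element sign-flip group, the linear model, and the squared loss are illustrative assumptions:

```python
import numpy as np

def transforms(x):
    # A toy augmentation group: identity and sign flip.
    return np.array([x, -x])

def sq_loss(pred, y):
    # Squared loss; convexity in pred is what Jensen's inequality needs.
    return (pred - y) ** 2

def augmentation_loss(w, x, y):
    # Data augmentation: average the loss over transformed inputs.
    return float(np.mean([sq_loss(w * t, y) for t in transforms(x)]))

def feature_averaging_loss(w, x, y):
    # Feature averaging: average the transformed features first, then take
    # a single loss; Jensen gives feature_averaging <= augmentation.
    return float(sq_loss(w * np.mean(transforms(x)), y))
```

Because the loss is convex in the prediction, the loss of the averaged feature never exceeds the average of the per-transform losses, matching the entry's claim that feature averaging reduces generalization error under convex losses.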
This list is automatically generated from the titles and abstracts of the papers in this site.