Regularization Penalty Optimization for Addressing Data Quality Variance in OoD Algorithms
- URL: http://arxiv.org/abs/2206.05749v1
- Date: Sun, 12 Jun 2022 14:36:04 GMT
- Title: Regularization Penalty Optimization for Addressing Data Quality Variance in OoD Algorithms
- Authors: Runpeng Yu, Hong Zhu, Kaican Li, Lanqing Hong, Rui Zhang, Nanyang Ye, Shao-Lun Huang, Xiuqiang He
- Abstract summary: We theoretically reveal the relationship between training data quality and algorithm performance.
A novel algorithm is proposed to alleviate the influence of low-quality data at both the sample level and the domain level.
- Score: 45.02465532852302
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the poor generalization performance of traditional empirical risk
minimization (ERM) in the case of distributional shift, Out-of-Distribution
(OoD) generalization algorithms have received increasing attention. However, existing OoD generalization algorithms overlook the large variance in the quality of training data, which significantly compromises the accuracy of these methods.
In this paper, we theoretically reveal the relationship between training data
quality and algorithm performance and analyze the optimal regularization scheme
for Lipschitz regularized invariant risk minimization. A novel algorithm is
proposed based on the theoretical results to alleviate the influence of
low-quality data at both the sample level and the domain level. The experiments
on both the regression and classification benchmarks validate the effectiveness
of our method with statistical significance.
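The abstract's sample- and domain-level reweighting idea can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function names, the IRMv1-style squared-gradient penalty, and the fixed weight vectors are all assumptions; the paper's contribution is precisely how to optimize these weights and the penalty.

```python
import numpy as np

def irm_penalty(w, X, y):
    # IRMv1-style penalty: squared gradient of the environment risk with
    # respect to a scalar scale on the predictor, evaluated at scale = 1.
    residual = X @ w - y
    grad_at_scale_1 = 2.0 * np.mean(residual * (X @ w))
    return grad_at_scale_1 ** 2

def weighted_ood_objective(w, envs, domain_weights, sample_weights, lam=1.0):
    # envs: list of (X, y) arrays, one pair per training domain.
    # domain_weights / sample_weights down-weight low-quality domains and
    # samples; choosing them well is the problem the paper studies.
    total = 0.0
    for (X, y), dq, sq in zip(envs, domain_weights, sample_weights):
        per_sample = (X @ w - y) ** 2
        erm = np.mean(sq * per_sample)                     # sample level
        total += dq * (erm + lam * irm_penalty(w, X, y))   # domain level
    return total / len(envs)
```

Setting a domain's weight toward zero removes its (possibly noisy) risk and penalty from the objective, which is the intended effect on low-quality domains.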
Related papers
- Stability and Generalization of the Decentralized Stochastic Gradient Descent Ascent Algorithm [80.94861441583275]
We investigate the generalization bound of the decentralized stochastic gradient descent ascent (D-SGDA) algorithm.
Our results analyze the impact of different top factors on the generalization of D-SGDA.
We also balance stability against generalization to obtain the optimal trade-off in the convex-concave setting.
arXiv Detail & Related papers (2023-10-31T11:27:01Z)
- Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates [28.700663352789395]
We provide a unified convergence analysis of communication-efficient local training methods for distributed variational inequality problems (VIPs)
Our approach is based on a general key assumption on the estimates that allows us to propose and analyze several novel local training algorithms.
We present the first local descent-ascent algorithms with provably improved communication complexity for solving distributed variational inequalities on heterogeneous data.
arXiv Detail & Related papers (2023-06-08T10:58:46Z)
- Best Subset Selection in Reduced Rank Regression [1.4699455652461724]
We show that our algorithm achieves reduced-rank estimation with high probability.
Numerical studies and an application to cancer studies demonstrate its effectiveness and scalability.
arXiv Detail & Related papers (2022-11-29T02:51:15Z)
- GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z)
- Distribution Learning Based on Evolutionary Algorithm Assisted Deep Neural Networks for Imbalanced Image Classification [4.037464966510278]
We propose an iMproved Estimation Distribution Algorithm based Latent featUre Distribution Evolution (MEDA_LUDE) algorithm.
Experiments on benchmark imbalanced datasets validate the effectiveness of our proposed algorithm.
The MEDA_LUDE algorithm is also applied to the industrial field and successfully alleviates the imbalanced issue in fabric defect classification.
arXiv Detail & Related papers (2022-07-26T08:51:47Z)
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
Within this framework, our analysis shows that these algorithms match the computational complexity of methods that have access to an unbiased estimate of the gradient.
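The warm-start plus implicit-differentiation idea in this entry can be sketched on a toy bilevel problem. The ridge inner problem, step sizes, and function names below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def inner_steps(w, X, y, lam, lr=0.01, k=100):
    # Approximately solve the inner (ridge) problem by gradient descent,
    # warm-started at w -- typically the previous outer iterate's solution.
    for _ in range(k):
        w = w - lr * (X.T @ (X @ w - y) + lam * w)
    return w

def hypergradient(w, X, y, lam, Xv, yv):
    # Implicit differentiation of the inner optimality condition
    # (X^T X + lam I) w = X^T y gives dw/dlam = -(X^T X + lam I)^{-1} w;
    # chain it with the gradient of the outer (validation) loss.
    H = X.T @ X + lam * np.eye(X.shape[1])
    dw_dlam = -np.linalg.solve(H, w)
    outer_grad_w = Xv.T @ (Xv @ w - yv)
    return float(outer_grad_w @ dw_dlam)
```

Warm-starting means each outer step needs only a few inner iterations, which is the amortization the summary refers to.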
arXiv Detail & Related papers (2021-11-29T15:10:09Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of a stochastic optimization algorithm can be bounded in terms of the complexity of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- Learning Prediction Intervals for Regression: Generalization and Calibration [12.576284277353606]
We study the generation of prediction intervals in regression for uncertainty quantification.
We use a general learning theory to characterize the optimality-feasibility tradeoff that encompasses Lipschitz continuity and VC-subgraph classes.
We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.
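Interval generation and calibration of the kind this entry describes can be illustrated with a generic split-conformal-style sketch; this is not the paper's algorithm, and the function names and the finite-sample quantile adjustment are assumptions:

```python
import numpy as np

def calibrate_width(cal_residuals, alpha=0.1):
    # Width chosen as a finite-sample-adjusted (1 - alpha) quantile of
    # absolute residuals on a held-out calibration set.
    n = len(cal_residuals)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return float(np.quantile(np.abs(cal_residuals), level))

def prediction_interval(point_pred, width):
    # Symmetric interval around the point prediction.
    return point_pred - width, point_pred + width
```

Widening the interval improves coverage (feasibility) at the cost of informativeness, which is the optimality-feasibility trade-off the summary mentions.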
arXiv Detail & Related papers (2021-02-26T17:55:30Z)
- Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory [11.840747467007963]
We study problem-dependent rates that scale near-optimally with the variance, the effective loss errors, or the norms evaluated at the "best gradient hypothesis".
We introduce a principled framework dubbed "uniform localized convergence".
We show that our framework resolves several fundamental limitations of existing uniform convergence and localization analysis approaches.
arXiv Detail & Related papers (2020-11-12T04:07:29Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
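The feature-averaging claim in the last entry follows from Jensen's inequality for convex losses. A toy sketch, where the two-element sign-flip group, the linear model, and the squared loss are illustrative assumptions:

```python
import numpy as np

def transforms(x):
    # A toy augmentation group: identity and sign flip.
    return np.array([x, -x])

def sq_loss(pred, y):
    # Squared loss; convexity in pred is what Jensen's inequality needs.
    return (pred - y) ** 2

def augmentation_loss(w, x, y):
    # Data augmentation: average the loss over transformed inputs.
    return float(np.mean([sq_loss(w * t, y) for t in transforms(x)]))

def feature_averaging_loss(w, x, y):
    # Feature averaging: average the transformed features first, then take
    # a single loss; Jensen gives feature_averaging <= augmentation.
    return float(sq_loss(w * np.mean(transforms(x)), y))
```

Because the loss is convex in the prediction, the loss of the averaged feature never exceeds the average of the per-transform losses, matching the entry's claim that feature averaging reduces generalization error under convex losses.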
This list is automatically generated from the titles and abstracts of the papers in this site.