Boosting methods for interval-censored data with regression and classification
- URL: http://arxiv.org/abs/2601.17973v1
- Date: Sun, 25 Jan 2026 20:05:57 GMT
- Title: Boosting methods for interval-censored data with regression and classification
- Authors: Yuan Bian, Grace Y. Yi, Wenqing He
- Abstract summary: We introduce novel non-parametric boosting methods for regression and classification tasks with interval-censored data. Our approaches leverage censoring unbiased transformations to adjust loss functions and impute transformed responses. We rigorously establish their theoretical properties, including optimality and mean squared error trade-offs.
- Score: 4.5041160025507585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Boosting has garnered significant interest across both machine learning and statistical communities. Traditional boosting algorithms, designed for fully observed random samples, often struggle with real-world problems, particularly with interval-censored data. This type of data is common in survival analysis and time-to-event studies where exact event times are unobserved but fall within known intervals. Effective handling of such data is crucial in fields like medical research, reliability engineering, and social sciences. In this work, we introduce novel nonparametric boosting methods for regression and classification tasks with interval-censored data. Our approaches leverage censoring unbiased transformations to adjust loss functions and impute transformed responses while maintaining model accuracy. Implemented via functional gradient descent, these methods ensure scalability and adaptability. We rigorously establish their theoretical properties, including optimality and mean squared error trade-offs. Our proposed methods not only offer a robust framework for enhancing predictive accuracy in domains where interval-censored data are common but also complement existing work, expanding the applicability of existing boosting techniques. Empirical studies demonstrate robust performance across various finite-sample scenarios, highlighting the practical utility of our approaches.
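The abstract describes boosting implemented via functional gradient descent on responses imputed from interval-censored observations. As an illustration only, the sketch below imputes interval midpoints (a crude stand-in for the paper's censoring unbiased transformations, whose exact form is not given here) and then runs standard L2 boosting with decision stumps; none of the function names come from the paper.

```python
import numpy as np

def impute_interval_response(lo, hi):
    # Crude stand-in for a censoring unbiased transformation (CUT):
    # impute the interval midpoint. The paper's actual CUTs adjust for
    # the censoring distribution; this is illustration only.
    return (lo + hi) / 2.0

def fit_stump(x, r):
    # Best single-threshold split of 1-D feature x minimizing squared error.
    best = None
    for t in np.unique(x):
        left, right = r[x <= t], r[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]  # (threshold, left_value, right_value)

def predict_stump(stump, x):
    t, lv, rv = stump
    return np.where(x <= t, lv, rv)

def l2_boost(x, y, n_rounds=200, lr=0.1):
    # Functional gradient descent for squared-error loss:
    # each round fits a stump to the current residuals
    # (the negative gradient of 1/2 * (y - f)^2 at the current fit f).
    f = np.full_like(y, y.mean())
    stumps = []
    for _ in range(n_rounds):
        residual = y - f
        s = fit_stump(x, residual)
        f = f + lr * predict_stump(s, x)
        stumps.append(s)
    return y.mean(), stumps

def boost_predict(model, x, lr=0.1):
    f0, stumps = model
    f = np.full_like(x, f0, dtype=float)
    for s in stumps:
        f = f + lr * predict_stump(s, x)
    return f
```

The point of the sketch is the structure the abstract names: transform/impute the censored response once, then reuse an off-the-shelf functional-gradient-descent loop unchanged.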
Related papers
- Explainable Human-in-the-Loop Segmentation via Critic Feedback Signals [0.20999222360659608]
We propose a human-in-the-loop interactive framework that enables interventional learning through targeted human corrections of segmentation outputs.
We demonstrate that our framework improves segmentation accuracy by up to 9 mIoU points on challenging cubemap data.
This work provides a practical framework for researchers and practitioners seeking to build segmentation systems that are accurate, robust to dataset biases, data-efficient, and adaptable to real-world domains such as urban climate monitoring and autonomous driving.
arXiv Detail & Related papers (2025-10-11T01:16:41Z)
- Simple and Effective Specialized Representations for Fair Classifiers [8.489574504527196]
We propose a novel approach to fair classification based on the characteristic function distance.
By utilizing characteristic functions, we achieve a more stable and efficient solution compared to traditional methods.
Our method maintains robustness and computational efficiency, making it a practical solution for real-world applications.
arXiv Detail & Related papers (2025-05-16T22:59:46Z)
- On the Interconnections of Calibration, Quantification, and Classifier Accuracy Prediction under Dataset Shift [58.91436551466064]
This paper investigates the interconnections among three fundamental problems, calibration, quantification, and classifier accuracy prediction, under dataset shift conditions.
We show that access to an oracle for any one of these tasks enables the resolution of the other two.
We propose new methods for each problem based on direct adaptations of well-established methods borrowed from the other disciplines.
arXiv Detail & Related papers (2025-05-16T15:42:55Z)
- Instance-Specific Asymmetric Sensitivity in Differential Privacy [2.855485723554975]
We build upon previous work that gives a paradigm for selecting an output through the exponential mechanism.
Our framework will slightly modify the closeness metric and instead give a simple and efficient application of the sparse vector technique.
arXiv Detail & Related papers (2023-11-02T05:01:45Z)
- Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
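The ATC summary above describes a concrete two-step procedure, which can be sketched in a few lines. This is a loose illustration, not the paper's implementation: the threshold is chosen so that the fraction of labeled source examples above it matches source accuracy, and target accuracy is then estimated from unlabeled confidences alone.

```python
import numpy as np

def learn_threshold(source_conf, source_correct):
    # Pick a threshold t so that the fraction of source examples with
    # confidence above t equals the observed source accuracy.
    # fraction(conf > t) decreases in t, so t is the matching quantile.
    source_accuracy = source_correct.mean()
    return np.quantile(source_conf, 1.0 - source_accuracy)

def predict_target_accuracy(threshold, target_conf):
    # Estimated target accuracy: fraction of unlabeled target examples
    # whose confidence exceeds the learned threshold.
    return (target_conf > threshold).mean()
```

Under no distribution shift, the estimate recovers source accuracy by construction; the paper's contribution is that the same threshold remains informative under shift.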
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Semantic Perturbations with Normalizing Flows for Improved Generalization [62.998818375912506]
We show that perturbations in the latent space can be used to define fully unsupervised data augmentations.
We find that latent adversarial perturbations that adapt to the classifier throughout its training are the most effective.
arXiv Detail & Related papers (2021-08-18T03:20:00Z)
- Weight-of-evidence 2.0 with shrinkage and spline-binning [3.925373521409752]
We propose a formalized, data-driven and powerful method to transform categorical predictors.
We extend upon the weight-of-evidence approach and propose to estimate the proportions using shrinkage estimators.
We present the results of a series of experiments in a fraud detection setting, which illustrate the effectiveness of the presented approach.
arXiv Detail & Related papers (2021-01-05T13:13:16Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Precise Tradeoffs in Adversarial Training for Linear Regression [55.764306209771405]
We provide a precise and comprehensive understanding of the role of adversarial training in the context of linear regression with Gaussian features.
We precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach.
Our theory for adversarial training algorithms also facilitates the rigorous study of how a variety of factors (size and quality of training data, model overparametrization etc.) affect the tradeoff between these two competing accuracies.
arXiv Detail & Related papers (2020-02-24T19:01:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.