Local Boosting for Weakly-Supervised Learning
- URL: http://arxiv.org/abs/2306.02859v1
- Date: Mon, 5 Jun 2023 13:24:03 GMT
- Title: Local Boosting for Weakly-Supervised Learning
- Authors: Rongzhi Zhang, Yue Yu, Jiaming Shen, Xiquan Cui, Chao Zhang
- Abstract summary: Boosting is a technique to enhance the performance of a set of base models by combining them into a strong ensemble model.
In weakly supervised learning, where most of the data is labeled through weak and noisy sources, it remains nontrivial to design effective boosting approaches.
We propose $\textit{LocalBoost}$, a novel framework for weakly-supervised boosting.
- Score: 21.95003048165616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Boosting is a commonly used technique to enhance the performance of a set of
base models by combining them into a strong ensemble model. Though widely
adopted, boosting is typically used in supervised learning where the data is
labeled accurately. However, in weakly supervised learning, where most of the
data is labeled through weak and noisy sources, it remains nontrivial to design
effective boosting approaches. In this work, we show that the standard
implementation of the convex combination of base learners can hardly work due
to the presence of noisy labels. Instead, we propose $\textit{LocalBoost}$, a
novel framework for weakly-supervised boosting. LocalBoost iteratively boosts
the ensemble model from two dimensions, i.e., intra-source and inter-source.
The intra-source boosting introduces locality to the base learners and enables
each base learner to focus on a particular feature regime by training new base
learners on granularity-varying error regions. For the inter-source boosting,
we leverage a conditional function to indicate the weak source where the sample
is more likely to appear. To account for the weak labels, we further design an
estimate-then-modify approach to compute the model weights. Experiments on
seven datasets show that our method significantly outperforms vanilla boosting
methods and other weakly-supervised methods.
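To make the contrast concrete, below is a minimal sketch (not the authors' released implementation) of the two ingredients the abstract refers to: a vanilla ensemble prediction formed as a convex combination of base-learner scores, and one illustrative intra-source-style step that fits a new base learner only on the region where the current ensemble disagrees with the weak labels. The function names, the decision-tree base learner, and the accuracy-based weight are assumptions for illustration; the paper's inter-source boosting and estimate-then-modify weighting are not reproduced here.

```python
# Hedged sketch of (i) a convex combination of base learners and
# (ii) an intra-source-style step that refits on the current error region.
# Names and the weight heuristic are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def ensemble_predict(learners, alphas, X):
    """Vanilla boosting output: a convex combination of base-learner scores."""
    alphas = np.asarray(alphas, dtype=float)
    alphas = alphas / alphas.sum()  # enforce the convex combination
    # Assumes every learner was fit on data covering all classes,
    # so predict_proba outputs align column-wise.
    scores = sum(a * h.predict_proba(X) for a, h in zip(alphas, learners))
    return scores.argmax(axis=1)

def intra_source_step(learners, alphas, X, y_weak, max_depth=3):
    """Illustrative locality step: fit a new base learner on the error region."""
    y_hat = ensemble_predict(learners, alphas, X)
    err = y_hat != y_weak  # error region under the (possibly noisy) weak labels
    if not err.any():
        return learners, alphas
    h_new = DecisionTreeClassifier(max_depth=max_depth).fit(X[err], y_weak[err])
    # The paper sets weights with an estimate-then-modify scheme; as a stand-in,
    # weight the new learner by its accuracy on the error region it was fit on.
    w_new = max(h_new.score(X[err], y_weak[err]), 1e-3)
    return learners + [h_new], list(alphas) + [w_new]

# Usage sketch: start from one learner fit on all weakly labeled data,
# then apply a few intra-source steps to grow the ensemble.
```

Under noisy weak labels, the abstract argues that simply re-normalizing such convex weights is not enough, which is what motivates the locality and the estimate-then-modify weighting in LocalBoost.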
Related papers
- BoostAdapter: Improving Vision-Language Test-Time Adaptation via Regional Bootstrapping [64.8477128397529]
We propose a test-time adaptation framework that bridges training-required and training-free methods.
We maintain a light-weight key-value memory for feature retrieval from instance-agnostic historical samples and instance-aware boosting samples.
We theoretically justify the rationality behind our method and empirically verify its effectiveness on both the out-of-distribution and the cross-domain datasets.
arXiv Detail & Related papers (2024-10-20T15:58:43Z)
- Achieving More with Less: A Tensor-Optimization-Powered Ensemble Method [53.170053108447455]
Ensemble learning is a method that leverages weak learners to produce a strong learner.
We design a smooth and convex objective function that leverages the concept of margin, making the strong learner more discriminative.
We then compare our algorithm with random forests of ten times the size and other classical methods across numerous datasets.
arXiv Detail & Related papers (2024-08-06T03:42:38Z)
- GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method that incorporates feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover intrinsic relationships across real and generated samples.
Second, we design a self-supervised metric learning objective to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z)
- Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch (a minimal MMD sketch appears after this list).
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Learning to Generate Novel Classes for Deep Metric Learning [24.048915378172012]
We introduce a new data augmentation approach that synthesizes novel classes and their embedding vectors.
We implement this idea by learning and exploiting a conditional generative model, which, given a class label and a noise, produces a random embedding vector of the class.
Our proposed generator allows the loss to use richer class relations by augmenting realistic and diverse classes, resulting in better generalization to unseen samples.
arXiv Detail & Related papers (2022-01-04T06:55:19Z)
- KGBoost: A Classification-based Knowledge Base Completion Method with Negative Sampling [29.14178162494542]
KGBoost is a new method to train a powerful classifier for missing link prediction.
We conduct experiments on multiple benchmark datasets, and demonstrate that KGBoost outperforms state-of-the-art methods across most datasets.
Compared with models trained by end-to-end optimization, KGBoost works well in the low-dimensional setting, allowing a smaller model size.
arXiv Detail & Related papers (2021-12-17T06:19:37Z)
- Adaptive Boosting for Domain Adaptation: Towards Robust Predictions in Scene Segmentation [41.05407168312345]
Domain adaptation aims to transfer the shared knowledge learned from the source domain to a new environment, i.e., the target domain.
One common practice is to train the model on both labeled source-domain data and unlabeled target-domain data.
We propose an efficient bootstrapping method, called Adaboost Student, which explicitly learns complementary models during training.
arXiv Detail & Related papers (2021-03-29T15:12:58Z)
- MP-Boost: Minipatch Boosting via Adaptive Feature and Observation Sampling [0.0]
MP-Boost is an algorithm loosely based on AdaBoost that learns by adaptively selecting small subsets of instances and features.
We empirically demonstrate the interpretability, comparative accuracy, and computational time of our approach on a variety of binary classification tasks.
arXiv Detail & Related papers (2020-11-14T04:26:13Z) - BREEDS: Benchmarks for Subpopulation Shift [98.90314444545204]
We develop a methodology for assessing the robustness of models to subpopulation shift.
We leverage the class structure underlying existing datasets to control the data subpopulations that comprise the training and test distributions.
Applying this methodology to the ImageNet dataset, we create a suite of subpopulation shift benchmarks of varying granularity.
arXiv Detail & Related papers (2020-08-11T17:04:47Z)
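As a side note to the Divide and Contrast (DaC) entry above, the following is a minimal, self-contained sketch of an RBF-kernel Maximum Mean Discrepancy estimate between two feature batches (for example, source-like versus target-specific features). It is a generic, biased MMD^2 estimator under an assumed Gaussian kernel and bandwidth, not the paper's memory bank-based implementation; all names are illustrative.

```python
# Hedged sketch: biased estimate of MMD^2 between two feature batches with an
# RBF kernel. Bandwidth and batch shapes are illustrative assumptions.
import numpy as np

def rbf_kernel(a, b, bandwidth):
    # Pairwise squared Euclidean distances, then a Gaussian kernel.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    """MMD^2 = E[k(x,x')] - 2 E[k(x,y)] + E[k(y,y')] (biased estimate)."""
    return (rbf_kernel(x, x, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean())

# Example: two 128-d feature batches drawn from slightly shifted distributions.
rng = np.random.default_rng(0)
source_like = rng.normal(0.0, 1.0, size=(64, 128))
target_specific = rng.normal(0.3, 1.0, size=(64, 128))
print(mmd2(source_like, target_specific))  # > 0 when the distributions differ
```

In practice such a term is added to the training loss so that minimizing it pulls the two feature distributions together; DaC's actual memory-bank formulation and kernel settings may differ.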
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.