Cost-Sensitive Self-Training for Optimizing Non-Decomposable Metrics
- URL: http://arxiv.org/abs/2304.14738v1
- Date: Fri, 28 Apr 2023 10:31:12 GMT
- Title: Cost-Sensitive Self-Training for Optimizing Non-Decomposable Metrics
- Authors: Harsh Rangwani, Shrinivas Ramasubramanian, Sho Takemori, Kato Takashi,
Yuhei Umeda, Venkatesh Babu Radhakrishnan
- Abstract summary: We introduce the Cost-Sensitive Self-Training (CSST) framework which generalizes the self-training-based methods for optimizing non-decomposable metrics.
Our results demonstrate that CSST achieves an improvement over the state-of-the-art in the majority of cases across datasets and objectives.
- Score: 9.741019160068388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-training based semi-supervised learning algorithms have enabled the
learning of highly accurate deep neural networks, using only a fraction of
labeled data. However, the majority of work on self-training has focused on the
objective of improving accuracy, whereas practical machine learning systems can
have complex goals (e.g., maximizing the minimum recall across classes)
that are non-decomposable in nature. In this work, we introduce the
Cost-Sensitive Self-Training (CSST) framework which generalizes the
self-training-based methods for optimizing non-decomposable metrics. We prove
that our framework can better optimize the desired non-decomposable metric
utilizing unlabeled data, under similar data distribution assumptions made for
the analysis of self-training. Using the proposed CSST framework, we obtain
practical self-training methods (for both vision and NLP tasks) for optimizing
different non-decomposable metrics using deep neural networks. Our results
demonstrate that CSST achieves an improvement over the state-of-the-art in the
majority of cases across datasets and objectives.
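The abstract does not spell out the training objective; as a minimal sketch of how cost-sensitive weights can enter a self-training step (PyTorch assumed, with a hypothetical per-class gain vector `class_gains` standing in for the weights CSST actually derives), a cost-weighted pseudo-labeling loss might look like this:
```python
import torch
import torch.nn.functional as F

def cost_weighted_pseudo_label_loss(model, unlabeled_x, class_gains, threshold=0.95):
    # Sketch only, not the paper's exact objective: keep confident pseudo-labels
    # on unlabeled data and weight each one by a per-class gain, so the
    # self-training loss becomes cost-sensitive.
    with torch.no_grad():
        teacher_probs = F.softmax(model(unlabeled_x), dim=1)   # teacher predictions (no gradient)
        conf, pseudo_y = teacher_probs.max(dim=1)              # confidence and hard pseudo-labels
        mask = (conf >= threshold).float()                     # keep only confident samples
    student_logits = model(unlabeled_x)                        # student forward pass
    per_sample = F.cross_entropy(student_logits, pseudo_y, reduction="none")
    weights = class_gains[pseudo_y]                            # hypothetical per-class gain vector
    return (weights * per_sample * mask).sum() / mask.sum().clamp(min=1.0)
```
For the max-min-recall example mentioned above, `class_gains` would up-weight the classes with the lowest current recall; how CSST actually sets and updates these weights is derived in the paper and not reproduced here.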
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm that enables better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
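The summary does not give the update rule; purely as a hedged sketch of what a normalized gradient difference between a forget objective and a retain objective could look like (PyTorch assumed, names hypothetical, not necessarily the paper's exact rule):
```python
import torch

def ngdiff_like_update_direction(loss_forget, loss_retain, params, eps=1e-12):
    # Hedged sketch: normalize the gradient of each unlearning objective so
    # that neither dominates by raw scale, then combine them via their difference.
    g_forget = torch.autograd.grad(loss_forget, params, retain_graph=True)
    g_retain = torch.autograd.grad(loss_retain, params)
    flatten = lambda grads: torch.cat([g.reshape(-1) for g in grads])
    g_f = flatten(g_forget)
    g_r = flatten(g_retain)
    g_f = g_f / (g_f.norm() + eps)
    g_r = g_r / (g_r.norm() + eps)
    return g_f - g_r   # normalized gradient difference between the two objectives
```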
arXiv Detail & Related papers (2024-10-29T14:41:44Z) - Unifying Synergies between Self-supervised Learning and Dynamic
Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
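As a loose illustration only (the paper's architecture is not described in this summary), a block that supports both a dense pass and a gated pass over the same weights could be sketched as:
```python
import torch
import torch.nn as nn

class DenseGatedBlock(nn.Module):
    # Hypothetical sketch: one set of weights serves two forward modes, a dense
    # pass using all channels and a gated pass in which a learned per-channel
    # gate suppresses low-importance channels for cheaper compute.
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.fc = nn.Linear(in_dim, hidden_dim)
        self.gate_logits = nn.Parameter(torch.zeros(hidden_dim))

    def forward(self, x, gated=False):
        h = torch.relu(self.fc(x))
        if gated:
            h = h * torch.sigmoid(self.gate_logits)  # soft gate; could be thresholded at inference
        return h
```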
arXiv Detail & Related papers (2023-01-22T17:12:58Z) - Can we achieve robustness from data alone? [0.7366405857677227]
Adversarial training and its variants have come to be the prevailing methods to achieve adversarially robust classification using neural networks.
We devise a meta-learning method for robust classification that optimizes the dataset, in a principled way, prior to its deployment.
Experiments on MNIST and CIFAR-10 demonstrate that the datasets we produce enjoy very high robustness against PGD attacks.
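For reference, robustness against PGD in such experiments is typically measured with the standard L-infinity projected-gradient-descent attack (Madry et al.); a minimal sketch, independent of this paper's specifics:
```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Standard L-infinity PGD: iteratively ascend the loss and project back
    # into the eps-ball around the clean input.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()  # random start
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()            # gradient-sign ascent step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project to the eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```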
arXiv Detail & Related papers (2022-07-24T12:14:48Z) - Training Neural Networks using SAT solvers [1.0152838128195465]
We propose an algorithm that explores global optimisation with SAT solvers for training a neural network.
In experiments, we demonstrate the effectiveness of our algorithm against the Adam optimiser on tasks such as parity learning.
arXiv Detail & Related papers (2022-06-10T01:31:12Z) - Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data are prioritized.
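The precise weighting rule is not stated in this summary; a generic sketch of sample-weighted pseudo-labeling (with a hypothetical choice of weights, PyTorch assumed) is:
```python
import torch
import torch.nn.functional as F

def sample_weighted_pseudo_loss(student_logits, teacher_probs, ood_scores):
    # Sketch only: weight each unlabeled sample by its teacher confidence and
    # down-weight likely out-of-distribution points, so that only conducive
    # unlabeled data contribute meaningfully to the SSL loss.
    conf, pseudo_y = teacher_probs.max(dim=1)
    weights = conf * (1.0 - ood_scores)              # hypothetical weighting scheme
    per_sample = F.cross_entropy(student_logits, pseudo_y, reduction="none")
    return (weights * per_sample).sum() / weights.sum().clamp(min=1e-8)
```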
arXiv Detail & Related papers (2022-05-02T16:09:17Z) - Training With Data Dependent Dynamic Learning Rates [8.833548357664608]
We propose an optimization framework that accounts for differences in loss-function characteristics across instances.
Our framework learns a dynamic learning rate for each instance present in the dataset.
We show that our framework can be used for personalization of a machine learning model towards a known targeted data distribution.
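No parameterization is given in this summary; one hedged way to realize a per-instance learning rate is to scale each example's loss by a learnable per-example factor before the optimizer step:
```python
import torch

def instance_scaled_loss(per_sample_loss, instance_idx, log_instance_lr):
    # Sketch only: `log_instance_lr` is a learnable tensor with one entry per
    # training example; exponentiating keeps the scale positive, so each
    # example effectively receives its own dynamic learning rate.
    scale = torch.exp(log_instance_lr[instance_idx])
    return (scale * per_sample_loss).mean()
```
Here `log_instance_lr` would be a `torch.nn.Parameter` of length equal to the dataset size; how the actual framework learns these values (e.g. against a targeted data distribution) is not specified in the summary.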
arXiv Detail & Related papers (2021-05-27T21:52:29Z) - Learning Robust Beamforming for MISO Downlink Systems [14.429561340880074]
A base station must identify efficient multi-antenna transmission strategies using only imperfect channel state information (CSI) and its features.
We propose a robust training algorithm in which a deep neural network (DNN) is optimized to fit the real-world propagation environment.
arXiv Detail & Related papers (2021-03-02T09:56:35Z) - Efficient Conditional Pre-training for Transfer Learning [71.01129334495553]
We propose efficient filtering methods to select relevant subsets from the pre-training dataset.
We validate our techniques by pre-training on ImageNet in both the unsupervised and supervised settings.
We improve standard ImageNet pre-training by 1-3% by tuning available models on our subsets and by pre-training on a dataset filtered from a larger-scale dataset.
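As one hedged example of such filtering (not necessarily the paper's criterion), pre-training samples can be ranked by feature similarity to the target domain and only the top-k kept:
```python
import torch
import torch.nn.functional as F

def select_relevant_subset(pretrain_feats, target_feats, k):
    # Sketch only: score each pre-training example by cosine similarity of its
    # (pre-extracted) feature to the mean target-domain feature, then keep the
    # k highest-scoring examples as the conditional pre-training subset.
    target_center = F.normalize(target_feats.mean(dim=0), dim=0)
    scores = F.normalize(pretrain_feats, dim=1) @ target_center
    return scores.topk(k).indices
```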
arXiv Detail & Related papers (2020-11-20T06:16:15Z) - Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z) - Generalized Reinforcement Meta Learning for Few-Shot Optimization [3.7675996866306845]
We present a generic and flexible Reinforcement Learning (RL) based meta-learning framework for the problem of few-shot learning.
Our framework could be easily extended to do network architecture search.
arXiv Detail & Related papers (2020-05-04T03:21:05Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.