Learning to Minimize the Remainder in Supervised Learning
- URL: http://arxiv.org/abs/2201.09193v1
- Date: Sun, 23 Jan 2022 06:31:23 GMT
- Title: Learning to Minimize the Remainder in Supervised Learning
- Authors: Yan Luo, Yongkang Wong, Mohan Kankanhalli, Qi Zhao
- Abstract summary: We propose a new learning approach, namely gradient adjustment learning (GAL), to leverage the knowledge learned from past training iterations to adjust gradients.
The proposed GAL is model- and optimizer-agnostic, and is easy to adapt to the standard learning framework.
The experiments show that the proposed GAL consistently enhances the evaluated models, and the ablation studies validate various aspects of the proposed GAL.
- Score: 37.481538167715755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The learning process of deep learning methods usually updates the model's
parameters in multiple iterations. Each iteration can be viewed as the
first-order approximation of Taylor's series expansion. The remainder, which
consists of higher-order terms, is usually ignored in the learning process for
simplicity. This learning scheme empowers various multimedia based
applications, such as image retrieval, recommendation system, and video search.
Generally, multimedia data (e.g., images) are semantics-rich and
high-dimensional, hence the remainders of approximations are possibly non-zero.
In this work, we consider the remainder to be informative and study how it
affects the learning process. To this end, we propose a new learning approach,
namely gradient adjustment learning (GAL), to leverage the knowledge learned
from the past training iterations to adjust vanilla gradients, such that the
remainders are minimized and the approximations are improved. The proposed GAL
is model- and optimizer-agnostic, and is easy to adapt to the standard learning
framework. It is evaluated on three tasks, i.e., image classification, object
detection, and regression, with state-of-the-art models and optimizers. The
experiments show that the proposed GAL consistently enhances the evaluated
models, and the ablation studies validate various aspects of the proposed
GAL. The code is available at
https://github.com/luoyan407/gradient_adjustment.git.
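To make the Taylor-remainder view concrete: with parameters θ_t and update Δθ_t at iteration t, the loss satisfies L(θ_t + Δθ_t) = L(θ_t) + ∇L(θ_t)ᵀΔθ_t + R_t, where R_t collects the higher-order terms that standard first-order training drops. The sketch below is a minimal illustration, assuming a PyTorch-style training loop, of the model- and optimizer-agnostic idea of adjusting vanilla gradients in place before the optimizer step; the adjustment rule shown (a running average of past gradients with fixed coefficients) is a placeholder for illustration, not the learned adjustment described in the paper.

# Illustrative sketch only (assumed PyTorch-style training loop): a generic,
# optimizer-agnostic hook that adjusts vanilla gradients in place before
# optimizer.step(). The blending rule below (an exponential moving average of
# past gradients with fixed coefficients) is a placeholder, not the learned
# adjustment proposed in the paper.
import torch

class GradientAdjuster:
    def __init__(self, model, beta=0.9, alpha=0.1):
        self.model = model
        self.beta = beta    # decay for the running gradient statistic
        self.alpha = alpha  # strength of the adjustment (placeholder value)
        self.history = {n: torch.zeros_like(p) for n, p in model.named_parameters()}

    @torch.no_grad()
    def adjust(self):
        for name, p in self.model.named_parameters():
            if p.grad is None:
                continue
            # Accumulate knowledge from past training iterations.
            self.history[name].mul_(self.beta).add_(p.grad, alpha=1 - self.beta)
            # Adjust the vanilla gradient before the optimizer consumes it.
            p.grad.add_(self.history[name], alpha=self.alpha)

# Typical use with any optimizer:
#   optimizer.zero_grad(); loss.backward()
#   adjuster.adjust()   # modifies p.grad in place
#   optimizer.step()

Because the hook only rewrites p.grad, it composes with SGD, Adam, or any other optimizer, which is the sense in which such an adjustment is optimizer-agnostic.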
Related papers
- A mean teacher algorithm for unlearning of language models [5.384630221560811]
We show that the mean teacher algorithm can approximate a trajectory of slow natural gradient descent (NGD).
While slow NGD can suffer from vanishing gradients, we introduce a new unlearning loss called "negative log-unlikelihood" (NLUL) that avoids this problem.
arXiv Detail & Related papers (2025-04-18T00:34:19Z) - LoRA Unlearns More and Retains More (Student Abstract) [0.0]
PruneLoRA reduces the need for large-scale parameter updates by applying low-rank updates to the model.
We leverage LoRA to selectively modify a subset of the pruned model's parameters, reducing the computational cost and memory requirements while improving the model's ability to retain performance on the remaining classes (a generic low-rank update layer is sketched after this list).
arXiv Detail & Related papers (2024-11-16T16:47:57Z) - Classifier-guided Gradient Modulation for Enhanced Multimodal Learning [50.7008456698935]
Classifier-guided Gradient Modulation (CGGM) is a novel method that balances multimodal learning by modulating gradients.
We conduct extensive experiments on four multimodal datasets: UPMC-Food 101, CMU-MOSI, IEMOCAP and BraTS.
CGGM outperforms all the baselines and other state-of-the-art methods consistently.
arXiv Detail & Related papers (2024-11-03T02:38:43Z) - Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement [29.675650285351768]
Machine unlearning (MU) has emerged to enhance the privacy and trustworthiness of deep neural networks.
Approximate MU is a practical method for large-scale models.
We propose a fast-slow parameter update strategy to implicitly approximate the up-to-date salient unlearning direction.
arXiv Detail & Related papers (2024-09-29T15:17:33Z) - Online Learning Under A Separable Stochastic Approximation Framework [20.26530917721778]
We propose an online learning algorithm for a class of machine learning models under a separable stochastic approximation framework.
We show that the proposed algorithm produces more robust training and test performance when compared to other popular learning algorithms.
arXiv Detail & Related papers (2023-05-12T13:53:03Z) - Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z) - A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark [33.86872697028233]
We present an in-depth study on few-shot video classification by making three contributions.
First, we perform a consistent comparative study on the existing metric-based methods to figure out their limitations in representation learning.
Second, we discover that there is a high correlation between the novel action class and the ImageNet object class, which is problematic in the few-shot recognition setting.
Third, we present a new benchmark with more base data to facilitate future few-shot video classification without pre-training.
arXiv Detail & Related papers (2021-10-24T06:01:46Z) - Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z) - A Primal-Dual Subgradient Approach for Fair Meta Learning [23.65344558042896]
Few shot meta-learning is well-known with its fast-adapted capability and accuracy generalization onto unseen tasks.
We propose a Primal-Dual Fair Meta-learning framework, namely PDFM, which learns to train fair machine learning models using only a few examples.
arXiv Detail & Related papers (2020-09-26T19:47:38Z) - Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need? [72.00712736992618]
We show that a simple baseline: learning a supervised or self-supervised representation on the meta-training set, outperforms state-of-the-art few-shot learning methods.
An additional boost can be achieved through the use of self-distillation.
We believe that our findings motivate a rethinking of few-shot image classification benchmarks and the associated role of meta-learning algorithms.
arXiv Detail & Related papers (2020-03-25T17:58:42Z) - Incremental Object Detection via Meta-Learning [77.55310507917012]
We propose a meta-learning approach that learns to reshape model gradients, such that information across incremental tasks is optimally shared.
In comparison to existing meta-learning methods, our approach is task-agnostic, allows incremental addition of new-classes and scales to high-capacity models for object detection.
arXiv Detail & Related papers (2020-03-17T13:40:00Z)
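As referenced in the PruneLoRA entry above, the low-rank update mechanism those results build on can be illustrated with a generic LoRA layer. This is a minimal sketch of standard LoRA (a frozen base weight plus a trainable low-rank correction), assuming a PyTorch nn.Linear base; the class name and hyperparameters are illustrative, and the pruning and unlearning specifics of that paper are not included.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Wraps a frozen nn.Linear and adds a trainable low-rank update:
    # y = W x + b + (alpha / r) * B (A x), with A of shape (r, in_features)
    # and B of shape (out_features, r).
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the low-rank factors A and B are trained
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

Initializing B to zeros means the wrapped layer starts out identical to the frozen base, so training only gradually introduces the low-rank correction.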
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.