Related papers: FisherMask: Enhancing Neural Network Labeling Efficiency in Image Classification Using Fisher Information

URL: http://arxiv.org/abs/2411.05752v1
Date: Fri, 08 Nov 2024 18:10:46 GMT
Title: FisherMask: Enhancing Neural Network Labeling Efficiency in Image Classification Using Fisher Information
Authors: Shreen Gul, Mohamed Elmahallawy, Sanjay Madria, Ardhendu Tripathy
Abstract summary: FisherMask is a Fisher information-based active learning (AL) approach that identifies key network parameters by masking them. Our experiments demonstrate that FisherMask significantly outperforms state-of-the-art methods on diverse datasets.
Score: 2.762397703396293
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep learning (DL) models are popular across various domains due to their remarkable performance and efficiency. However, their effectiveness relies heavily on large amounts of labeled data, which are often time-consuming and labor-intensive to generate manually. To overcome this challenge, it is essential to develop strategies that reduce reliance on extensive labeled data while preserving model performance. In this paper, we propose FisherMask, a Fisher information-based active learning (AL) approach that identifies key network parameters by masking them based on their Fisher information values. FisherMask enhances batch AL by using Fisher information to select the most critical parameters, allowing the identification of the most impactful samples during AL training. Moreover, Fisher information possesses favorable statistical properties, offering valuable insights into model behavior and providing a better understanding of the performance characteristics within the AL pipeline. Our extensive experiments demonstrate that FisherMask significantly outperforms state-of-the-art methods on diverse datasets, including CIFAR-10 and FashionMNIST, especially under imbalanced settings. These improvements lead to substantial gains in labeling efficiency, and FisherMask thus serves as an effective tool for measuring the sensitivity of model parameters to data samples. Our code is available at https://github.com/sgchr273/FisherMask.
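
The idea of masking parameters by their Fisher information values can be illustrated with a minimal sketch. It is not the authors' implementation (see their repository for that): it assumes a plain linear softmax classifier, uses only the diagonal of the Fisher information matrix, and keeps a fixed top fraction of parameters. The names `diagonal_fisher`, `top_fraction_mask`, and `keep_fraction` are hypothetical and chosen for illustration only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def diagonal_fisher(W, X):
    """Diagonal Fisher information of a linear softmax classifier.

    W is a d x c weight matrix (list of rows), X a list of d-dim inputs.
    For this model, d/dW[j][k] log p(y|x) = x[j] * (1[y == k] - p[k]);
    taking the expectation of its square over y ~ p(.|x) gives
    x[j]^2 * p[k] * (1 - p[k]), which is averaged over the inputs.
    """
    d, c = len(W), len(W[0])
    fisher = [[0.0] * c for _ in range(d)]
    for x in X:
        logits = [sum(x[j] * W[j][k] for j in range(d)) for k in range(c)]
        p = softmax(logits)
        for j in range(d):
            for k in range(c):
                fisher[j][k] += x[j] ** 2 * p[k] * (1.0 - p[k])
    n = len(X)
    return [[v / n for v in row] for row in fisher]


def top_fraction_mask(fisher, keep_fraction):
    """Boolean mask marking the parameters with the largest Fisher values."""
    flat = sorted((v for row in fisher for v in row), reverse=True)
    k = max(1, round(keep_fraction * len(flat)))
    threshold = flat[k - 1]  # k-th largest value; ties may keep a few extra
    return [[v >= threshold for v in row] for row in fisher]


# Toy data: 32 random inputs with 5 features, 3 classes (assumed sizes).
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(32)]
W = [[random.gauss(0, 1) for _ in range(3)] for _ in range(5)]

fisher = diagonal_fisher(W, X)
mask = top_fraction_mask(fisher, keep_fraction=0.2)
kept = sum(v for row in mask for v in row)
print(f"kept {kept} of {len(W) * len(W[0])} parameters")
```

In an active learning loop, a mask of this kind could restrict sample-scoring computations to the selected parameters; how FisherMask itself builds and uses the mask is described in the paper, not here.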

Related papers

Efficient Data Selection at Scale via Influence Distillation [53.03573620682107]
This paper introduces Influence Distillation, a mathematically justified framework for data selection. By distilling each sample's influence on a target distribution, our method assigns model-specific weights that are used to select training data. Experiments show that Influence Distillation matches or outperforms state-of-the-art performance while achieving up to $3.5\times$ faster selection.
arXiv Detail & Related papers (2025-05-25T09:08:00Z)
Generalized Fisher-Weighted SVD: Scalable Kronecker-Factored Fisher Approximation for Compressing Large Language Models [6.57101653042078]
Generalized Fisher-Weighted SVD (GFWSVD) is a post-training compression technique that accounts for both diagonal and off-diagonal elements of the Fisher information matrix. We demonstrate the effectiveness of our method on LLM compression, showing improvements over existing compression baselines.
arXiv Detail & Related papers (2025-05-23T14:41:52Z)
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding [61.15402517835137]
We build a supervised fine-tuning (SFT) dataset to achieve state-of-the-art coding capability results in models of various sizes. Our models use only SFT to achieve 61.8% on LiveCodeBench and 24.6% on CodeContests, surpassing alternatives trained with reinforcement learning.
arXiv Detail & Related papers (2025-04-02T17:50:31Z)
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation [11.36712576361739]
Large Language Models (LLMs) heavily rely on high-quality training data, making data valuation crucial for optimizing model performance. We introduce a linearized future influence kernel (LinFiK), which assesses the value of individual data samples. We propose ALinFiK, a learning strategy to approximate LinFiK, enabling scalable data valuation.
arXiv Detail & Related papers (2025-03-02T22:51:12Z)
Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification [1.8567173419246403]
Deep active learning (AL) seeks to minimize the annotation costs for training deep neural networks. BAIT, a recently proposed AL strategy based on the Fisher Information, has demonstrated impressive performance across various datasets. This paper introduces two methods to enhance BAIT's computational efficiency and scalability.
arXiv Detail & Related papers (2024-04-13T12:09:37Z)
Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
Development of deep learning models is enabled by the availability of large-scale datasets. Dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset. We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z)
Unlearning with Fisher Masking [20.763692349949245]
Machine unlearning aims to revoke some training data after learning in response to requests from users, model developers, and administrators. Most previous methods are based on direct fine-tuning, which may neither remove data completely nor retain full performance on the remaining data. We propose a new masking strategy tailored to unlearning based on Fisher information.
arXiv Detail & Related papers (2023-10-09T01:24:06Z)
DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models [31.65198592956842]
We propose DataInf, an efficient influence approximation method that is practical for large-scale generative AI models. Our theoretical analysis shows that DataInf is particularly well-suited for parameter-efficient fine-tuning techniques such as LoRA. In applications to RoBERTa-large, Llama-2-13B-chat, and stable-diffusion-v1.5 models, DataInf effectively identifies the most influential fine-tuning examples better than other approximate influence scores.
arXiv Detail & Related papers (2023-10-02T04:59:19Z)
Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) suggests learning the active learning strategy itself, allowing it to adapt to the given setting. We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem. Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications. We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL). In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER). Our method exploits self-supervised pretraining to learn good feature representations from the target data. We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Measuring Data Leakage in Machine-Learning Models with Fisher Information [35.20523017255285]
Machine-learning models contain information about the data they were trained on. This information leaks either through the model itself or through predictions made by the model. We propose a method to quantify this leakage using the Fisher information of the model about the data.
arXiv Detail & Related papers (2021-02-23T13:02:34Z)
Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training. We experimentally verify that the new dataset can significantly improve the ability of the learned FER model. To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.