Understanding Programmatic Weak Supervision via Source-aware Influence Function
- URL: http://arxiv.org/abs/2205.12879v1
- Date: Wed, 25 May 2022 15:57:24 GMT
- Title: Understanding Programmatic Weak Supervision via Source-aware Influence Function
- Authors: Jieyu Zhang, Haonan Wang, Cheng-Yu Hsieh, Alexander Ratner
- Abstract summary: Programmatic Weak Supervision (PWS) aggregates the source votes of multiple weak supervision sources into probabilistic training labels.
We build on the Influence Function (IF) to decompose the end model's training objective and then calculate the influence associated with each (data, source, class) tuple.
These primitive influence scores can then be used to estimate the influence of individual components of PWS, such as source votes, supervision sources, and training data.
- Score: 76.74549130841383
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Programmatic Weak Supervision (PWS) aggregates the source votes of multiple
weak supervision sources into probabilistic training labels, which are in turn
used to train an end model. With its increasing popularity, it is critical to
have some tool for users to understand the influence of each component (e.g.,
the source vote or training data) in the pipeline and interpret the end model
behavior. To achieve this, we build on Influence Function (IF) and propose
source-aware IF, which leverages the generation process of the probabilistic
labels to decompose the end model's training objective and then calculate the
influence associated with each (data, source, class) tuple. These primitive
influence scores can then be used to estimate the influence of individual
components of PWS, such as source vote, supervision source, and training data.
On datasets of diverse domains, we demonstrate multiple use cases: (1)
interpreting incorrect predictions from multiple angles, which reveals insights
for debugging the PWS pipeline, (2) identifying mislabeling of sources with a
gain of 9%-37% over baselines, and (3) improving the end model's generalization
performance by removing harmful components in the training objective (13%-24%
better than ordinary IF).
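Below is a minimal sketch of these per-(data, source, class) influence scores on a toy PWS setup. It assumes the probabilistic labels come from averaging one-hot source votes (a majority-vote-style label model), an L2-regularized logistic-regression end model, and an exact Hessian solve (only feasible at toy scale); the function names and the votes encoding (-1 for an abstain) are illustrative, not taken from the paper's code.
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_end_model(X, Q, lam=1e-2, lr=0.1, steps=2000):
    """Train logistic regression on soft labels Q[:, 1] = P(y=1 | votes)."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - Q[:, 1]) / n + lam * theta
        theta -= lr * grad
    return theta

def per_term_grad(x_i, c, theta):
    """Gradient of the cross-entropy term for (data point x_i, class c)."""
    p = sigmoid(x_i @ theta)
    return (p - 1.0) * x_i if c == 1 else p * x_i

def source_aware_influence(X, votes, X_test, y_test, theta, lam=1e-2):
    """Influence of each (data, source, class) term on the mean test loss.

    With majority-vote-style labels the training objective decomposes as
        L(theta) = (1/n) * sum_i (1/m_i) * sum_j ce(x_i, votes[i, j]; theta),
    so removing a single (i, j, class=votes[i, j]) term has the classic
    influence-function form  -grad_test^T H^{-1} grad_term.
    """
    n, d = X.shape
    p = sigmoid(X @ theta)
    # Hessian of the regularized training loss (soft labels leave it unchanged).
    H = (X * (p * (1.0 - p))[:, None]).T @ X / n + lam * np.eye(d)
    # v = H^{-1} * (gradient of the mean test loss).
    g_test = X_test.T @ (sigmoid(X_test @ theta) - y_test) / len(y_test)
    v = np.linalg.solve(H, g_test)
    scores = np.full(votes.shape, np.nan)            # shape: (n, num_sources)
    for i in range(n):
        cast = votes[i] >= 0                         # -1 encodes an abstain
        m_i = max(cast.sum(), 1)
        for j in np.where(cast)[0]:
            g_term = per_term_grad(X[i], votes[i, j], theta) / (n * m_i)
            scores[i, j] = -v @ g_term               # influence of that single vote
    return scores
```
Summing these scores over sources recovers a per-example influence, summing over examples recovers a per-source influence, and dropping the most harmful terms from the objective mirrors use case (3) above.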
Related papers
- In2Core: Leveraging Influence Functions for Coreset Selection in Instruction Finetuning of Large Language Models [37.45103473809928]
We propose the In2Core algorithm, which selects a coreset by analyzing the correlation between training and evaluation samples with a trained model.
By applying our algorithm to instruction fine-tuning data of LLMs, we can achieve similar performance with just 50% of the training data. (A gradient-based sketch of one possible reading appears after this list.)
arXiv Detail & Related papers (2024-08-07T05:48:05Z) - Pre-training by Predicting Program Dependencies for Vulnerability
Analysis Tasks [12.016029378106131]
This work proposes two novel pre-training objectives, namely Control Dependency Prediction (CDP) and Data Dependency Prediction (DDP).
CDP and DDP aim to predict the statement-level control dependencies and token-level data dependencies, respectively, in a code snippet only based on its source code.
After pre-training, CDP and DDP can boost the understanding of vulnerable code during fine-tuning and can directly be used to perform dependence analysis for both partial and complete functions.
arXiv Detail & Related papers (2024-02-01T15:18:19Z) - Influence Scores at Scale for Efficient Language Data Sampling [3.072340427031969]
"influence scores" are used to identify important subsets of data.
In this paper, we explore the applicability of influence scores in language classification tasks.
arXiv Detail & Related papers (2023-11-27T20:19:22Z) - Evaluating and Incentivizing Diverse Data Contributions in Collaborative
Learning [89.21177894013225]
For a federated learning model to perform well, it is crucial to have a diverse and representative dataset.
We show that the statistical criterion used to quantify the diversity of the data, as well as the choice of the federated learning algorithm used, has a significant effect on the resulting equilibrium.
We leverage this to design simple optimal federated learning mechanisms that encourage data collectors to contribute data representative of the global population.
arXiv Detail & Related papers (2023-06-08T23:38:25Z) - Think Twice: Measuring the Efficiency of Eliminating Prediction
Shortcuts of Question Answering Models [3.9052860539161918]
We propose a simple method for measuring the scale of a model's reliance on any identified spurious feature.
We assess the robustness towards a large set of known and newly found prediction biases for various pre-trained models and debiasing methods in Question Answering (QA).
We find that while existing debiasing methods can mitigate reliance on a chosen spurious feature, the OOD performance gains of these methods cannot be explained by mitigated reliance on biased features.
arXiv Detail & Related papers (2023-05-11T14:35:00Z) - Supervised Contrastive Learning for Affect Modelling [2.570570340104555]
We introduce three different supervised contrastive learning approaches for training representations that consider affect information.
Results demonstrate the representation capacity of contrastive learning and its efficiency in boosting the accuracy of affect models.
arXiv Detail & Related papers (2022-08-25T17:40:19Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold. (A sketch of this thresholding appears after this list.)
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Toward Understanding the Influence of Individual Clients in Federated
Learning [52.07734799278535]
Federated learning allows clients to jointly train a global model without sending their private data to a central server.
We define a new notion called Influence, quantify this influence over the model parameters, and propose an effective and efficient method to estimate this metric.
arXiv Detail & Related papers (2020-12-20T14:34:36Z) - Estimating Structural Target Functions using Machine Learning and
Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z)