Automatic Rule Induction for Efficient Semi-Supervised Learning
- URL: http://arxiv.org/abs/2205.09067v3
- Date: Fri, 20 May 2022 16:42:21 GMT
- Title: Automatic Rule Induction for Efficient Semi-Supervised Learning
- Authors: Reid Pryzant, Ziyi Yang, Yichong Xu, Chenguang Zhu, Michael Zeng
- Abstract summary: Semi-supervised learning has shown promise in allowing NLP models to generalize from small amounts of labeled data.
Pretrained transformer models act as black-box correlation engines that are difficult to explain and sometimes behave unreliably.
We propose tackling both of these challenges via Automatic Rule Induction (ARI), a simple and general-purpose framework.
- Score: 56.91428251227253
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised learning has shown promise in allowing NLP models to
generalize from small amounts of labeled data. Meanwhile, pretrained
transformer models act as black-box correlation engines that are difficult to
explain and sometimes behave unreliably. In this paper, we propose tackling
both of these challenges via Automatic Rule Induction (ARI), a simple and
general-purpose framework for the automatic discovery and integration of
symbolic rules into pretrained transformer models. First, we extract weak
symbolic rules from low-capacity machine learning models trained on small
amounts of labeled data. Next, we use an attention mechanism to integrate these
rules into high-capacity pretrained transformer models. Last, the
rule-augmented system becomes part of a self-training framework to boost
supervision signal on unlabeled data. These steps can be layered beneath a
variety of existing weak supervision and semi-supervised NLP algorithms in
order to improve performance and interpretability. Experiments across nine
sequence classification and relation extraction tasks suggest that ARI can
improve state-of-the-art methods with no manual effort and minimal
computational overhead.
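The abstract describes a three-step pipeline: extract weak symbolic rules from low-capacity models, fuse them into a pretrained transformer via attention, and self-train on unlabeled data. As a rough illustration of steps 1 and 2 only, the sketch below extracts weak rules with a shallow decision tree over n-gram features and blends their votes with transformer logits through a learned gate. The tree-based rule source, the gating head, and all hyperparameters are illustrative assumptions, not the paper's actual implementation (the abstract does not specify these details).

```python
# Minimal sketch of ARI-style rule extraction and rule/model fusion.
# Assumptions (not from the paper): a depth-limited decision tree over
# n-gram features stands in for the "low-capacity" rule source, and a
# learned gate over [transformer logits; rule votes] stands in for the
# attention-based integration step.
import numpy as np
import torch
import torch.nn as nn
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier

# --- Step 1: extract weak symbolic rules from a low-capacity model -------
texts = ["great movie", "terrible plot", "loved it", "awful acting"]
labels = np.array([1, 0, 1, 0])

vectorizer = CountVectorizer(ngram_range=(1, 2), binary=True)
X = vectorizer.fit_transform(texts)

# A depth-limited tree yields human-readable feature -> label rules.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, labels)

def rule_votes(batch_texts):
    """Per-class 'votes' from the symbolic rule source (tree predictions)."""
    probs = tree.predict_proba(vectorizer.transform(batch_texts))
    return torch.tensor(probs, dtype=torch.float32)  # (batch, num_classes)

# --- Step 2: fuse rule votes with transformer logits via a learned gate ---
class RuleAugmentedHead(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # Gate deciding how much to trust the rules vs. the transformer.
        self.gate = nn.Linear(2 * num_classes, 1)

    def forward(self, model_logits, votes):
        alpha = torch.sigmoid(self.gate(torch.cat([model_logits, votes], dim=-1)))
        return alpha * model_logits + (1 - alpha) * votes

head = RuleAugmentedHead(num_classes=2)
fake_logits = torch.randn(2, 2)              # stand-in for transformer outputs
fused = head(fake_logits, rule_votes(["great movie", "awful acting"]))
print(fused.shape)                           # torch.Size([2, 2])
```

In the full framework, the rule-augmented system would additionally pseudo-label unlabeled data inside a self-training loop (step 3); a generic self-training sketch appears at the end of the related-papers list below.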
Related papers
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method that efficiently fine-tunes pretrained weights while enhancing robustness and generalization.
A self-regularization strategy is further exploited to maintain the zero-shot generalization stability of VLMs; the resulting method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the models in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z) - A General Framework for Learning from Weak Supervision [93.89870459388185]
This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm.
Central to GLWS is an Expectation-Maximization (EM) formulation, adeptly accommodating various weak supervision sources.
We also present an advanced algorithm that significantly simplifies the EM computational demands.
arXiv Detail & Related papers (2024-02-02T21:48:50Z) - FaultFormer: Pretraining Transformers for Adaptable Bearing Fault Classification [7.136205674624813]
We present a novel self-supervised pretraining and fine-tuning framework based on transformer models.
In particular, we investigate different tokenization and data augmentation strategies to reach state-of-the-art accuracies.
This introduces a new paradigm where models can be pretrained on unlabeled data from different bearings, faults, and machinery and quickly deployed to new, data-scarce applications.
arXiv Detail & Related papers (2023-12-04T22:51:02Z) - Uncovering mesa-optimization algorithms in Transformers [61.06055590704677]
Some autoregressive models can learn as an input sequence is processed, without undergoing any parameter changes, and without being explicitly trained to do so.
We show that standard next-token prediction error minimization gives rise to a subsidiary learning algorithm that adjusts the model as new inputs are revealed.
Our findings explain in-context learning as a product of autoregressive loss minimization and inform the design of new optimization-based Transformer layers.
arXiv Detail & Related papers (2023-09-11T22:42:50Z) - Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z) - Trained Transformers Learn Linear Models In-Context [39.56636898650966]
Attention-based neural networks such as transformers have demonstrated a remarkable ability to exhibit in-context learning (ICL).
We show that when transformers are trained over random instances of linear regression problems, these models' predictions mimic those of ordinary least squares.
arXiv Detail & Related papers (2023-06-16T15:50:03Z) - Pseudo-Label Training and Model Inertia in Neural Machine Translation [18.006833174265612]
Neural machine translation (NMT) models are sensitive to small input changes and can show significant variation across re-training or incremental model updates.
This work studies a frequently used method in NMT, pseudo-label training (PLT), which is common to the related techniques of forward-translation and self-training (a generic self-training sketch appears after this list).
While the effect of PLT on quality is well-documented, we highlight a lesser-known effect: PLT can enhance a model's stability to model updates and input perturbations.
arXiv Detail & Related papers (2023-05-19T16:45:19Z) - Semi-WTC: A Practical Semi-supervised Framework for Attack Categorization through Weight-Task Consistency [19.97236038722335]
Supervised learning has been widely used for attack detection, which requires large amounts of high-quality data and labels.
We propose a semi-supervised fine-grained attack categorization framework consisting of an encoder and a two-branch structure.
We show that our model outperforms the state-of-the-art semi-supervised attack detection methods with a general 5% improvement in classification accuracy and a 90% reduction in training time.
arXiv Detail & Related papers (2022-05-19T16:30:31Z) - Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
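Both step 3 of the ARI abstract above and the pseudo-label training (PLT) entry in this list rely on self-training: a model labels unlabeled data and is retrained on its own confident predictions. The sketch below is a generic, framework-agnostic illustration of that loop; the logistic-regression model, confidence threshold, and synthetic data are placeholders rather than any paper's actual setup.

```python
# Generic self-training / pseudo-label loop (illustrative only; the model,
# threshold, and data are placeholders, not any paper's actual setup).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(20, 5))
y_labeled = (X_labeled[:, 0] > 0).astype(int)   # synthetic gold labels
X_unlabeled = rng.normal(size=(200, 5))

model = LogisticRegression()
X_train, y_train = X_labeled.copy(), y_labeled.copy()

for _ in range(3):  # a few self-training rounds
    model.fit(X_train, y_train)
    probs = model.predict_proba(X_unlabeled)
    confidence = probs.max(axis=1)
    mask = confidence > 0.9                     # keep only confident pseudo-labels
    pseudo_X = X_unlabeled[mask]
    pseudo_y = probs[mask].argmax(axis=1)
    # Retrain on gold labels plus confident pseudo-labels.
    X_train = np.vstack([X_labeled, pseudo_X])
    y_train = np.concatenate([y_labeled, pseudo_y])

print(f"final training-set size: {len(y_train)}")
```

In ARI, the rule-augmented transformer plays the role of the labeler, so the symbolic rules help shape the pseudo-labels used to boost the supervision signal on unlabeled data.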
This list is automatically generated from the titles and abstracts of the papers on this site.