Continual Contrastive Finetuning Improves Low-Resource Relation
Extraction
- URL: http://arxiv.org/abs/2212.10823v3
- Date: Wed, 31 May 2023 06:04:33 GMT
- Title: Continual Contrastive Finetuning Improves Low-Resource Relation
Extraction
- Authors: Wenxuan Zhou, Sheng Zhang, Tristan Naumann, Muhao Chen, Hoifung Poon
- Abstract summary: Relation extraction has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE by self-supervised learning.
We propose to pretrain and finetune the RE model using consistent objectives of contrastive learning.
- Score: 34.76128090845668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Relation extraction (RE), which has relied on structurally annotated corpora
for model training, has been particularly challenging in low-resource scenarios
and domains. Recent literature has tackled low-resource RE by self-supervised
learning, where the solution involves pretraining the entity pair embedding by
RE-based objective and finetuning on labeled data by classification-based
objective. However, a critical challenge to this approach is the gap in
objectives, which prevents the RE model from fully utilizing the knowledge in
pretrained representations. In this paper, we aim at bridging the gap and
propose to pretrain and finetune the RE model using consistent objectives of
contrastive learning. Since in this kind of representation learning paradigm,
one relation may easily form multiple clusters in the representation space, we
further propose a multi-center contrastive loss that allows one relation to
form multiple clusters to better align with pretraining. Experiments on two
document-level RE datasets, BioRED and Re-DocRED, demonstrate the effectiveness
of our method. Particularly, when using 1% end-task training data, our method
outperforms PLM-based RE classifier by 10.5% and 6.1% on the two datasets,
respectively.
Related papers
- ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z) - Anchor-aware Deep Metric Learning for Audio-visual Retrieval [11.675472891647255]
Metric learning aims at capturing the underlying data structure and enhancing the performance of tasks like audio-visual cross-modal retrieval (AV-CMR)
Recent works employ sampling methods to select impactful data points from the embedding space during training.
However, the model training fails to fully explore the space due to the scarcity of training data points.
We propose an innovative Anchor-aware Deep Metric Learning (AADML) method to address this challenge.
arXiv Detail & Related papers (2024-04-21T22:44:44Z) - Take the Bull by the Horns: Hard Sample-Reweighted Continual Training
Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z) - Enhancing Low-Resource Relation Representations through Multi-View Decoupling [21.32064890807893]
We propose a novel prompt-based relation representation method, named MVRE.
MVRE decouples each relation into different perspectives to encompass multi-view relation representations.
Our method can achieve state-of-the-art in low-resource settings.
arXiv Detail & Related papers (2023-12-26T14:16:16Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z) - Towards Realistic Low-resource Relation Extraction: A Benchmark with
Empirical Baseline Study [51.33182775762785]
This paper presents an empirical study to build relation extraction systems in low-resource settings.
We investigate three schemes to evaluate the performance in low-resource settings: (i) different types of prompt-based methods with few-shot labeled data; (ii) diverse balancing methods to address the long-tailed distribution issue; and (iii) data augmentation technologies and self-training to generate more labeled in-domain data.
arXiv Detail & Related papers (2022-10-19T15:46:37Z) - Automatically Generating Counterfactuals for Relation Exaction [18.740447044960796]
relation extraction (RE) is a fundamental task in natural language processing.
Current deep neural models have achieved high accuracy but are easily affected by spurious correlations.
We develop a novel approach to derive contextual counterfactuals for entities.
arXiv Detail & Related papers (2022-02-22T04:46:10Z) - Robust Locality-Aware Regression for Labeled Data Classification [5.432221650286726]
We propose a new discriminant feature extraction framework, namely Robust Locality-Aware Regression (RLAR)
In our model, we introduce a retargeted regression to perform the marginal representation learning adaptively instead of using the general average inter-class margin.
To alleviate the disturbance of outliers and prevent overfitting, we measure the regression term and locality-aware term together with the regularization term by the L2,1 norm.
arXiv Detail & Related papers (2020-06-15T11:36:59Z) - Learning Diverse Representations for Fast Adaptation to Distribution
Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.