WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised
Span Prediction
- URL: http://arxiv.org/abs/2306.05644v2
- Date: Thu, 19 Oct 2023 05:47:52 GMT
- Title: WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised
Span Prediction
- Authors: Qiyu Wu, Masaaki Nagata, Yoshimasa Tsuruoka
- Abstract summary: Most existing word alignment methods rely on manual alignment datasets or parallel corpora.
We relax the requirement for correct, fully-aligned, and parallel sentences.
We then use such a large-scale weakly-supervised dataset for word alignment pre-training via span prediction.
- Score: 31.96433679860807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing word alignment methods rely on manual alignment datasets or
parallel corpora, which limits their usefulness. Here, to mitigate the
dependence on manual data, we broaden the source of supervision by relaxing the
requirement for correct, fully-aligned, and parallel sentences. Specifically,
we construct noisy, partially aligned, and non-parallel paragraphs. We then use such
a large-scale weakly-supervised dataset for word alignment pre-training via
span prediction. Extensive experiments with various settings empirically
demonstrate that our approach, which is named WSPAlign, is an effective and
scalable way to pre-train word aligners without manual data. When fine-tuned on
standard benchmarks, WSPAlign has set a new state-of-the-art by improving upon
the best supervised baseline by 3.3~6.1 points in F1 and 1.5~6.1 points in AER.
Furthermore, WSPAlign also achieves competitive performance compared with the
corresponding baselines in few-shot, zero-shot and cross-lingual tests, which
demonstrates that WSPAlign is potentially more practical for low-resource
languages than existing methods.
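To make the span-prediction framing concrete, below is a minimal, hedged sketch of aligning one source word to a span in a target sentence with a QA-style multilingual encoder. The checkpoint name, the ¶ marker, and the helper function are illustrative assumptions, not WSPAlign's released code or preprocessing.

```python
# Illustrative sketch only: word alignment framed as cross-language span
# prediction. The checkpoint name and the "¶" marker are assumptions; a model
# actually pre-trained/fine-tuned for alignment span prediction (rather than
# plain mBERT with a randomly initialized QA head) is needed for useful output.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

MODEL_NAME = "bert-base-multilingual-cased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)

def predict_aligned_span(src_sentence: str, src_word: str, tgt_sentence: str) -> str:
    """Mark `src_word` in the source sentence and predict its aligned span
    in the target sentence, SQuAD-style."""
    # The marked source sentence plays the role of the "question",
    # the target sentence plays the role of the "context".
    marked_src = src_sentence.replace(src_word, f"¶ {src_word} ¶", 1)
    inputs = tokenizer(marked_src, tgt_sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Argmax over start/end logits; a real implementation would restrict the
    # search to positions inside the target segment.
    start = int(outputs.start_logits.argmax())
    end = int(outputs.end_logits.argmax())
    span_ids = inputs["input_ids"][0][start:end + 1]
    return tokenizer.decode(span_ids, skip_special_tokens=True)

# Example usage (hypothetical sentence pair):
print(predict_aligned_span("足利義満は将軍である。", "将軍", "Ashikaga Yoshimitsu was a shogun."))
```

Roughly speaking, the weakly supervised pre-training described in the abstract then amounts to generating many such (marked source, target paragraph, answer span) examples automatically from noisy, partially aligned paragraphs and training the span-prediction head on them before fine-tuning on gold alignments.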
Related papers
- SAIL: Self-Improving Efficient Online Alignment of Large Language Models [56.59644677997827]
Reinforcement Learning from Human Feedback is a key method for aligning large language models with human preferences.
Recent literature has focused on designing online RLHF methods but still lacks a unified conceptual formulation.
Our approach significantly improves alignment performance on open-sourced datasets with minimal computational overhead.
arXiv Detail & Related papers (2024-06-21T18:05:35Z)
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data (a rough weighting sketch follows this entry).
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
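As a rough, hypothetical illustration of the importance-weighting idea in the co-training entry above (the weighting statistic used here is a stand-in, not the paper's exact scheme derived from the co-trained classifiers' training dynamics):

```python
# Hedged sketch: importance-weighted cross-entropy over distantly supervised
# labels. The weighting statistic here (mean probability the model assigned to
# the noisy label over past epochs) is a stand-in assumption; the paper derives
# weights from the co-trained classifiers' training dynamics.
import torch
import torch.nn.functional as F

def importance_weighted_loss(logits, noisy_labels, label_prob_history):
    """logits: (batch, num_classes); noisy_labels: (batch,);
    label_prob_history: (batch, num_past_epochs)."""
    # Weight each example by how consistently it has looked learnable,
    # instead of dropping examples below a hard confidence threshold.
    weights = label_prob_history.mean(dim=1)                       # (batch,)
    per_example = F.cross_entropy(logits, noisy_labels, reduction="none")
    return (weights * per_example).sum() / weights.sum().clamp_min(1e-8)

# Example: 4 examples, 3 classes, 2 recorded epochs of label probabilities.
logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])
history = torch.tensor([[0.9, 0.8], [0.2, 0.3], [0.6, 0.7], [0.5, 0.4]])
print(importance_weighted_loss(logits, labels, history))
```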
- Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback [70.32795295142648]
Linear alignment is a novel algorithm that aligns language models with human preferences in a single inference step.
Experiments on both general and personalized preference datasets demonstrate that linear alignment significantly enhances the performance and efficiency of LLM alignment.
arXiv Detail & Related papers (2024-01-21T10:46:23Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the prompt texts beyond the closed-set label words, so that prompts are tuned in a simulated open-set scenario (a rough sketch follows this entry).
Our method achieves the best performance on datasets of various scales, and extensive ablation studies validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
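A small, hypothetical sketch of the open-word idea in the M-Tuning entry above: a closed set of label words is extended with nouns sampled from WordNet so that prompt texts cover a simulated open set. The prompt template and sampling rule are assumptions for illustration, not the paper's exact procedure.

```python
# Hypothetical sketch: extend a closed set of label words with "open" nouns
# sampled from WordNet, so prompts can be tuned in a simulated open-set setting.
# Template and sampling rule are illustrative assumptions.
import random
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet") once

def open_set_prompts(label_words, num_open_words=5, template="a photo of a {}."):
    closed = {w.lower() for w in label_words}
    # Collect single-word noun lemmas from WordNet that are not closed-set labels.
    candidates = {lemma.name().lower()
                  for synset in wn.all_synsets(pos="n")
                  for lemma in synset.lemmas()
                  if "_" not in lemma.name()}
    open_words = random.sample(sorted(candidates - closed), num_open_words)
    # Prompts for both the closed-set labels and the sampled "open" words.
    return [template.format(w) for w in list(label_words) + open_words]

print(open_set_prompts(["cat", "dog", "car"]))
```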
- Constrained Density Matching and Modeling for Cross-lingual Alignment of Contextualized Representations [27.74320705109685]
We introduce supervised and unsupervised density-based approaches named Real-NVP and GAN-Real-NVP, driven by Normalizing Flow, to perform alignment.
Our experiments encompass 16 alignments, including our approaches, evaluated across 6 language pairs, synthetic data and 4 NLP tasks.
arXiv Detail & Related papers (2022-01-31T18:41:28Z)
- Word Alignment by Fine-tuning Embeddings on Parallel Corpora [96.28608163701055]
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs.
Recently, other work has demonstrated that pre-trained contextualized word embeddings derived from multilingually trained language models (LMs) prove an attractive alternative, achieving competitive results on the word alignment task even in the absence of explicit training on parallel data.
In this paper, we examine methods to marry the two approaches: leveraging pre-trained LMs and fine-tuning them on parallel text with objectives designed to improve alignment quality, and proposing methods to effectively extract alignments from these fine-tuned models.
arXiv Detail & Related papers (2021-01-20T17:54:47Z)
- Cross-lingual Alignment Methods for Multilingual BERT: A Comparative Study [2.101267270902429]
We analyse how different forms of cross-lingual supervision and various alignment methods influence the transfer capability of mBERT in the zero-shot setting.
We find that supervision from a parallel corpus is generally superior to dictionary alignments.
arXiv Detail & Related papers (2020-09-29T20:56:57Z)
- A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT [22.701728185474195]
We first formalize a word alignment problem as a collection of independent predictions from a token in the source sentence to a span in the target sentence.
We then solve this problem by using multilingual BERT, which is fine-tuned on manually created gold word alignment data.
We show that the proposed method significantly outperformed previous supervised and unsupervised word alignment methods without using any bitexts for pretraining.
arXiv Detail & Related papers (2020-04-29T23:40:08Z)
- SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings [3.8424737607413153]
We propose word alignment methods that require no parallel data.
The key idea is to leverage multilingual word embeddings, both static and contextualized, for word alignment.
We find that, for two language pairs, alignments created from the embeddings are superior to those produced by traditional statistical methods (a simplified sketch follows this entry).
arXiv Detail & Related papers (2020-04-18T23:10:36Z)
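As a simplified, hedged sketch of the SimAlign entry's idea of aligning words from multilingual contextual embeddings without any parallel training data (mutual argmax over cosine similarities; the released SimAlign adds subword-to-word mapping and further matching strategies):

```python
# Simplified, hedged sketch: word alignment from multilingual contextual
# embeddings with no parallel training data. Subword tokens are aligned by
# cosine similarity, keeping only mutual-argmax pairs.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed_tokens(sentence):
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]          # (seq_len, dim)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return tokens[1:-1], hidden[1:-1]                        # drop [CLS]/[SEP]

def mutual_argmax_alignments(src, tgt):
    src_toks, src_vecs = embed_tokens(src)
    tgt_toks, tgt_vecs = embed_tokens(tgt)
    sim = torch.nn.functional.normalize(src_vecs, dim=-1) @ \
          torch.nn.functional.normalize(tgt_vecs, dim=-1).T
    # Keep (i, j) only if i and j are each other's best match.
    return [(src_toks[i], tgt_toks[j])
            for i in range(sim.size(0))
            for j in [int(sim[i].argmax())]
            if int(sim[:, j].argmax()) == i]

print(mutual_argmax_alignments("Das ist ein Haus .", "This is a house ."))
```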
- Multilingual Alignment of Contextual Word Representations [49.42244463346612]
Aligned BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model.
We introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer.
These results support contextual alignment as a useful concept for understanding large multilingual pre-trained models.
arXiv Detail & Related papers (2020-02-10T03:27:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.