Semantic Parsing in Limited Resource Conditions
- URL: http://arxiv.org/abs/2309.07429v1
- Date: Thu, 14 Sep 2023 05:03:09 GMT
- Title: Semantic Parsing in Limited Resource Conditions
- Authors: Zhuang Li
- Abstract summary: The thesis explores challenges in semantic parsing, specifically focusing on scenarios with limited data and computational resources.
It offers solutions using techniques like automatic data curation, knowledge transfer, active learning, and continual learning.
- Score: 19.689433249830465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This thesis explores challenges in semantic parsing, specifically focusing on
scenarios with limited data and computational resources. It offers solutions
using techniques like automatic data curation, knowledge transfer, active
learning, and continual learning.
For tasks with no parallel training data, the thesis proposes generating
synthetic training examples from structured database schemas. When there is
abundant data in a source domain but limited parallel data in a target domain,
knowledge from the source is leveraged to improve parsing in the target domain.
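To make the schema-driven idea above concrete, here is a minimal sketch in Python of generating synthetic utterance-SQL pairs directly from a database schema; the schema, templates, and wording are illustrative assumptions, not the thesis's actual pipeline.

```python
# Hypothetical illustration: synthesize (utterance, SQL) pairs from a schema
# using hand-written templates, so a parser can be bootstrapped without any
# parallel training data.
from itertools import product

schema = {"employees": ["name", "salary", "department"]}  # made-up schema

templates = [
    # (natural-language template, SQL template)
    ("show the {col} of all {table}", "SELECT {col} FROM {table}"),
    ("list {table} sorted by {col}", "SELECT * FROM {table} ORDER BY {col}"),
]

def synthesize(schema, templates):
    """Yield synthetic (utterance, query) training pairs."""
    for table, columns in schema.items():
        for col, (nl, sql) in product(columns, templates):
            yield nl.format(col=col, table=table), sql.format(col=col, table=table)

for utterance, query in synthesize(schema, templates):
    print(f"{utterance}  ->  {query}")
```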
For multilingual situations with limited data in the target languages, the
thesis introduces a method to adapt parsers using a limited human translation
budget. Active learning is applied to select source-language samples for manual
translation, maximizing parser performance in the target language. In addition,
an alternative method is also proposed to utilize machine translation services,
supplemented by human-translated data, to train a more effective parser.
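As a rough illustration of the active-learning step, the sketch below selects source-language samples for manual translation with a simple least-confidence heuristic; the scoring function and criterion are assumptions, not necessarily the thesis's selection strategy.

```python
# Sketch: under a fixed human-translation budget, send only the source-language
# examples the current parser is least confident about to the translators.
# `confidence_of` is a placeholder scoring function.
def select_for_translation(source_examples, confidence_of, budget):
    """Return the `budget` lowest-confidence examples (least-confidence sampling)."""
    return sorted(source_examples, key=confidence_of)[:budget]

# Toy usage with hard-coded confidences:
scores = {"book a flight to paris": 0.92,
          "what is the cheapest hotel": 0.41,
          "cancel my last order": 0.63}
print(select_for_translation(list(scores), scores.get, budget=2))
```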
When computational resources are limited, a continual learning approach is
introduced to minimize training time and computational memory. This maintains
the parser's efficiency in previously learned tasks while adapting it to new
tasks, mitigating the problem of catastrophic forgetting.
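One common mechanism for mitigating catastrophic forgetting, shown below purely as an illustration (the thesis's continual-learning method may work differently), is to replay a small memory of earlier-task examples while training on each new task.

```python
# Illustrative only: a tiny replay buffer that mixes stored examples from
# previously learned tasks into each new task's training batches.
import random

class ReplayBuffer:
    def __init__(self, capacity=100):
        self.capacity = capacity
        self.items = []

    def add(self, example):
        """Store an example; once full, overwrite a random slot (simplified)."""
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            self.items[random.randrange(self.capacity)] = example

    def sample(self, k):
        return random.sample(self.items, min(k, len(self.items)))

def mixed_batch(new_examples, buffer, replay_ratio=0.25):
    """Combine current-task examples with a fraction of replayed old ones."""
    k = int(len(new_examples) * replay_ratio)
    return list(new_examples) + buffer.sample(k)
```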
Overall, the thesis provides a comprehensive set of methods to improve
semantic parsing in resource-constrained conditions.
Related papers
- No Train but Gain: Language Arithmetic for training-free Language Adapters enhancement [59.37775534633868]
We introduce a novel method called language arithmetic, which enables training-free post-processing.
The effectiveness of the proposed solution is demonstrated on three downstream tasks in a MAD-X-based set of cross-lingual schemes.
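The summary suggests combining language adapters arithmetically without further training; a loose sketch of that idea follows, where the coefficients and parameter layout are assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: linearly combine per-language adapter parameters at
# inference time, with no additional training ("language arithmetic").
import numpy as np

def combine_adapters(adapters, coefficients):
    """Return a new adapter whose parameters are a weighted sum of the inputs."""
    param_names = adapters[next(iter(adapters))].keys()
    return {name: sum(coefficients[lang] * adapters[lang][name] for lang in adapters)
            for name in param_names}

# Toy adapters with a single weight matrix each:
adapters = {"en": {"down_proj": np.ones((4, 2))},
            "de": {"down_proj": np.full((4, 2), 0.5)}}
target_adapter = combine_adapters(adapters, {"en": 1.0, "de": -0.5})
```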
arXiv Detail & Related papers (2024-04-24T08:52:40Z)
- Parameterizing Context: Unleashing the Power of Parameter-Efficient Fine-Tuning and In-Context Tuning for Continual Table Semantic Parsing [13.51721352349583]
This paper introduces a novel method integrating parameter-efficient fine-tuning (PEFT) and in-context tuning (ICT) for training a continual table semantic parser.
The teacher addresses the few-shot problem using ICT, which procures contextual information by demonstrating a few training examples.
In turn, the student leverages the proposed PEFT framework to learn from the teacher's output distribution, and subsequently compresses and saves the contextual information to the prompts, eliminating the need to store any training examples.
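The teacher-student transfer described above can be pictured with a standard distillation objective; the snippet below is a generic KL-based sketch, not the paper's exact loss or API.

```python
# Generic knowledge-distillation loss: the prompt-tuned student is trained to
# match the output distribution produced by the in-context-tuned teacher.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy shapes: (batch, num_classes)
loss = distillation_loss(torch.randn(8, 10), torch.randn(8, 10))
```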
arXiv Detail & Related papers (2023-10-07T13:40:41Z)
- Effective Transfer Learning for Low-Resource Natural Language Understanding [15.752309656576129]
We focus on developing cross-lingual and cross-domain methods to tackle the low-resource issues.
First, we propose to improve the model's cross-lingual ability by focusing on the task-related keywords.
Second, we present Order-Reduced Modeling methods for the cross-lingual adaptation.
Third, we propose to leverage different levels of domain-related corpora and additional masking of data in the pre-training for the cross-domain adaptation.
arXiv Detail & Related papers (2022-08-19T06:59:00Z)
- Leveraging Natural Supervision for Language Representation Learning and Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks.
We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
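As a very loose illustration of using paraphrase pairs as natural supervision, the sketch below pushes the "semantic" encodings of paraphrases together while a separate "syntactic" encoding is left free to differ; this contrastive-style objective is an assumption, not the paper's actual framework.

```python
# Encourage paraphrases to share semantic vectors (cosine-based alignment).
import torch
import torch.nn.functional as F

def semantic_alignment_loss(sem_a, sem_b):
    """Loss that pulls semantic encodings of paraphrase pairs together."""
    return 1.0 - F.cosine_similarity(sem_a, sem_b, dim=-1).mean()

# sem_a, sem_b: (batch, dim) semantic encodings of paraphrase pairs
loss = semantic_alignment_loss(torch.randn(8, 64), torch.randn(8, 64))
```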
arXiv Detail & Related papers (2022-07-21T17:26:03Z)
- On the Usability of Transformers-based models for a French Question-Answering task [2.44288434255221]
This paper focuses on the usability of Transformer-based language models in small-scale learning problems.
We introduce a new compact model for French, FrALBERT, which proves to be competitive in low-resource settings.
arXiv Detail & Related papers (2022-07-19T09:46:15Z)
- Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation [87.98063273826702]
We propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation.
A theoretical analysis is provided to prove the effectiveness of our method.
arXiv Detail & Related papers (2022-03-22T12:41:55Z)
- Using Document Similarity Methods to create Parallel Datasets for Code Translation [60.36392618065203]
Translating source code from one programming language to another is a critical, time-consuming task.
We propose to use document similarity methods to create noisy parallel datasets of code.
We show that these models perform comparably to models trained on ground truth for reasonable levels of noise.
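A simplified sketch of the pairing step follows; TF-IDF cosine similarity is one plausible choice here, and the paper's actual similarity methods and thresholds may differ.

```python
# Build a noisy parallel corpus by pairing code documents across languages
# whose textual similarity exceeds a threshold.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def align_by_similarity(java_files, python_files, threshold=0.3):
    """Return (java, python) pairs with TF-IDF cosine similarity >= threshold."""
    vectorizer = TfidfVectorizer(token_pattern=r"\w+")
    matrix = vectorizer.fit_transform(java_files + python_files)
    java_vecs, py_vecs = matrix[:len(java_files)], matrix[len(java_files):]
    sims = cosine_similarity(java_vecs, py_vecs)
    pairs = []
    for i, row in enumerate(sims):
        j = row.argmax()
        if row[j] >= threshold:
            pairs.append((java_files[i], python_files[j]))
    return pairs
```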
arXiv Detail & Related papers (2021-10-11T17:07:58Z)
- Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition [54.92161571089808]
Cross-lingual NER transfers knowledge from rich-resource languages to low-resource languages.
Existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages.
We develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning.
arXiv Detail & Related papers (2021-06-01T05:46:22Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
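One generic way to carry out such an adaptation with standard Hugging Face APIs is sketched below: extend the vocabulary and embedding matrix for the unseen script, then continue training on target-language text. This is an assumed workflow for illustration, not the paper's specific method.

```python
# Add unseen-script tokens to a pretrained multilingual model and resize its
# embedding matrix; the new rows are randomly initialized and would then be
# trained on target-language data.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

new_tokens = ["ꦲ", "ꦤ", "ꦕ"]  # example characters from a script absent in pretraining
num_added = tokenizer.add_tokens(new_tokens)
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} tokens; vocabulary size is now {len(tokenizer)}")
```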
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format.
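The text-to-text framing can be pictured as casting every task into a string-to-string pair with a task prefix; the prefixes below follow the common convention for this model family and are purely illustrative.

```python
# Cast heterogeneous NLP tasks into a single text-to-text format.
def to_text_to_text(task, **fields):
    if task == "translate_en_de":
        return f"translate English to German: {fields['text']}", fields["target"]
    if task == "summarize":
        return f"summarize: {fields['text']}", fields["summary"]
    if task == "cola":  # acceptability judgement, output is a label word
        return f"cola sentence: {fields['text']}", fields["label"]
    raise ValueError(f"unknown task: {task}")

pair = to_text_to_text("translate_en_de", text="That is good.", target="Das ist gut.")
```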
arXiv Detail & Related papers (2019-10-23T17:37:36Z)