Structured Language Generation Model for Robust Structure Prediction
- URL: http://arxiv.org/abs/2402.08971v2
- Date: Mon, 19 Feb 2024 00:06:41 GMT
- Title: Structured Language Generation Model for Robust Structure Prediction
- Authors: Minho Lee and Junghyun Min and Woochul Lee and Yeonsoo Lee
- Abstract summary: We propose a framework that reduces sequence-to-sequence problems to classification problems via loss calibration and decoding methods.
Our experimental results show that SLGM is able to maintain performance without explicit dataset information, and can follow, and potentially replace, dataset-specific fine-tuning.
- Score: 6.4736137270915215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work in structured prediction (e.g. NER, information extraction)
using a single model makes use of explicit dataset information, which helps boost
in-distribution performance but is orthogonal to robust generalization in
real-world situations. To overcome this limitation, we propose the Structured
Language Generation Model (SLGM), a framework that reduces sequence-to-sequence
problems to classification problems via loss calibration and decoding methods.
Our experimental results show that SLGM is able to maintain performance without
explicit dataset information, and can follow, and potentially replace,
dataset-specific fine-tuning.
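The abstract does not spell out SLGM's decoding method. As a rough illustration of the general idea of turning a generation step into a classification step, the sketch below restricts a decoder's next-token distribution to a fixed label set and renormalizes it. The vocabulary, logits, label set, and the function name constrained_step are all hypothetical and are not taken from the paper.

```python
import math

def constrained_step(logits, allowed_ids):
    """Softmax over only the allowed token ids.

    Returns the most likely allowed id and the renormalized probabilities.
    This is a generic constrained-decoding sketch, not the SLGM method.
    """
    # Subtract the max allowed logit for numerical stability.
    max_logit = max(logits[i] for i in allowed_ids)
    exps = {i: math.exp(logits[i] - max_logit) for i in allowed_ids}
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs

# Hypothetical toy vocabulary and decoder logits for one step.
vocab = ["the", "PER", "LOC", "ORG", "cat"]
logits = [3.0, 1.2, 2.5, 0.3, 2.9]

# Only the entity labels are valid outputs at this step (assumed label set).
label_ids = [1, 2, 3]

best, probs = constrained_step(logits, label_ids)
print(vocab[best])
```

Unconstrained greedy decoding would pick "the" here, since it has the highest logit; restricting the candidates to the label set makes the step behave like a three-way classifier over PER, LOC, and ORG.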
Related papers
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain stability in the zero-shot generalization of VLMs; the resulting method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL [8.57550491437633]
This work investigates the linear handling of structured data in encoder-decoder language models, specifically T5.
Our findings reveal the model's ability to mimic human-designed processes such as schema linking and syntax prediction.
We also uncover insights into the model's internal mechanisms, including the ego-centric nature of structure node encodings.
arXiv Detail & Related papers (2024-04-03T01:16:20Z)
- Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z)
- Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by moving the task instructions to after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach that models structures as sequences of actions in an autoregressive manner with pretrained language models (PLMs).
Our approach achieves a new state of the art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Invariant Structure Learning for Better Generalization and Causal Explainability [44.580704853704994]
We propose a novel framework, Invariant Structure Learning (ISL), to improve causal structure discovery.
ISL splits the data into different environments, and learns a structure that is invariant to the target across different environments.
We demonstrate that ISL accurately discovers the causal structure, outperforms alternative methods, and yields superior generalization for datasets with significant distribution shifts.
arXiv Detail & Related papers (2022-06-13T21:04:23Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- How Does Data Corruption Affect Natural Language Understanding Models? A Study on GLUE datasets [4.645287693363387]
We show that performance remains high for most GLUE tasks when the models are fine-tuned or tested on corrupted data.
Our proposed data transformations can be used as a diagnostic tool for assessing the extent to which a specific dataset constitutes a proper testbed for evaluating models' language understanding capabilities.
arXiv Detail & Related papers (2022-01-12T13:35:53Z)
- Improving Compositional Generalization with Self-Training for Data-to-Text Generation [36.973617793800315]
We study the compositional generalization of current generation models in data-to-text tasks.
By simulating structural shifts in the compositional Weather dataset, we show that T5 models fail to generalize to unseen structures.
We propose an approach based on self-training using finetuned BLEURT for pseudo-response selection.
arXiv Detail & Related papers (2021-10-16T04:26:56Z)
- CASTLE: Regularization via Auxiliary Causal Graph Discovery [89.74800176981842]
We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables.
CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features.
arXiv Detail & Related papers (2020-09-28T09:49:38Z)