Cross-Domain Generalization and Knowledge Transfer in Transformers Trained on Legal Data
- URL: http://arxiv.org/abs/2112.07870v1
- Date: Wed, 15 Dec 2021 04:23:14 GMT
- Title: Cross-Domain Generalization and Knowledge Transfer in Transformers Trained on Legal Data
- Authors: Jaromir Savelka, Hannes Westermann, Karim Benyekhlef
- Abstract summary: We analyze the ability of pre-trained language models to transfer knowledge among datasets annotated with different type systems.
Prediction of the rhetorical role a sentence plays in a case decision is an important and often studied task in AI & Law.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyze the ability of pre-trained language models to transfer knowledge among datasets annotated with different type systems and to generalize beyond the domain and dataset they were trained on. We create a meta task over multiple datasets focused on the prediction of rhetorical roles. Predicting the rhetorical role a sentence plays in a case decision is an important and often studied task in AI & Law. Typically, it requires annotating a large number of sentences to train a model, which can be time-consuming and expensive. Further, the application of a model is restricted to the dataset it was trained on. We fine-tune language models and evaluate their performance across datasets to investigate the models' ability to generalize across domains. Our results suggest that the approach could help overcome the cold-start problem in active or interactive learning, and they show that the models can generalize across datasets and domains.
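The setup above lends itself to a short illustration. The following sketch is not the authors' code: it assumes the HuggingFace transformers and PyTorch libraries, hypothetical corpora source_corpus.jsonl and target_corpus.jsonl, and a hypothetical unified label set (Fact, Issue, Rule, Analysis, Conclusion) onto which both annotation type systems have already been mapped. It fine-tunes a sentence classifier on one legal corpus and reports accuracy on a corpus from another domain, mirroring the cross-dataset evaluation described in the abstract.

```python
"""Minimal sketch of the cross-dataset transfer experiment described above.

Not the authors' code. Assumptions: JSONL corpora with "sentence" and
"role" fields, and a hypothetical unified rhetorical-role label set
shared by the source and target corpora.
"""
import json

import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical unified rhetorical roles shared by both corpora.
LABELS = ["Fact", "Issue", "Rule", "Analysis", "Conclusion"]
LABEL2ID = {label: i for i, label in enumerate(LABELS)}


class SentenceRoleDataset(Dataset):
    """Sentences from one case-law corpus, labelled with rhetorical roles."""

    def __init__(self, path, tokenizer):
        with open(path) as f:
            self.examples = [json.loads(line) for line in f]
        self.tokenizer = tokenizer

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        ex = self.examples[idx]
        enc = self.tokenizer(ex["sentence"], truncation=True, max_length=128,
                             padding="max_length", return_tensors="pt")
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(LABEL2ID[ex["role"]])
        return item


def accuracy(model, loader, device):
    """Sentence-level accuracy on a (possibly out-of-domain) corpus."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for batch in loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            preds = model(**batch).logits.argmax(dim=-1)
            correct += (preds == batch["labels"]).sum().item()
            total += preds.numel()
    return correct / total


device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)).to(device)

# Fine-tune on the source-domain corpus...
train_loader = DataLoader(SentenceRoleDataset("source_corpus.jsonl", tokenizer),
                          batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few epochs is typical for fine-tuning
    for batch in train_loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# ...then evaluate on a corpus from a different domain to probe transfer.
target_loader = DataLoader(SentenceRoleDataset("target_corpus.jsonl", tokenizer),
                           batch_size=32)
print("out-of-domain accuracy:", accuracy(model, target_loader, device))
```

A model evaluated this way could also seed the cold-start phase of an active-learning loop on the new corpus: its predictions would give annotators initial labels to correct rather than an unannotated pool to label from scratch.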
Related papers
- Adapting Large Language Models to Domains via Reading Comprehension [86.24451681746676]
We explore how continued pre-training on domain-specific corpora influences large language models.
We show that training on the raw corpora endows the model with domain knowledge but drastically hurts its ability to answer questions.
We propose a simple method for transforming raw corpora into reading comprehension texts.
arXiv Detail & Related papers (2023-09-18T07:17:52Z)
- Visual Explanations with Attributions and Counterfactuals on Time Series Classification [15.51135925107216]
We propose a visual analytics workflow to support seamless transitions between global and local explanations.
To generate a global overview, we apply local attribution methods to the data, creating explanations for the whole dataset.
To further inspect the model decision-making as well as potential data errors, a what-if analysis facilitates hypothesis generation and verification.
arXiv Detail & Related papers (2023-07-14T10:01:30Z)
- A Data Fusion Framework for Multi-Domain Morality Learning [3.0671872389903547]
We describe a data fusion framework for training on multiple heterogeneous datasets.
The proposed framework achieves state-of-the-art performance in different datasets compared to prior works in morality inference.
arXiv Detail & Related papers (2023-04-04T22:05:02Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data, given access only to a set of expert models and their predictions, alongside some limited information about the datasets used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Pre-Training a Graph Recurrent Network for Language Representation [34.4554387894105]
We consider a graph recurrent network for language model pre-training, which builds a graph structure for each sequence with local token-level communications.
We find that our model can generate more diverse outputs with less contextualized feature redundancy than existing attention-based models.
arXiv Detail & Related papers (2022-09-08T14:12:15Z)
- QAGAN: Adversarial Approach To Learning Domain Invariant Language Features [0.76146285961466]
We explore an adversarial training approach to learning domain-invariant features.
We achieve a 15.2% improvement in EM score and a 5.6% boost in F1 score on an out-of-domain validation dataset.
arXiv Detail & Related papers (2022-06-24T17:42:18Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic, domain-independent approach yields state-of-the-art results on vision, natural language processing, and time-series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)